Question:
I have this very old function to "clean" the contents of a variable:
Function
function sanitizeString($string) {
// input array
$what = array( 'ä','ã','à','á','â','ê','ë','è','é','ï','ì','í','ö','õ','ò','ó','ô','ü','ù','ú','û','À','Á','É','Í','Ó','Ú','ñ','Ñ','ç','Ç',' ','-','(',')',',',';',':','|','!','"','#','$','%','&','/','=','?','~','^','>','<','ª','º' );
// output array
$by = array( 'a','a','a','a','a','e','e','e','e','i','i','i','o','o','o','o','o','u','u','u','u','A','A','E','I','O','U','n','n','c','C','_','_','_','_','_','_','_','_','_','_','_','_','_','_','_','_','_','_','_','_','_','_','_' );
// return the string
return str_replace($what, $by, $string);
}
Usage
<?php
$pessoa = 'João dos Santos Videira';
$pastaPessoal = sanitizeString($pessoa);
// result
echo $pastaPessoal; // Joao_dos_Santos_Videira
?>
This is an old function. When it was written, replacing character A with character B was the best option available, but keeping an input array and an output array in sync is not easy, and every now and then a case comes up that was not foreseen.
Now that PHP has evolved, how can this function be refactored to use features of the language itself, or something that is easier to maintain?
Answer:
Just use regular expressions!
<?php
function sanitizeString($str) {
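// "u" treats the pattern and subject as UTF-8; "i" ignores case, so uppercase accented letters also match and come out lowercase
$str = preg_replace('/[áàãâä]/ui', 'a', $str);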
$str = preg_replace('/[éèêë]/ui', 'e', $str);
$str = preg_replace('/[íìîï]/ui', 'i', $str);
$str = preg_replace('/[óòõôö]/ui', 'o', $str);
$str = preg_replace('/[úùûü]/ui', 'u', $str);
$str = preg_replace('/[ç]/ui', 'c', $str);
// $str = preg_replace('/[,(),;:|!"#$%&/=?~^><ªº-]/', '_', $str);
$str = preg_replace('/[^a-z0-9]/i', '_', $str);
$str = preg_replace('/_+/', '_', $str); // collapse repeated underscores (Bacco's idea :)
return $str;
}
?>
The line right after the commented-out one replaces every character that is not an ASCII letter or a digit with "_". The line after that collapses any run of underscores into a single one.
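If keeping the original case matters (the regular expressions above turn "Á" into "a"), one possible variation is to keep a single associative array and pass it to strtr(), so each character sits next to its replacement instead of living in two parallel arrays. This is only a sketch: the function name sanitizeStringMap and the exact set of characters in the map are assumptions for illustration, not part of the original post, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): one map instead of two arrays.
function sanitizeStringMap(string $str): string
{
    // Each accented character is paired with its replacement; extend as needed.
    $map = [
        'á' => 'a', 'à' => 'a', 'ã' => 'a', 'â' => 'a', 'ä' => 'a',
        'é' => 'e', 'è' => 'e', 'ê' => 'e', 'ë' => 'e',
        'í' => 'i', 'ì' => 'i', 'î' => 'i', 'ï' => 'i',
        'ó' => 'o', 'ò' => 'o', 'õ' => 'o', 'ô' => 'o', 'ö' => 'o',
        'ú' => 'u', 'ù' => 'u', 'û' => 'u', 'ü' => 'u',
        'Á' => 'A', 'À' => 'A', 'Ã' => 'A', 'Â' => 'A', 'Ä' => 'A',
        'É' => 'E', 'È' => 'E', 'Ê' => 'E', 'Ë' => 'E',
        'Í' => 'I', 'Ì' => 'I', 'Î' => 'I', 'Ï' => 'I',
        'Ó' => 'O', 'Ò' => 'O', 'Õ' => 'O', 'Ô' => 'O', 'Ö' => 'O',
        'Ú' => 'U', 'Ù' => 'U', 'Û' => 'U', 'Ü' => 'U',
        'ñ' => 'n', 'Ñ' => 'N', 'ç' => 'c', 'Ç' => 'C',
    ];

    // strtr() with an array swaps every key for its value in a single pass.
    $str = strtr($str, $map);

    // Anything that is still not an ASCII letter or digit becomes "_";
    // the "+" collapses consecutive matches into one underscore.
    return preg_replace('/[^A-Za-z0-9]+/', '_', $str);
}

echo sanitizeStringMap('João dos Santos Videira'), PHP_EOL; // Joao_dos_Santos_Videira
```

Adding a new character then means adding one entry to the map, and characters that are not in the map still fall through to the final "_" replacement.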