Safe way to escape user input to be processed by regular expressions in JavaScript

Question:

The following example is published at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions :

function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); 
}

Is this a safe way to escape a string supplied by the end user, for example through a dialog box?

Example:

/**
 * Example. Remove every occurrence of a substring from a string.
 *
 * Requires two inputs from the user: a string and a substring.
 * We must make sure the substring is safe to process
 * as part of a regular expression.
 */

/**
 * Taken from
 * https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
 * Is this safe?
 */
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// String to process
var cadena = "Test abc test test abc test test test abc test test abc";
var entradaUsuario1 = prompt("Enter the string to process", cadena);

// Substring to remove
var subcadena = "abc";
var entradaUsuario2 = prompt("Enter the substring to remove", subcadena);

// Escape the user input before building the regular expression
var re = new RegExp(escapeRegExp(entradaUsuario2), 'g');

// Apply the replacement
var resultado = entradaUsuario1.replace(re, '');

// Print the result to the console
console.log(resultado);

Answer:

Yes, it is safe, but it escapes more characters than necessary.

  • The ] only has a special meaning inside a character class (it closes the class). If every [ is already escaped, the regex cannot contain a class for it to close.
  • The } only has a special meaning as the end of the range quantifier {m,n} . Likewise, if every { is escaped, the regex cannot contain a quantifier of that form.
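The two points above can be checked directly. The following is a minimal sketch, with variable names chosen only for illustration. It assumes the pattern is compiled without the u or v flag. With the u flag, engines reject a lone } (and a lone ] ), so in that mode the longer MDN version is the one to use.

```javascript
// Without the u or v flag, a lone ] or } is matched literally.
const closingBracket = new RegExp("a]");
const closingBrace = new RegExp("a}");

console.log(closingBracket.test("a]")); // true
console.log(closingBrace.test("a}"));   // true

// Once [ and { are escaped, there is no class or quantifier left to close.
const escapedOpeners = new RegExp("\\[x]\\{2}");
console.log(escapedOpeners.test("[x]{2}")); // true

// With the u flag, the same lone } is a syntax error.
let unicodeModeError = null;
try {
    new RegExp("a}", "u");
} catch (error) {
    unicodeModeError = error;
}
console.log(unicodeModeError instanceof SyntaxError); // true
```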

Escaping metacharacters

Outside a character class, the only metacharacters (special characters) are:

\   ^   $   .   |   ?   *   +   (   )   [   {

The simplified function:

function escaparRegex(string) {
    return string.replace(/[\\^$.|?*+()[{]/g, '\\$&'); 
}
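As a quick check of the simplified function, here is a short usage sketch. The sample input and the variable names are hypothetical, and the patterns are built without the u or v flag.

```javascript
// Same function as above, repeated so the snippet runs on its own.
function escaparRegex(string) {
    return string.replace(/[\\^$.|?*+()[{]/g, '\\$&');
}

// Hypothetical user input full of metacharacters.
const entrada = "a.c (x)? [y]{2}";
const escapada = escaparRegex(entrada);
console.log(escapada); // a\.c \(x\)\? \[y]\{2}

// The escaped text only matches itself, character by character.
const soloLiteral = new RegExp(escapada);
console.log(soloLiteral.test("a.c (x)? [y]{2}")); // true
console.log(soloLiteral.test("abc (x)? [y]{2}")); // false: the . is no longer a wildcard

// Removing every literal occurrence from a longer string.
const todas = new RegExp(escapada, "g");
const limpio = "1: a.c (x)? [y]{2}".replace(todas, "");
console.log(JSON.stringify(limpio)); // "1: "
```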

Escaping metacharacters in a character class

You may also want to insert user-supplied characters into a character class (between brackets), for example in

var re = new RegExp("\\S{2,} [" + caracteres + "]{3,}");

In that case, a different set of characters must be escaped:

^ (at the start)   \   ]   -

The function to escape the content of a character class:

function escaparClaseRegex(string) {
    return string.replace(/^\^|[-\]\\]/g, '\\$&');
}
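The difference is easiest to see with input that contains all of these characters. This is a sketch with a hypothetical value for caracteres and illustrative variable names.

```javascript
// Same function as above, repeated so the snippet runs on its own.
function escaparClaseRegex(string) {
    return string.replace(/^\^|[-\]\\]/g, '\\$&');
}

// Hypothetical user input: a leading ^, a - between letters and a ].
const caracteres = "^a-c]";
const clase = escaparClaseRegex(caracteres);
console.log(clase); // \^a\-c\]

// Escaped: the class contains exactly the characters ^ a - c ]
const segura = new RegExp("^[" + clase + "]+$");
console.log(segura.test("^-]ac")); // true
console.log(segura.test("b"));     // false: a-c is not a range any more

// Unescaped: the pattern becomes the negated class [^a-c] followed by ]+
const insegura = new RegExp("^[" + caracteres + "]+$");
console.log(insegura.test("x]"));  // true, which is not what the user asked for
console.log(segura.test("x]"));    // false
```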

Escaping metacharacters in replacement text

When using cadena.replace(re, reemplazo) , some patterns in the replacement text have a special meaning. To make sure the text is inserted literally, escape the $ as $$ wherever it would start one of these patterns:

$$   $&   $`   $'   $n (n is a digit)

The function to escape the replacement text:

function escaparReemplazoRegex(string) {
    return string.replace(/\$(?=[$&`'\d])/g, '$$$$');
}