c# – How to programmatically escape control characters in a string?

Question:

I need to insert a string variable into a regular expression pattern:

$@"^(?:[^\p{{L}}]|[{exclusion}])+$" // Goal: forbid any letters in the string except those listed in the exclusion variable

The variable is declared as:

string exclusion;

All control characters in it must be escaped so that they do not break the regular expression.

I found the Regex.Escape() method, but it doesn't suit my needs. For example, if exclusion = @"[text]" is passed to Regex.Escape(), it returns the string "\\[text]". After this string is substituted for the exclusion variable in the pattern:

$@"^(?:[^\p{{L}}]|[{exclusion}])+$" // Goal: forbid any letters in the string except those listed in the exclusion variable

it takes the following form:

$@"^(?:[^\p{{L}}]|[\[text]])+$"

As a result, the regular expression does not work correctly. I suspect the cause is the extra ] character.

Could you tell me how to escape all control characters in a string? Is there another way besides Regex.Escape()? Or am I using it incorrectly without noticing my mistake?

Answer:

Regex.Escape escapes those characters that are considered special outside of character classes:

"Escapes a minimal set of characters (\, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space) by replacing them with their escape codes."
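To see why this does not help inside a character class, here is a small sketch that reproduces the situation from the question. The class and method names (EscapeDemo, BuildNaivePattern) are made up for illustration; the key point is that ] is not in the list above, so it stays unescaped and closes the class early.

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeDemo
{
    // Hypothetical helper: builds the pattern the way the question does,
    // using Regex.Escape on the exclusion string.
    public static string BuildNaivePattern(string exclusion) =>
        @"^(?:[^\p{L}]|[" + Regex.Escape(exclusion) + "])+$";

    static void Main()
    {
        // Regex.Escape escapes '[' but leaves ']' untouched.
        Console.WriteLine(Regex.Escape("[text]")); // \[text]

        string pattern = BuildNaivePattern("[text]");
        Console.WriteLine(pattern); // ^(?:[^\p{L}]|[\[text]])+$

        // The first ']' ends the class, so the second alternative is
        // "one of [, t, e, x" followed by a literal ']'.
        Console.WriteLine(Regex.IsMatch("text", pattern)); // False
        Console.WriteLine(Regex.IsMatch("t]", pattern));   // True
    }
}
```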

In fact, only the following characters are considered special within character classes:

  • ^ – negates the character class when it appears immediately after the opening [
  • ] – closes the character class
  • \ – escapes special characters
  • - – specifies a range of characters or "character class subtraction"
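A quick way to check the behaviour of these characters is to compare an unescaped and an escaped version of the same class. This is only an illustrative sketch; the Matches helper and the sample inputs are not from the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class ClassMetacharDemo
{
    // Hypothetical convenience wrapper around Regex.IsMatch.
    public static bool Matches(string input, string pattern) =>
        Regex.IsMatch(input, pattern);

    static void Main()
    {
        // '-' between two characters forms a range: [a-c] matches 'b'.
        Console.WriteLine(Matches("b", "^[a-c]$"));   // True
        // Escaped, it is a literal hyphen: [a\-c] is just 'a', '-' or 'c'.
        Console.WriteLine(Matches("b", @"^[a\-c]$")); // False
        Console.WriteLine(Matches("-", @"^[a\-c]$")); // True

        // A leading '^' negates the class: [^a] matches anything except 'a'.
        Console.WriteLine(Matches("a", "^[^a]$"));    // False
        // Escaped, it is a literal caret.
        Console.WriteLine(Matches("^", @"^[\^a]$"));  // True
    }
}
```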

To escape these characters, it is enough to use

exclusion.Replace("\\", @"\\").Replace("^", @"\^").Replace("-", @"\-").Replace("]", @"\]")

or

Regex.Replace(exclusion, @"[]^\\-]", "\\$&")
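Both variants can be wrapped in small helpers and compared side by side. The names below (CharClassEscaper, EscapeWithReplace, EscapeWithRegex) are hypothetical, and the sample string is just a test input containing all four special characters.

```csharp
using System;
using System.Text.RegularExpressions;

static class CharClassEscaper
{
    // Escapes only \ ^ - ] so the result is safe inside [...].
    // The backslash is replaced first so that the backslashes added
    // by the later replacements are not doubled.
    public static string EscapeWithReplace(string s) =>
        s.Replace("\\", @"\\")
         .Replace("^", @"\^")
         .Replace("-", @"\-")
         .Replace("]", @"\]");

    // Same result in one pass: "$&" in the replacement is the whole match,
    // so each special character is prefixed with a backslash.
    public static string EscapeWithRegex(string s) =>
        Regex.Replace(s, @"[]^\\-]", @"\$&");

    static void Main()
    {
        string sample = @"[a-z]^\";
        Console.WriteLine(EscapeWithReplace(sample)); // [a\-z\]\^\\
        Console.WriteLine(EscapeWithRegex(sample));   // [a\-z\]\^\\
    }
}
```

Note that [ is left alone on purpose: inside a character class it is an ordinary character here, because the preceding - is always escaped.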

Solution:

var pattern = $@"^(?:[^\p{{L}}]|[{Regex.Replace(exclusion, @"[]^\\-]", "\\$&")}])+$";

Or (since [^\p{L}] = \P{L} ):

var pattern2 = $@"^[\P{{L}}{Regex.Replace(exclusion, @"[]^\\-]", "\\$&")}]+$";
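For a quick check, the second form can be tried on a couple of inputs. This is a sketch rather than a drop-in implementation: BuildPattern is a made-up name, and the escaped string is computed in a separate variable first, which keeps the interpolated string easier to read.

```csharp
using System;
using System.Text.RegularExpressions;

static class PatternDemo
{
    // Hypothetical helper: allows any non-letter plus the characters
    // listed in exclusion.
    public static Regex BuildPattern(string exclusion)
    {
        string escaped = Regex.Replace(exclusion, @"[]^\\-]", @"\$&");
        return new Regex($@"^[\P{{L}}{escaped}]+$");
    }

    static void Main()
    {
        Regex regex = BuildPattern("[text]");
        Console.WriteLine(regex);                     // ^[\P{L}[text\]]+$

        // Digits, spaces and punctuation are non-letters; t and e are allowed.
        Console.WriteLine(regex.IsMatch("12 [t]-e")); // True
        // a, b and c are letters that are not in the exclusion list.
        Console.WriteLine(regex.IsMatch("12 abc"));   // False
    }
}
```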