C# – What is the correct way to do a "Replace()" on a string variable?

Question:

I need to create a folder on the file server, and I noticed that the variable holding one piece of the user's input can arrive with characters that are invalid for folder names in Windows ( \ / : * ? " < > | ).

string Caminho = Path.Combine(PastaPadrao,txtInfoUser.Text);

The colon (:) is meaningful to the user and therefore needs to be recorded somehow, but nothing prevents me from replacing it with an underscore. That raised the question: what is the correct way to do the Replace() in this case?

return Caminho.Replace(@"\", "_")
               .Replace(@"/", "_")
               .Replace(@":", "_")
               .Replace(@"*", "_")
               .Replace(@"?", "_")
               .Replace(@"""", "_")
               .Replace(@"<", "_")
               .Replace(@">", "_")
               .Replace(@"|", "_");

If I do it this way, will 9 different instances of the string be created?

For now, I did it like this:

StringBuilder CaminhoSemCaracteresInvalidos = new StringBuilder(Caminho);
CaminhoSemCaracteresInvalidos.Replace(@"\", "_")
                             .Replace(@"/", "_")
                             .Replace(@":", "_")
                             .Replace(@"*", "_")
                             .Replace(@"?", "_")
                             .Replace(@"""", "_")
                             .Replace(@"<", "_")
                             .Replace(@">", "_")
                             .Replace(@"|", "_");

return CaminhoSemCaracteresInvalidos.ToString();

I want to do this in the best way so that the application doesn't suffer for it later. I always try to do what's right in terms of performance and good practice, but I also don't know if I'm building an atomic bomb to kill a fly.

Answer:

If I do it this way, will 9 different instances of the string be created?

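Yes. Strings are immutable, so each Replace() call that changes something returns a new string, which means up to 9 intermediate strings.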

The second way seems more suitable because it avoids those allocations, and allocation costs more than people think *. If you do this in a loop over many paths, it can put a lot of pressure on the garbage collector and cause pauses.

StringBuilder lets you modify the text, and its Replace() works in place, so there is only one allocation besides the original and then another for the final result. I think Caminho could be built directly in the builder, which would save one more allocation.
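A minimal sketch of the builder approach, assuming that only the user-supplied segment should be cleaned (replacing \ or : across the whole combined path would also hit the drive letter and the directory separators). The class and method names are hypothetical and not from the question. It uses the StringBuilder.Replace(char, char) overload, so no one-character strings are needed for the arguments:

```csharp
using System;
using System.IO;
using System.Text;

public static class PathSanitizerBuilder
{
    // The nine characters the question lists as invalid in Windows folder names.
    private static readonly char[] Invalid =
        { '\\', '/', ':', '*', '?', '"', '<', '>', '|' };

    // Hypothetical helper: cleans only the user segment, then combines the path.
    public static string BuildPath(string baseFolder, string userSegment)
    {
        var sb = new StringBuilder(userSegment);
        foreach (char c in Invalid)
        {
            // Modifies the builder in place, but still scans the buffer
            // once for every character in the list.
            sb.Replace(c, '_');
        }
        return Path.Combine(baseFolder, sb.ToString());
    }
}
```

With the names from the question, the call would be something like PathSanitizerBuilder.BuildPath(PastaPadrao, txtInfoUser.Text).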

But I wonder whether it isn't worth writing an algorithm that scans the whole string (a plain string or a builder; the latter is easier to make fast) and creates a new one, changing what is needed in a single pass. Let me explain.

The second form still has the problem that it runs 10 loops over the entire text (ToString() has one too). You don't see them, but they are there. So a single loop that handles all the characters can give a much faster result. The other answer tried to do this, but because it kept using Replace() it created an exponential problem, which is exactly what a developer should avoid (the problem just isn't apparent at low volume).
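The single-loop idea could be sketched like this. It is only an illustration under some assumptions: the class and method names are hypothetical, the character list is the one from the question, and the method is meant to be applied to the user-supplied segment before Path.Combine. The copy is made only when an invalid character is actually found, so a clean input costs no allocation at all:

```csharp
using System;

public static class PathSanitizer
{
    // Hypothetical helper: replaces every invalid character with '_' in one pass.
    public static string ReplaceInvalidChars(string segment)
    {
        if (string.IsNullOrEmpty(segment)) return segment;

        char[] buffer = null; // allocated lazily, only if something must change
        for (int i = 0; i < segment.Length; i++)
        {
            if (IsInvalid(segment[i]))
            {
                if (buffer == null) buffer = segment.ToCharArray();
                buffer[i] = '_';
            }
        }

        // No invalid character found: return the original instance untouched.
        return buffer == null ? segment : new string(buffer);
    }

    // The nine characters listed in the question.
    private static bool IsInvalid(char c)
    {
        switch (c)
        {
            case '\\': case '/': case ':': case '*': case '?':
            case '"': case '<': case '>': case '|':
                return true;
            default:
                return false;
        }
    }
}
```

For example, PathSanitizer.ReplaceInvalidChars("a:b/c") returns "a_b_c". If the full set of characters the operating system rejects is wanted instead of this fixed list, Path.GetInvalidFileNameChars() is worth a look, although its result depends on the platform.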

There's a way to do it with RegEx too, but in every case I've seen, doing it by hand is faster (I've seen people say that RegEx can be faster, but I've never seen it happen when compared against manual code written correctly). I wouldn't even think about it unless I were fanatical about this kind of solution (I think it's also unreadable, but I like it).
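For comparison only, the RegEx version might look like the sketch below. The class name is hypothetical, the character class holds the nine characters from the question, and no claim is made here about how it performs against a hand-written loop; that would have to be measured.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathSanitizerRegex
{
    // Character class with the nine characters from the question.
    // In the verbatim string, \\ is a regex-escaped backslash and "" is one quote.
    private static readonly Regex InvalidChars =
        new Regex(@"[\\/:*?""<>|]", RegexOptions.Compiled);

    // Hypothetical helper: replaces each match with an underscore.
    public static string ReplaceInvalidChars(string segment)
    {
        return InvalidChars.Replace(segment, "_");
    }
}
```

The Regex instance is kept in a static field so the pattern is parsed once rather than on every call.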

The total gain can be brutal.

Often people don't worry about this because they will use the code only a little. Then years later that person, or someone who isn't even on the team today, uses it more intensively and ends up with something slow. Nobody knows where the slowness comes from, they get desperate, and they only look at the new code when the problem is in the old code.

But everyone knows where their own shoe pinches, so you need to judge whether it's worth the effort. For me it's always worth it, unless it's a lot of work and I know the gain really isn't needed (which rarely happens to me).


* Allocation itself is cheap, but putting pressure on the GC makes it run more often and for longer and, worse, causes other objects to be promoted to older generations prematurely. It's not easy to see this happening: it builds up slowly and more or less silently (and some people don't hear it even when there is a little noise), so a naive test makes the code look fast.
