c# – Multithreading problem

Question:

I iterate over the lines of a file and count the occurrences of the letter "a" in each line. I do this both asynchronously and with multiple threads. Why does the multithreaded method report fewer letters "a" than the asynchronous or plain sequential one? For example, counting sequentially gives 65913, asynchronously also gives 65913, but with multiple threads I get less, for example 65909. Could one thread be preventing another from finishing? What is the problem and how do I solve it?

ASYNCHRONOUS

public static async Task<int> ReadFileAsync(string path)
{
    int count = 0;
    using (StreamReader sr =  File.OpenText(path))
    {
        string line = String.Empty;
        int i = 0;
        while ((line = await sr.ReadLineAsync()) != null)
        {
            if (!String.IsNullOrEmpty(line))
            {
                i++;
                MatchCollection matches3 = Regex.Matches(line, "а");
                count += matches3.Count;
                Console.WriteLine($"in line num {i} count symbol a equal: {matches3.Count}");
            }
        }
    }
    return count;
}

MULTI-THREAD

public static void ReadFileThread(string path)
{
    StreamReader sr = File.OpenText(path);
    string line = String.Empty;

    while ((line = sr.ReadLine()) != null)
    {
        if (!String.IsNullOrEmpty(line))
        {
            Thread thread = new Thread(new ParameterizedThreadStart(WorkWithLines));
            thread.Start(line);
        }

    }
}
private static void WorkWithLines(object line)
{
    MatchCollection matches3 = Regex.Matches(line.ToString(), "а");
    Console.WriteLine($"in line num count symbol a equal: {matches3.Count}");

    CountForThirdMethod += matches3.Count;
}

Answer:

CountForThirdMethod += matches3.Count;

+= is not an atomic operation. It is carried out as two separate steps, a read-and-add followed by a write:

var newCountForThirdMethod = CountForThirdMethod + matches3.Count;
CountForThirdMethod = newCountForThirdMethod;

When this runs on several threads, the steps can interleave like this:

// CountForThirdMethod == 0; assume matches3.Count == 1 in both threads
Thread1: var newCountForThirdMethod = CountForThirdMethod + matches3.Count; // 1
Thread2: var newCountForThirdMethod = CountForThirdMethod + matches3.Count; // 1
Thread1: CountForThirdMethod = newCountForThirdMethod; // 1
Thread2: CountForThirdMethod = newCountForThirdMethod; // 1

That is, both threads performed +=, but one of the results was lost. Wrap this line in a lock, or use Interlocked.Add.
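Both fixes can be sketched as follows. This is a minimal illustration, not the asker's exact program: the class LetterCounter, the field _total (standing in for CountForThirdMethod), the lock object _gate and the useLock switch are all hypothetical names, and the letter is passed in as a parameter instead of being matched with Regex. The sketch also joins every thread before reading the total, on the assumption that the question's code reads the counter only after all threads are expected to have finished.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class LetterCounter
{
    // Shared counter, playing the role of CountForThirdMethod (hypothetical name).
    private static int _total;

    // Private object used only for locking (hypothetical name).
    private static readonly object _gate = new object();

    // Starts one thread per line, as in the question, and returns the total
    // number of occurrences of `letter` once every thread has finished.
    public static int CountWithThreads(IEnumerable<string> lines, char letter, bool useLock)
    {
        _total = 0;
        var threads = new List<Thread>();

        foreach (string line in lines)
        {
            var thread = new Thread(() =>
            {
                int found = line.Count(c => c == letter);
                if (useLock)
                {
                    // Option 1: only one thread at a time may execute the read-add-write.
                    lock (_gate) { _total += found; }
                }
                else
                {
                    // Option 2: a single atomic add, no explicit lock needed.
                    Interlocked.Add(ref _total, found);
                }
            });
            threads.Add(thread);
            thread.Start();
        }

        // Wait for all workers; otherwise the total may be read too early.
        foreach (Thread t in threads)
        {
            t.Join();
        }
        return _total;
    }

    public static void Main()
    {
        string[] lines = { "banana", "avocado", "papaya" };
        Console.WriteLine(CountWithThreads(lines, 'a', useLock: false)); // 8
        Console.WriteLine(CountWithThreads(lines, 'a', useLock: true));  // 8
    }
}
```

For a single integer counter Interlocked.Add is usually the lighter choice; a lock becomes the better fit when several pieces of shared state have to be updated together.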

In general, though, you probably do not need multithreading here: the bottleneck will most likely still be reading from the file.
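If the threads are dropped altogether, there is no shared counter left to protect. A minimal single-threaded sketch, assuming the whole task is just counting one letter per line; the class SequentialCounter, the method CountLetter and the default file name input.txt are hypothetical:

```csharp
using System;
using System.IO;
using System.Linq;

public static class SequentialCounter
{
    // File.ReadLines streams the file lazily, one line at a time,
    // so the whole file is never held in memory at once.
    public static int CountLetter(string path, char letter)
    {
        return File.ReadLines(path).Sum(line => line.Count(c => c == letter));
    }

    public static void Main(string[] args)
    {
        // "input.txt" is a placeholder; pass the real path as the first argument.
        string path = args.Length > 0 ? args[0] : "input.txt";
        Console.WriteLine(CountLetter(path, 'a'));
    }
}
```

Whether this is fast enough depends on the file and the disk, so it is worth measuring before adding threads back.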
