Question:
I iterate over the lines of a file and count the occurrences of the letter "a" in each line. I do this sequentially, asynchronously, and with multiple threads. Why does the multithreaded method give a lower count than the asynchronous or sequential one? For example, the sequential and asynchronous versions both give 65913, while the multithreaded version gives less, for example 65909. Could one thread be preventing another from completing? What is the problem, and how do I solve it?
ASYNCHRONOUS
public static async Task<int> ReadFileAsync(string path)
{
    int count = 0;
    using (StreamReader sr = File.OpenText(path))
    {
        string line = String.Empty;
        int i = 0;
        while ((line = await sr.ReadLineAsync()) != null)
        {
            if (!String.IsNullOrEmpty(line))
            {
                i++;
                MatchCollection matches3 = Regex.Matches(line, "а");
                count += matches3.Count;
                Console.WriteLine($"in line num {i} count symbol a equal: {matches3.Count}");
            }
        }
    }
    return count;
}
MULTI-THREAD
public static void ReadFileThread(string path)
{
    StreamReader sr = File.OpenText(path);
    string line = String.Empty;
    while ((line = sr.ReadLine()) != null)
    {
        if (!String.IsNullOrEmpty(line))
        {
            Thread thread = new Thread(new ParameterizedThreadStart(WorkWithLines));
            thread.Start(line);
        }
    }
}
private static void WorkWithLines(object line)
{
    MatchCollection matches3 = Regex.Matches(line.ToString(), "а");
    Console.WriteLine($"in line num count symbol a equal: {matches3.Count}");
    CountForThirdMethod += matches3.Count;
}
Answer:
The problem is the line
CountForThirdMethod += matches3.Count;
The += operator is not an atomic operation. It breaks down into two steps, a read and a write:
var newCountForThirdMethod = CountForThirdMethod + matches3.Count;
CountForThirdMethod = newCountForThirdMethod;
When several threads execute this at the same time, their steps can interleave like this:
// CountForThirdMethod == 0, matches3.Count == 1 in both threads
Thread1: var newCountForThirdMethod = CountForThirdMethod + matches3.Count; // 1
Thread2: var newCountForThirdMethod = CountForThirdMethod + matches3.Count; // 1
Thread1: CountForThirdMethod = newCountForThirdMethod; // 1
Thread2: CountForThirdMethod = newCountForThirdMethod; // 1
That is, both threads executed +=, but one of the two updates was lost: the total is 1 instead of 2. Wrap this line in a lock, or use Interlocked.Add.
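Both fixes can be sketched roughly as follows. This is a minimal illustration, not the asker's actual program: the class SafeCounter, the field _total (standing in for CountForThirdMethod), the lock object _gate, and the driver method CountChar are hypothetical names, and a plain character comparison replaces the Regex for brevity. The Join loop is also an addition. The question's code does not show the threads being waited on, and reading the total before every worker has finished would give a low count as well.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class SafeCounter
{
    // Shared total, standing in for CountForThirdMethod (hypothetical name).
    private static int _total;
    private static readonly object _gate = new object();

    // Option 1: Interlocked.Add performs the read-add-write as one atomic step.
    public static void AddWithInterlocked(int n) => Interlocked.Add(ref _total, n);

    // Option 2: the lock lets only one thread at a time run the += statement.
    public static void AddWithLock(int n)
    {
        lock (_gate)
        {
            _total += n;
        }
    }

    // Hypothetical driver: one thread per line, as in the question.
    public static int CountChar(IEnumerable<string> lines, char target, bool useLock)
    {
        _total = 0;
        var workers = new List<Thread>();
        foreach (string line in lines)
        {
            var worker = new Thread(() =>
            {
                // Count locally first, then publish the result once.
                int n = 0;
                foreach (char c in line)
                {
                    if (c == target) n++;
                }
                if (useLock) AddWithLock(n); else AddWithInterlocked(n);
            });
            workers.Add(worker);
            worker.Start();
        }
        // Wait for every worker before reading the total.
        foreach (Thread worker in workers)
        {
            worker.Join();
        }
        return _total;
    }
}
```

For a single integer counter, Interlocked.Add is usually the lighter choice; a lock becomes more useful when several pieces of shared state must be updated together.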
In general, though, you probably don't need multithreading here: the bottleneck will most likely still be reading from the file.
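For comparison, a single-threaded version has no shared state to protect at all. The sketch below is only one possible shape. SequentialCounter and its methods are hypothetical names, and it counts a single character with LINQ instead of using Regex as the question does.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SequentialCounter
{
    // Sums the per-line counts on one thread, so no synchronization is needed.
    public static int CountChar(IEnumerable<string> lines, char target)
        => lines.Sum(line => line.Count(c => c == target));

    // File.ReadLines streams the file lazily instead of loading it all at once.
    public static int CountCharInFile(string path, char target)
        => CountChar(File.ReadLines(path), target);
}
```

Whether threads would help at all is best checked by measuring this version first.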