C# – My implementation of a long-running task with progress, and its weaknesses

Question:

Situation: I have an ASP.NET MVC application with a button that starts generating a report. Generation can take a long time, up to several minutes. Once it starts, the user must see a modal window showing the progress, with the option to cancel. I implemented it like this.

private static readonly IList<DownloadTask> DownloadStates = new List<DownloadTask>();

public ActionResult Export()
{           
    var task = new DownloadTask(Guid.NewGuid());
    DownloadStates.Add(task);

    string zipFilename = string.Format("{0}.zip", task.Id);
    GenerateAsync(zipFilename, task);

    return Json(task.Id, JsonRequestBehavior.AllowGet);
}

This action is called when the report generation button is clicked. First, a task is created. Its class looks like this:

public class DownloadTask
{
    public DownloadTask() { }

    public DownloadTask(Guid id)
    {
        Id = id;
        State = DownloadState.NotStarted;
        CancellationTokenSource = new CancellationTokenSource();
    }

    public Guid Id { get; set; }

    public double CurrentProgress { get; set; }

    // task state: running, cancelled, error, completed, etc.
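    public DownloadState State { get; set; }

    // Set in the constructor; lets a request cancel the running task.
    // Needs "using System.Threading;" at the top of the file.
    public CancellationTokenSource CancellationTokenSource { get; private set; }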
}

The task is placed in the DownloadStates collection.

This is what the asynchronous GenerateAsync method looks like:

public async Task<Guid> GenerateAsync(string filename, DownloadTask currentTask)
{   
    await Task.Factory.StartNew(() =>
    {
        try
        {
            currentTask.State = DownloadState.Processing;

            foreach (var page in pages)
            {
                // update progress
                currentTask.CurrentProgress = // ...
                // process data
            }
            currentTask.State = DownloadState.Completed;
        }       
        catch
        {
            currentTask.State = DownloadState.Faulted;
            // handle the error
        }
    }).ConfigureAwait(false);
    return currentTask.Id;
}

So, clicking the button sends a request to generate a report. A task is created with a unique Guid and placed in the DownloadStates collection, and then the asynchronous GenerateAsync method is started. The server then returns the Guid of the created task to the client. Importantly, the asynchronous method is launched without await, fire-and-forget style. From then on, all communication with the generation process goes through the DownloadStates collection.

When the response with the task identifier arrives, the client starts a setInterval timer that queries the server every 200 ms for the task's progress and displays it in the modal window. Here is the action that returns the current progress:

public ActionResult Progress(Guid taskId)
{
    var task = DownloadStates.FirstOrDefault(x => x.Id == taskId);
    string state = "";
    bool isFaulted = false;
    bool isCancelled = false;

    if (task != null)
    {
        switch (task.State)
        {                           
            case DownloadState.Faulted:
                isFaulted = true;
                DownloadStates.Remove(task);
                break;
            case DownloadState.Processing:
                state = "processing";
                break;
            // ... other branches
        }
    }
    else
        isFaulted = true;

    return Json(new
    {
        progress = task != null && task.State != DownloadState.Completed ? task.CurrentProgress * 100 : 100,
        text = state, 
        isFaulted,
        isCancelled
    }, JsonRequestBehavior.AllowGet);
}

Here is what happens: while the task is running, the response carries its current progress (stored in the task inside the DownloadStates collection and updated by the asynchronous method). If an exception is thrown during the asynchronous method, isFaulted = true is returned; on receiving it, the client stops the timer, stops sending progress requests, and displays an error message. If the task has the Completed status, progress is set to 100%, the client timer also stops, and a request is made to download the generated report.

In a nutshell, that is how it works. The scheme does work, but I have little experience writing asynchronous code like this, so I doubt whether I did everything right. For example, I call the asynchronous GenerateAsync method without await so that I can return the new task's identifier to the client immediately instead of waiting for completion. As a result, the only way I can learn about an error is by recording the faulted state on the task in the controller's collection and reading it when the current progress is requested.

I would like to hear what more experienced programmers think about potential problems in my code: memory leaks, unfinished tasks, unhandled exceptions, a missing lock somewhere, possible blocking, or anything else. What are the weak points here (I am sure there are some), and what should be improved, especially given multi-threaded execution and the possibility of several tasks running at once? Thanks in advance!

P.S. Unfortunately, SignalR is not an option for reasons beyond my control, so I can't consider rebuilding everything on SignalR.

Answer:

Main problems

  • No locking around access to DownloadStates. List<T> is not thread-safe and must be synchronized manually.
  • Task.Factory.StartNew does not necessarily run the task on another thread: it schedules on TaskScheduler.Current. You should use Task.Run.
  • The only more or less reliable way to run background work inside ASP.NET is HostingEnvironment.QueueBackgroundWorkItem.
  • Your approach does not carry over well to production deployments with two or more web servers and, accordingly, two copies of the application.
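
To illustrate the Task.Run point, here is a minimal sketch rather than the asker's actual code. ReportRunner, RunAsync, pageCount, and reportProgress are hypothetical names, and the per-page work is left as a placeholder. It shows the work being pushed to the thread pool with Task.Run while honoring a CancellationToken.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ReportRunner
{
    // Hypothetical helper: processes pageCount pages on the thread pool,
    // reporting progress in the range (0, 1] after each page.
    public static Task RunAsync(int pageCount, Action<double> reportProgress, CancellationToken token)
    {
        // Task.Run always targets the default (thread-pool) scheduler,
        // whereas Task.Factory.StartNew uses TaskScheduler.Current.
        return Task.Run(() =>
        {
            for (int page = 1; page <= pageCount; page++)
            {
                // Ends the task in the Canceled state if cancellation was requested.
                token.ThrowIfCancellationRequested();

                // ... process one page here (placeholder) ...

                reportProgress((double)page / pageCount);
            }
        }, token);
    }
}
```

In the question's code, reportProgress could be something like p => task.CurrentProgress = p, and the token could come from the task's CancellationTokenSource.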

It is much more reliable not to run background tasks inside ASP.NET at all: put the tasks in a database and write a Windows service that picks them up, executes them, and reports progress back through the database.


In more detail:

List Synchronization

List<T> is backed by a plain array. That is, FirstOrDefault searches it by simply enumerating the elements by index from 0 to N-1.

Suppose you have started two tasks – 0 and 1.

  1. Task 0 has finished.
  2. A Progress request for Task 0 arrives and reaches the line with Remove.
  3. A Progress request for Task 1 arrives, enters the FirstOrDefault search, and reaches index 1.
  4. The Progress request for Task 0 removes its item from the list, shifting Task 1 down to index 0.
  5. What happens to the FirstOrDefault call in the Progress request for Task 1? Does it throw? Does it return null?
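
One way to avoid this race is to replace the shared List with a thread-safe collection keyed by task id. The following is only a sketch under that assumption: TaskRegistry and its method names are hypothetical, not part of the original code. It relies on ConcurrentDictionary, whose TryAdd, TryGetValue, and TryRemove calls are safe to use from several request threads at once.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical thread-safe replacement for the static List<DownloadTask>.
public class TaskRegistry<TTask>
{
    private readonly ConcurrentDictionary<Guid, TTask> _tasks =
        new ConcurrentDictionary<Guid, TTask>();

    // Returns false if a task with the same id is already registered.
    public bool Add(Guid id, TTask task)
    {
        return _tasks.TryAdd(id, task);
    }

    // Lookup by key; no enumeration, so a concurrent removal cannot disturb it.
    public bool TryGet(Guid id, out TTask task)
    {
        return _tasks.TryGetValue(id, out task);
    }

    // Returns false if the task was not present (for example, already removed).
    public bool Remove(Guid id)
    {
        TTask removed;
        return _tasks.TryRemove(id, out removed);
    }
}
```

With this in place, Progress would call TryGet(taskId, out task) instead of FirstOrDefault, and Remove(taskId) instead of DownloadStates.Remove(task). A plain lock around every access to the List would work as well; the dictionary just removes the need to remember the lock.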

QueueBackgroundWorkItem

ASP.NET runs under IIS, which has an application pool recycling mechanism. At some point, IIS can simply restart the process hosting the application. By default this happens every 29 hours; it can also be triggered by a memory limit or by inactivity. On recycle, an ordinary thread-pool task or any other background thread is simply killed, instantly and without notification.

A work item queued via QueueBackgroundWorkItem receives a shutdown notification (through its CancellationToken) and then has up to 90 seconds to react. Not super reliable, but better than nothing.
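
A rough sketch of how this might look follows. It assumes .NET Framework 4.5.2 or later with a reference to System.Web, and it only works inside an ASP.NET host. BackgroundReports, RunPages, and generatePage are hypothetical names standing in for the real report logic.

```csharp
using System;
using System.Threading;
using System.Web.Hosting; // HostingEnvironment lives here (System.Web.dll)

public static class BackgroundReports
{
    // Hypothetical work loop: returns how many pages were generated
    // before completion or before a shutdown was signalled.
    public static int RunPages(int pageCount, Action<int> generatePage, CancellationToken token)
    {
        int done = 0;
        for (int page = 0; page < pageCount; page++)
        {
            if (token.IsCancellationRequested)
            {
                // The app domain is shutting down: stop early so the
                // remaining grace period can be used to save state.
                break;
            }
            generatePage(page);
            done++;
        }
        return done;
    }

    // Registers the work with ASP.NET so that a recycle signals the token
    // instead of killing the thread without notice.
    public static void Start(int pageCount, Action<int> generatePage)
    {
        Action<CancellationToken> work = token => RunPages(pageCount, generatePage, token);
        HostingEnvironment.QueueBackgroundWorkItem(work);
    }
}
```

Keeping the loop in RunPages, separate from the registration call, makes it possible to exercise the cancellation behavior without an ASP.NET host.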

Problems with two web servers

If you have two back-end servers, you have two independent copies of the application and two separate DownloadStates lists. If the request that starts the task lands on one server and a progress request lands on the other, the user gets isFaulted.

The solution is to move the download queue into a database or any other shared storage (for example, a service with a queue).
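
As a very rough illustration of that idea, the controller could depend on a small storage abstraction instead of a static list. Everything below is hypothetical: IReportTaskStore and its members are invented for this sketch, and the in-memory class is only a stand-in for a real database-backed or queue-backed implementation shared by all servers.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical contract: every web server and the worker service
// would read and write progress through the same shared store.
public interface IReportTaskStore
{
    void SaveProgress(Guid taskId, double progress);

    // Returns null when the task id is unknown to the store.
    double? GetProgress(Guid taskId);
}

// In-memory stand-in for tests or a single server; a production
// implementation would persist to a database or similar shared storage.
public class InMemoryReportTaskStore : IReportTaskStore
{
    private readonly ConcurrentDictionary<Guid, double> _progress =
        new ConcurrentDictionary<Guid, double>();

    public void SaveProgress(Guid taskId, double progress)
    {
        _progress[taskId] = progress; // insert or overwrite
    }

    public double? GetProgress(Guid taskId)
    {
        double value;
        return _progress.TryGetValue(taskId, out value) ? value : (double?)null;
    }
}
```

Because the Progress action would only see the interface, swapping the in-memory version for a shared one would not require changing the controller.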
