Async Code in Dotnet
Why a deeper understanding of TPL is worth your time, and why async is not always the best choice
Threading is hard, so much so that Microsoft created the Task Parallel Library (TPL) to make it a bit easier. By now you’ve likely encountered the async/await keywords. The rules seem simple: whenever something returns a Task<T> (or Task), you add the await keyword to wait for it to finish without blocking the thread (optionally getting a value back).
If you stopped your learning there with async/await, you’d be missing out on a lot of insight into the underlying code, and you could end up making your code slower than the synchronous equivalent. Here’s an excerpt from Microsoft about TPL:
…if a loop performs only a small amount of work on each iteration, or it doesn't run for many iterations, then the overhead of parallelization can cause the code to run more slowly. Furthermore, parallelization, like any multithreaded code, adds complexity to your program execution. Although the TPL simplifies multithreaded scenarios, we recommend that you have a basic understanding of threading concepts, for example, locks, deadlocks, and race conditions, so that you can use the TPL effectively.
Let’s look at some of the basics of TPL next.
What exactly is TPL?
TPL is Microsoft’s wrapper around making code asynchronous. However, ask yourself: why should we make any code async at all? Every line of code in your app is executed by a thread, and a system has a limited number of threads to use for code execution. Therefore we should treat them as a valuable resource. What we want is for threads not to sit waiting on an external process to complete before continuing. We want our threads always doing something so long as there is work to be done. To get the most out of our threads, they must be “released” from waiting and doing nothing. When the external system we were waiting on finally responds, a thread must pick the work back up and carry it to completion.
You can of course manage your threads yourself, but this would end up as a bunch of boilerplate code in every app you ever write, leading to an eventual NuGet package you’d include in every project. Fortunately, Microsoft has made async coding a first-class paradigm in Dotnet, and they called it TPL.
When should you use sync vs async code?
Using async code has overhead and shouldn’t be used in all cases. When the await keyword is added to your code, the compiler does “special” things to it. It essentially rewrites your method into a generated state machine that (automatically) releases the thread and resumes the work later on. Cancellation and timeouts are not automatic, though: you opt in, typically by passing a CancellationToken or by configuring a timeout on the client that talks to the foreign system.
So you should not use async code for everything as synchronous code does not have the extra overhead. You must be sure that the code you are writing will benefit from asynchronous management. The following list illustrates two classic examples of when to use async code instead of sync:
You are talking to an external system. This is typically an HTTP call, talking to your DB, queuing a message, receiving a message, etc.
You need to parallelize the work. Synchronous code is a single thread doing work in series. By parallelizing things, an app can leverage several threads to do the work (more on this later). Parallel code has several gotchas that will surely make their presence known the first time you dive in.
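The two cases above can be sketched roughly as follows. FetchLengthAsync is a hypothetical stand-in for an HTTP or database call (simulated with Task.Delay), and SumSquaresInParallelAsync is an illustrative name for CPU work that is split across thread pool threads with Task.Run.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenToUseAsyncSketch
{
    // Case 1: talking to an external system (hypothetical, simulated with a delay).
    public static async Task<int> FetchLengthAsync(string key)
    {
        await Task.Delay(100); // no thread is held while we "wait on the network"
        return key.Length;
    }

    // Case 2: parallelizing CPU work across several thread pool threads.
    public static async Task<long> SumSquaresInParallelAsync(int[][] chunks)
    {
        // Each chunk is queued to the thread pool as its own task.
        Task<long>[] pending = chunks
            .Select(chunk => Task.Run(() => chunk.Sum(n => (long)n * n)))
            .ToArray();

        // Wait for every chunk without blocking the calling thread.
        long[] partials = await Task.WhenAll(pending);
        return partials.Sum();
    }
}
```

For tiny inputs like the ones you would use to try this out, the parallel version will very likely be slower than a plain loop, which is exactly the overhead the Microsoft excerpt warns about.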
For most other cases, synchronous code is what you want to be using and is therefore the default. For me it’s easier to think of threads as humans. Let’s first look at sync code with human threads:
Fred is our worker thread.
He is given work to do and does it.
Fred needs another department to grab some information and makes a phone call to a provider.
Fred waits on the line and doesn’t do any work until the information is provided to him.
Fred then completes his task.
As you can see, Fred had to wait on something else and was not productive during that time.
In async code, the flow would have happened like the following:
Fred is our worker thread.
He is given work to do and does it.
Fred needs another department to grab some information and makes a phone call to a provider.
Fred goes back and gets more work while the person on the phone is getting information.
When the information is ready, Sally comes along and accepts the information from the provider.
Sally then completes the task.
Async code is not magical. In Dotnet, TPL is a means to make sure your thread pool is maximally utilized.
Synchronization Context
You may not realize it, but ASP.NET uses parallelization for web requests. You can think of it (loosely) as one thread per request. This is true in a synchronous context; however, when we introduce async (or deliberate threading), we have to be aware that multiple threads will be used to fulfill the request. Let’s look at a synchronous example again, but in a web context:
A web request is received.
ASP.NET uses a single thread to handle the entire request and returns a result. Any User information is stored on the thread as context so that other classes may inquire who the user is.
The work is complete.
Please note the example above is unrealistic as web requests will end up using an unknown number of threads as we see next.
Nothing mind-bending so far. However, let’s look at what happens if any async code is used during that request:
A web request is received.
ASP.NET uses a single thread to handle the beginning of the request.
Any User information is stored on the thread as context so that services may inquire who the user is.
The execution thread (the original one) encounters an async block of code. The original thread is released and is able to serve another bit of work. The “context” of the thread is stored away (includes User information).
The async code is likely making a call to an external system, which is why there’s little incentive to hold our thread and wait. If a timeout or cancellation token was configured for the call, the wait can be abandoned when it has gone on too long.
When the execution resumes (let’s say the async block queried the DB), ASP.NET grabs a free thread from the pool and continues the code after the await keyword. The thread is given the context that was on the original thread but we’ve effectively changed threads. You should bet good money you’ll get a different thread than what was originally used.
If we hit another async block, the previous three steps repeat.
When complete, the last thread returns the result and the work is completed.
The takeaway here is that in an async environment, we cannot make any guarantees about which thread will be used after an async block is encountered. There is a slim chance that the original thread is the one that picks the work back up after the async block. To pass information from one thread to another, Microsoft introduced the Synchronization Context. Developers often assume they can keep using the original thread after the async block. That is not something you can rely on. Rather, the only thing that gets passed on is the “context”. In this way we can know who the User is during the entire web request and not worry about which threads did the actual work.
This differs in a console app. By default a console app has no Synchronization Context, so the code after an await simply continues on whichever thread pool thread is available. This is a fundamental thing to know about TPL: app type matters, as each one can have its own Synchronization Context. ASP.NET just so happens to include a special one for web apps.
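If you want to see the thread switch for yourself, a small probe like the one below can help. It is only a sketch: ContextSketch and ObserveAsync are names made up for this example, and the thread ids you get back will vary from run to run and between app types.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ContextSketch
{
    // Reports the managed thread id before and after an await, plus whether
    // a SynchronizationContext was installed when the method started.
    public static async Task<(int Before, int After, bool HadContext)> ObserveAsync()
    {
        bool hadContext = SynchronizationContext.Current != null;
        int before = Environment.CurrentManagedThreadId;

        await Task.Delay(50); // the current thread is released here

        // Whichever thread resumes the method reports its id here.
        int after = Environment.CurrentManagedThreadId;
        return (before, after, hadContext);
    }
}
```

In a plain console app I would expect HadContext to be false and the two ids to frequently differ, but neither id is guaranteed, which is the whole point.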
Dependency Injection
Remember above when I said a web request can be thought of as a single thread? That’s true if you have only synchronous code. As you can see, with TPL/async you can have many threads per web request. This is why, when using dependency injection (DI), we need a “scope” that includes all threads used during a request. DI container makers are aware of this and often provide a “.Scoped()” lifecycle or a “Per-Request” lifecycle.
Entity Framework, NHibernate, etc. are not thread-safe, and scoping lets us use those non-thread-safe classes with TPL/async. It would be really hard to keep track of threads without a DI container.
Async sprawl
Async is useful and worth it in the correct scenarios. However, there is something known as “async sprawl”. This is an insidious side effect where everything in the call chain needs to be marked as async and return Task<T> (or Task). The call chain typically starts at the Main method or, in web contexts, the controller. Retrofitting existing systems can involve a lot more changes than you bargained for. Avoid the urge to use hacks to get around this; they will just leave you with very hard-to-debug problems.
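To make the sprawl concrete, here is a small sketch of a three-layer call chain. All of the names are hypothetical; the point is only that once the lowest layer returns a Task, every caller above it ends up async as well.

```csharp
using System.Threading.Tasks;

public static class SprawlSketch
{
    // Lowest layer: hypothetical data access, simulated with a delay.
    public static async Task<string> LoadNameAsync(int id)
    {
        await Task.Delay(10);
        return $"user-{id}";
    }

    // Middle layer: must be async in order to await the layer below.
    public static async Task<string> BuildGreetingAsync(int id)
    {
        string name = await LoadNameAsync(id);
        return $"Hello, {name}!";
    }

    // Top layer (think controller action or Main): async again.
    public static async Task<string> HandleRequestAsync(int id)
    {
        string greeting = await BuildGreetingAsync(id);
        return greeting.ToUpperInvariant();
    }
}
```

Turning LoadNameAsync from a synchronous method into an async one is a one-line change on paper, but it ripples through both callers.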
Blocking Code
It doesn’t make a lot of sense to use async only to block a thread. A blocking call means you’ve prevented the thread from being “dismissed” from the call and forced it to wait. If you’re going to block a thread (on purpose), then you may as well use synchronous code and avoid the overhead async incurs. It’s best not to block threads and to let Dotnet manage them efficiently for you. Blocking threads can starve the thread pool or deadlock, and either one may grind your system to a halt. If you find yourself using code like the following, you’re likely blocking threads:
.Wait()
.WaitAll() (use .WhenAll() instead)
.Result
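Here is a rough side-by-side of the blocking style and the awaiting style. WorkAsync is a hypothetical helper; the blocking version is shown only so you can recognize it, not as something to copy.

```csharp
using System.Linq;
using System.Threading.Tasks;

public static class BlockingSketch
{
    // Hypothetical async operation, simulated with a delay.
    public static async Task<int> WorkAsync(int value)
    {
        await Task.Delay(50).ConfigureAwait(false);
        return value;
    }

    // Blocking: the calling thread sits idle until both tasks finish.
    public static int SumBlocking()
    {
        Task<int> a = WorkAsync(1);
        Task<int> b = WorkAsync(2);
        Task.WaitAll(a, b);
        return a.Result + b.Result; // already completed, but still the blocking API
    }

    // Awaiting: the calling thread is released while both tasks are in flight.
    public static async Task<int> SumAwaitingAsync()
    {
        int[] results = await Task.WhenAll(WorkAsync(1), WorkAsync(2));
        return results.Sum();
    }
}
```

Both return the same value. The difference is what the calling thread is doing in the meantime.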
Gotchas
There are a few gotchas in code using async. Learn from my (and others’) past mistakes:
Don’t return void in an async method; return Task instead.
Don’t use hacks you’ll find on the internet to make async code run synchronously. These often involve using .Result, .GetAwaiter().GetResult(), etc. While this works in a lot of cases, you are flirting with deadlocks.
.NET Framework has situations where it doesn’t support async itself. I encountered this in an Authorize attribute. Try to avoid hacks by making a sync equivalent if possible.
When using fire-and-forget (e.g. Task.Run(() => {})), keep in mind that you have basically spawned orphaned work, and if that work throws an exception, the outer parent context won’t be able to catch it. Also, your app/request will likely end before the work completes, creating interesting scenarios.
When using TPL, you may not realize you’re working with non-thread-safe collections. For instance, the simple List<T> is not thread-safe, and you may need to use things like a ConcurrentBag<T> when multiple threads might add to a collection.
Parallelized code may behave in different ways due to race conditions, so make sure your code can handle any thread finishing before another. You will want to know about the lock mechanism for synchronous code and SemaphoreSlim for async code.
If you don’t need the context to be restored after the await, you can call .ConfigureAwait(false) on the Task, which means the continuation does not have to resume on the captured context.
Don’t stick any information directly on a thread thinking you’ll get it back. For instance, if you use ThreadLocal, you may be in for a surprise.
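Two of the gotchas above, shared collections and locking in async code, can be sketched like this. The class and method names are made up for the example; ConcurrentBag<T> and SemaphoreSlim are the real types mentioned in the list.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CollectionSketch
{
    // Many tasks add to a ConcurrentBag<T>; no explicit locking is needed.
    public static async Task<int> FillBagAsync(int count)
    {
        var bag = new ConcurrentBag<int>();
        await Task.WhenAll(
            Enumerable.Range(0, count).Select(i => Task.Run(() => bag.Add(i))));
        return bag.Count;
    }

    // A plain List<T> guarded by a SemaphoreSlim. Unlike lock, the wait
    // can be awaited, so no thread is blocked while queuing for the gate.
    public static async Task<int> FillListAsync(int count)
    {
        var list = new List<int>();
        using var gate = new SemaphoreSlim(1, 1);
        await Task.WhenAll(
            Enumerable.Range(0, count).Select(i => Task.Run(async () =>
            {
                await gate.WaitAsync();
                try { list.Add(i); }
                finally { gate.Release(); } // always release, even on failure
            })));
        return list.Count;
    }
}
```

If you remove the gate from FillListAsync, the count may come back short or the list may throw, and it may not happen on every run, which is what makes these bugs so unpleasant.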
The End
Hopefully you have a lil more understanding of how async/threading/TPL work with ASP.NET and consoles. TPL is a very robust library that needs a lil context to make the most of it.
Happy coding!