Understanding Asyncio

Asynchronous programming in python is among the most interesting but also the most daunting of things to learn. Not only in python, but generally across languages that support the paradigm. Personally, it took me about a year (the whole of last year) trying to learn and re-learn the various concepts that were involved, and I remember how frustrated I was throughout the journey. My hope is that I can help shorten your study cycle by trying to summarize the main concepts that I learned along the way and try to make them more palatable for you to understand. Note that this is not an asyncio tutorial but helps you understand its need and its implementation. I try to provide links to the relevant source that I think will help you master asyncio more.

Introduction

According to Wikipedia

Asynchrony, in computer programming, refers to the occurrence of events independent of the main program flow and ways to deal with such events.

In simple terms, this means that the program executes tasks according to which task is ready to run and not following a strict order of tasks. To understand this better, let's use an analogy of the school cook serving lunch to the students that have lined up in a queue. The lunch menu has rice, matooke (bananas), fish, beef and Gnut sauce. The students can pick whatever they wish from the menu. If one student being served by the cook could not make up their mind on if to have the fish or the beef, then they would be wasting time in the queue and blocking the other students from being served. To unblock the queue, the other students need to be allowed to cut in front of the thinking student who is still trying to make up his mind so that they can be served by the cook and later the cook would serve the thinking student after he's made up his mind. This is the same case when it comes to software.

The main reason for asynchronous programming is to speed up the execution of a program or programs by allowing the computer to run multiple execution contexts at the same time and creating non-blocking execution environments. The execution contexts can be an operating system process (OS process), operating system thread (OS thread) or green thread

Threads and Processes

Again Wikipedia defines processes as

In computing, a process is the instance of a computer program that is being executed by one or many threads.

Basically, a process is a uniquely running program. A single program can be run in many processes, just like one can have many windows of their text editor open. Depending on the operating system, a process can also have the ability to start other processes of different programs or even launch child processes from itself. In python, we can work with multiple processes using the standard library multiprocessing module.

Depending on the operating system, programs are implemented as a collection of threads. According to Wikipedia

In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system

Threads are like internal virtual machines inside a program that run tasks. Theoretically, a program can start as many threads as it wishes. In practice, the number of threads is limited by the memory available.

Both processes and threads were designed to solve the problem at hand, which is to enable us to run multiple tasks at the same time. However, they both come at an expensive cost. First of all, they both take up a huge amount of space and thus limiting the number of execution contexts we can have. In the case of a web server, if we were to start a new process or new thread for every client that connected to our server, the performance of the server will start slowing down by the time we reach 2000 processes or threads on some operating systems since memory would be eaten up immensely. Also, the barrier between process communication exerts a heavy toll on them since there's an overhead for 2 processes to communicate. And for threads however much they do not have the communication barrier like processes, the context switching between threads introduces a significant overhead. context switching is basically what the operating system does while juggling multiple threads on a single CPU or core. It has to save the state of the currently running thread in memory and then load a new thread to run for some time, then again save that thread's context, stop it and load a new thread and keep switching between them like that. All of this has significant overhead.

And when it gets to multithreading in python, it is heavily impacted by the global interpreter lock (GIL). The GIL only allows one python thread to run at a time, which hinders optimizations that might be provided by modern operating systems for multithreaded programs. And since one python thread can be run at a time, it means python will cut short a thread right in the middle of its important execution such as validating a user or reading from a database. This is called preemptive multithreading and it's the cause of expensive context switches. It also might lead to some software faults if not handled carefully

Green threads

Green threads, or proto threads as they might be known in other terms, are execution contexts that are implemented in a language's runtime and try to mimic the working of OS-level threads. They run in the same process and within the same thread of execution provided by the OS. Due to their extremely lightweight nature, hundreds of thousands of green threads can be typically run by a language runtime.

Implementation

To implement green threads in any language runtime, we need to first know how the operating system handles its threads, why they're so expensive and how we can implement lightweight versions in our runtime. First of all, we've already outlined the fact that every OS thread takes up a significant amount of memory to save its context, and also the OS Scheduler takes a lot of time switching between various threads. There are more issues caused by synchronization techniques such as mutexes and semaphores which cause lock contention, lock starvation and other disadvantages of using locks

Knowing that, we go ahead to design our green threads with two major goals in mind, that is to say, they should be very memory efficient hence lightweight, and have fast context switching. Because green threads are implemented with memory-efficient language machinery we can have hundreds of thousands of them run in a single process. And given the fact that they run in the same process and same thread, we eliminate the need for context switching by the OS and handle it ourselves, which makes it extremely fast.

Green threads in python

Before the advent of asyncio, significant efforts in the python world were done to create libraries that provide green threads. One of the major pioneers was greenlet which implemented an industry-used framework to work with green threads. The twisted project came in later which also introduced an easy-to-use industry framework to build web servers and services using python green threads. coming to today, lots of frameworks have been written to help use python green threads, and asyncio has created the standard for working with these green threads.

Implementing python green threads with generators

So let us walk the journey of the evolution of green threads in python and how they were implemented over time up to this day.

First of all, we need to ask ourselves how we can implement green threads. From what we had mentioned earlier in the previous section, green threads should be able to mimic OS threads by having the ability to run a task and also allowing for context switching since all the green threads should share a single process to execute. One significant feature of the python language has qualities that might help us implement this, and those are generators. Python generators were first standardized in PEP 255: Simple Generators as early as the year 2000 and they provide a way of creating iterators that can be executed on demand. This implicitly means that generators can run a task since they are functions in themselves, and also can save their state to be resumed later. So they can be paused and resumed, a key feature to help us build green threads since we need our execution environment to pause a running task and resume it later without impact.

Generators can also not be paused by the outside world but can only be paused from within after yielding their value. This means that the green threads we write tell the execution environment when to be paused and not the execution environment abruptly deciding when to stop the thread. So if the green is in the middle of an important database transaction or memory access, it can run without interruption until it feels safe to yield its control back to the execution environment. This is called cooperative multitasking.

Alright, enough with the jibberish. Let's try to implement a green thread system in python. To implement the green thread system, we need two components: The green thread itself and the executor running the green threads in a loop. The executor is commonly known as the event loop. But before we dig deep into the implementation, let us first have a brief look at generators in python.

# A simple generator that yields values from a list of numbers
nums = [ num for num in range(20)]

def num_generator (num_list):
    index = 0                     #index of elements in the number list
    max_index = len(num_list) - 1    #the maximum index of elements in a list
    while (index <= max_index):
        temp_index = index        #hold the current index
        index += 1                #increment to next index
        yield num_list[temp_index]#yield value at current index

def printer ():
    for num in num_generator(nums):
        print(num)

printer()

The above example shows us the basic structure of a typical generator. Any functional generator should have a loop and a yield statement. Our num_generator generator function yields one number at a time when iterated over in the printer function. If we wanted to have more control of the execution of the generator, we would use the built-in next function which tells the generator to yield the next value and pause.

# A simple generator that yields values from a list of numbers
nums = [ num for num in range(20)]

def num_generator (num_list):
    index = 0                     #index of elements in the number list
    max_index = len(num_list) - 1    #the maximum index of elements in a list
    while (index <= max_index):
        temp_index = index        #hold the current index
        index += 1                #increment to next index
        yield num_list[temp_index]#yield value at current index

def printer ():
    max_times = 10        #maximum number of times to call next
    counter = 0
    num_gen = num_generator(nums)
    while (counter <= max_times):
        num = next(num_gen)
        counter += 1
        print(num)

printer()

We've updated the code in the printer function to call next on the num_gen generator and terminate when the counter is greater than max_times.

Let us now have three instances of the generator function become our threads and then create an executor to run them.

# A simple generator that yields values from a list of numbers
nums = [ num for num in range(20)]

def num_generator (num_list):
    index = 0                     #index of elements in the number list
    max_index = len(num_list) - 1    #the maximum index of elements in a list
    while (index <= max_index):
        temp_index = index        #hold the current index
        index += 1                #increment to next index
        yield num_list[temp_index]#yield value at current index

def executor ():
    # create a list of three green threads
    green_threads = [num_generator(nums) for _ in range(3)]
    while len(green_threads):        # while green threads still in list
        for thread in green_threads: # execute each thread in list
            num = next(thread)
            if (num >= 10):
                green_threads.remove(thread)
            else:
                print(num)

executor()

In the above code snippet, the threads in the green_thread list are run concurrently and removed if they yield a value greater or equal to ten. This is a simple demonstration of how to use a list of generators and an evaluation loop as a green threads system, but this is not what happens in practice

Generators as coroutines

We have taken a glimpse at generators in the above section and how we can work with them as green threads. Looking at how we implemented our system earlier, it looked like we had a set of running routines (functions) that participated in the cooperative multithreading model in that one routine yielded control back to the executor to run the next routine after it was done with its temporary execution. These cooperatively running routines are called coroutines.

Generators feel like coroutines, but not as quite. This is because we can currently only retrieve data from them and cannot send data inside them for processing. They feel more like data producers, and indeed they are powerful data producers. We can also not explicitly close a generator unless it raises a StopIterationError on its own. This is important since we need to have full execution of our green threads and stop their execution in case something goes wrong. So to use them as coroutines, we need a key ingredient which is the ability to interact with these coroutines as they are running, probably to change their execution state or feed in more data for processing and more.

For that PEP 342: Coroutines via enhanced generators was devised in 2005. It extended the simple python generators from PEP 255 into more usable coroutines by adding the ability to send data into a running generator using now the newly added send method, send and error into the generator using the throw and also stop it if needed using the close method. With these changes, it means now we can implement proper green threads using coroutines as described in PEP 342. Note that the send method is not meant to send large chunks of data into a running generator, but you can use it to send control statements to change the execution state of a generator in case it was implemented as a finite state machine. Also if we call the send method with None as the argument, we're just telling the coroutine to execute in the same way we'd use the next built-in function. For fear of an exceedingly long blog post, I'll not place examples for this, but I'll add a link to a very detailed series of youtube videos by a python core developer explaining the internals quite simply and in detail.

Factoring out the generator

It became pretty apparent after the prominent use of generator-based coroutines that developers needed to factor out the "yield" statements from the coroutines to avoid redundancy. This was because generators could only yield to their immediate caller and it was getting hard to reuse some parts of code across different codebases. PEP 380: Syntax for delegating to a sub-generator was devised in 2009 to solve this problem, and it introduced the use of "yield from" to extract a generator part of a coroutine and place it in a sub-generator which would let the sub-generator be used a cross different modules and libraries. For brevity, more detail on the use of "yield from" can be read in this article.

Async, Await and Asyncio

If you followed through the evolutionary journey of the python coroutine stack, you're now at the final stage of the evolution cycle. In the previous section of Factoring out the generator, we talked about the introduction of the "yield from" statement and how it was used to factor out sub-generators to create reusable code. Well, it turns out that "yield" and "from" are two independent python keywords that are fundamentally different in how they are interpreted by python developers, and having them work together in this way was getting confusing. Not to mention, the industry had begun standardizing a given set of keywords to use for coroutines, namely "async" and "await statements. To sort out the confusion and also meet industry standards, PEP 492: Coroutines with async and await syntax. The PEP replaced the use of "yield from" with "await", and also mandated that every function that has an "await" keyword inside its block should be marked with an "async" keyword. Fundamentally, the async-await coroutines are still generators, and they have the same API as generators do.

With the async and await syntax established, the asyncio library was also standardized in PEP 3156: Asynchronous IO support rebooted, the "asyncio" Module after much work on the Tulip project by Guido van Rossum himself

Conclusion

This blog post was meant as a study of the origin and importance of asynchronous programming, especially in python, and how it evolved with time. It's not meant as a tutorial for the asyncio library and writing async code, but rather to answer a few questions about why you should take time to learn asynchronous programming. There are lots of tutorials out there to help you get started with asyncio and I wouldn't wish to duplicate the efforts, I'll just add links to what I feel is the best material out there for learning asyncio.

This blog post is but a mere fraction of the entire plethora of knowledge out there about asynchronous programming in python, but I do hope it helps you answer the most fundamental questions that I also had to ask while starting out my journey of mastering asyncio.