Today, while debugging an issue with Ray, I ran into a problem caused by a special behavior of Python’s asyncio.run_coroutine_threadsafe that I had overlooked. The official documentation does not explicitly mention it, so I only understood what was happening after a long debugging session and a look at the CPython source code. I am documenting the issue here.
According to the official documentation, this function is used to run a coroutine in an event loop on another thread, and it returns a Future object.
Based on the documentation, we can quickly write a toy example:
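A minimal sketch of such an example, consistent with the description and output that follow. The names f and start_loop appear in the traceback later in this post; the rest of the layout is an assumption.

```python
import asyncio
import threading


async def f():
    print("Inside coroutine f()")
    raise ValueError


def start_loop(loop):
    # Run the event loop in this (background) thread until it is stopped.
    asyncio.set_event_loop(loop)
    loop.run_forever()


def main():
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=start_loop, args=(loop,))
    thread.start()

    # Submit the coroutine to the loop running in the other thread.
    future = asyncio.run_coroutine_threadsafe(f(), loop)
    try:
        future.result()
    except Exception as e:
        print(f"Caught exception: {e!r}")
    finally:
        print("Stopping loop")
        # loop.stop() must be scheduled from the loop's own thread.
        loop.call_soon_threadsafe(loop.stop)
        thread.join()
        loop.close()


if __name__ == "__main__":
    main()
```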
The main function creates a thread and an event loop, then submits the coroutine f to that event loop using asyncio.run_coroutine_threadsafe. It then retrieves the result with future.result(), handles exceptions, and finally stops the event loop. Everything seems to work fine.
The output is:
Inside coroutine f()
Caught exception: ValueError()
Stopping loop
Now, here is the question: what happens if the coroutine f raises SystemExit instead of ValueError?
According to the official documentation, SystemExit inherits from BaseException rather than Exception. So would it be enough to change except Exception to except BaseException in main? If you make both changes (raising SystemExit in f and catching BaseException) and run the program again, it prints Inside coroutine f() and then hangs indefinitely.
We need to check the source code of asyncio.run_coroutine_threadsafe to understand what is happening. Here is the relevant part:
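The sketch below paraphrases the function as it appears in Lib/asyncio/tasks.py in CPython 3.12, with imports added so that it runs on its own. The name run_coroutine_threadsafe_sketch is hypothetical, and futures._chain_future is a private helper that may differ between versions, so consult the linked source for the exact text.

```python
import asyncio
import concurrent.futures
from asyncio import coroutines, futures


def run_coroutine_threadsafe_sketch(coro, loop):
    """Paraphrase of asyncio.run_coroutine_threadsafe (CPython 3.12)."""
    if not coroutines.iscoroutine(coro):
        raise TypeError("A coroutine object is required")
    future = concurrent.futures.Future()

    def callback():
        try:
            # Wrap the coroutine in a Task and mirror its outcome onto `future`.
            futures._chain_future(asyncio.ensure_future(coro, loop=loop), future)
        except (SystemExit, KeyboardInterrupt):
            # Re-raised as-is; `future` is left untouched.
            raise
        except BaseException as exc:
            # Any other failure is reported through `future` before re-raising.
            if future.set_running_or_notify_cancel():
                future.set_exception(exc)
            raise

    # Schedule the callback on the loop that runs in the other thread.
    loop.call_soon_threadsafe(callback)
    return future
```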
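This function treats SystemExit and KeyboardInterrupt specially: it re-raises them directly, whereas for any other exception it calls future.set_exception before re-raising. The rest of the asyncio machinery lets these two exceptions through in the same way, so the SystemExit propagates out of loop.run_forever() and stops the loop before the outcome is ever copied to the returned future. As a result, set_exception is never called on that future, and future.result() never gets a result.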
Here is a modified version of the original program:
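A sketch of what such a modified version could look like, written to match the output shown next. The one-second timeout and the exact print statements are assumptions chosen to reproduce that output.

```python
import asyncio
import concurrent.futures
import threading


async def f():
    print("Inside coroutine f()")
    raise SystemExit


def start_loop(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()


def main():
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=start_loop, args=(loop,))
    thread.start()

    future = asyncio.run_coroutine_threadsafe(f(), loop)
    try:
        # Use a timeout so the main thread does not block forever.
        future.result(timeout=1)
    except concurrent.futures.TimeoutError:
        print("Timeout")
        print(f"Future done? {future.done()}")
        print(f"Future cancelled? {future.cancelled()}")
        print(f"Loop alive? {loop.is_running()}")
        print(f"Thread alive? {thread.is_alive()}")
    except BaseException as e:
        print(f"Caught exception: {e!r}")
    finally:
        print("Stopping loop")
        loop.call_soon_threadsafe(loop.stop)
        thread.join()
        loop.close()


if __name__ == "__main__":
    main()
```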
The output is:
Inside coroutine f()
Timeout
Future done? False
Future cancelled? False
Loop alive? False
Thread alive? False
Stopping loop
Task exception was never retrieved
future: <Task finished name='Task-1' coro=<f() done, defined at /home/mortalhappiness/test/test.py:5> exception=SystemExit()>
Traceback (most recent call last):
File "/home/mortalhappiness/miniforge3/envs/test/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
self.run()
File "/home/mortalhappiness/miniforge3/envs/test/lib/python3.12/threading.py", line 1012, in run
self._target(*self._args, **self._kwargs)
File "/home/mortalhappiness/test/test.py", line 11, in start_loop
loop.run_forever()
File "/home/mortalhappiness/miniforge3/envs/test/lib/python3.12/asyncio/base_events.py", line 641, in run_forever
self._run_once()
File "/home/mortalhappiness/miniforge3/envs/test/lib/python3.12/asyncio/base_events.py", line 1986, in _run_once
handle._run()
File "/home/mortalhappiness/miniforge3/envs/test/lib/python3.12/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/home/mortalhappiness/test/test.py", line 7, in f
raise SystemExit
SystemExit
This behavior is a bit tricky. Both future.done() and future.cancelled() return False, and calling future.result() or future.exception() without a timeout blocks forever. However, if you do not call them, you will see a warning stating that a task exception was never retrieved. The output also shows that both the event loop and its thread have died.
Conclusion
If you use asyncio.run_coroutine_threadsafe to execute a coroutine that raises SystemExit or KeyboardInterrupt, calling future.result() or future.exception() without a timeout will hang indefinitely, because the returned future never receives a result or an exception. Additionally, the event loop and the thread running it will terminate, but the main thread will remain alive.
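One possible way to avoid the hang, sketched under the assumption that you control how the coroutine is submitted, is to convert these two exceptions into an ordinary Exception inside the coroutine so that the outcome reaches the future as usual. The names CoroutineExit, guarded, and run_guarded_demo are hypothetical and not part of asyncio.

```python
import asyncio
import threading


class CoroutineExit(Exception):
    """Hypothetical stand-in for SystemExit/KeyboardInterrupt raised in a coroutine."""


async def guarded(coro):
    # Await the coroutine directly so its exception surfaces here,
    # then re-raise it as a regular Exception subclass.
    try:
        return await coro
    except (SystemExit, KeyboardInterrupt) as exc:
        raise CoroutineExit(repr(exc)) from exc


async def f():
    raise SystemExit


def run_guarded_demo():
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever)
    thread.start()
    try:
        future = asyncio.run_coroutine_threadsafe(guarded(f()), loop)
        try:
            future.result(timeout=5)
        except CoroutineExit as exc:
            return f"Caught: {exc}"
    finally:
        loop.call_soon_threadsafe(loop.stop)
        thread.join()
        loop.close()


if __name__ == "__main__":
    print(run_guarded_demo())
```

Note that this changes the semantics: the loop thread keeps running instead of exiting, so it only fits cases where swallowing the exit request on that thread is acceptable.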