Today, while debugging an issue with Ray, I ran into a problem because I had not noticed a special behavior of Python’s asyncio.run_coroutine_threadsafe. The official documentation does not explicitly mention it; you have to read the CPython source code to understand what is happening, which led to a long debugging session. I am documenting the issue here.
According to the official documentation, this function submits a coroutine to an event loop running in another thread, and it returns a concurrent.futures.Future object.
Based on the documentation, we can quickly write a toy example:
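A minimal sketch of such a toy example, reconstructed from the description and output in this post rather than copied from the original listing. The helper name start_loop matches the traceback shown later; the return value of main is an assumption added only to make the sketch easy to check.

```python
import asyncio
import threading


async def f():
    print("Inside coroutine f()")
    raise ValueError


def start_loop(loop):
    # Runs in the worker thread; blocks until loop.stop() is called.
    loop.run_forever()


def main():
    loop = asyncio.new_event_loop()
    t = threading.Thread(target=start_loop, args=(loop,))
    t.start()

    # Submit f() to the event loop that is running in the other thread.
    future = asyncio.run_coroutine_threadsafe(f(), loop)
    caught = None
    try:
        future.result()
    except Exception as e:  # the clause discussed in the text
        print(f"Caught exception: {e!r}")
        caught = e
    finally:
        print("Stopping loop")
        loop.call_soon_threadsafe(loop.stop)
        t.join()
    return caught  # assumption: returned only so the sketch can be tested


if __name__ == "__main__":
    main()
```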
The main function creates a thread and an event loop, then submits the coroutine f to that event loop using asyncio.run_coroutine_threadsafe. It then retrieves the result with future.result(), handles exceptions, and finally stops the event loop. Everything seems to work fine.
The output is:
Inside coroutine f()
Caught exception: ValueError()
Stopping loop
Now, here is the question: what happens if the coroutine f raises SystemExit instead of ValueError?
According to the official documentation, SystemExit inherits from BaseException rather than Exception. So would it be enough to also change Exception to BaseException in the except clause of main? After making these two changes (raising SystemExit in f and catching BaseException) and running the program again, you will find that it prints Inside coroutine f() and then hangs indefinitely.
We need to check the source code of asyncio.run_coroutine_threadsafe to understand what is happening. Here is the relevant part:
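For reference, the core of the function looks roughly like this in CPython 3.12 (Lib/asyncio/tasks.py). This is a paraphrase written as a standalone function, not a verbatim copy: submit_threadsafe is a hypothetical name, and the sketch relies on the private helper asyncio.futures._chain_future, which may differ between Python versions.

```python
import asyncio
import asyncio.futures
import concurrent.futures


def submit_threadsafe(coro, loop):
    """Sketch modeled on asyncio.run_coroutine_threadsafe (hypothetical name)."""
    if not asyncio.iscoroutine(coro):
        raise TypeError("A coroutine object is required")
    future = concurrent.futures.Future()

    def callback():
        try:
            # Wrap the coroutine in a task and mirror the task's outcome
            # into the concurrent.futures.Future (private helper).
            asyncio.futures._chain_future(
                asyncio.ensure_future(coro, loop=loop), future
            )
        except (SystemExit, KeyboardInterrupt):
            # Re-raised as-is; future.set_exception() is not called here.
            raise
        except BaseException as exc:
            if future.set_running_or_notify_cancel():
                future.set_exception(exc)
            raise

    # Schedule the callback on the loop from the calling thread.
    loop.call_soon_threadsafe(callback)
    return future
```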
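This function treats SystemExit and KeyboardInterrupt specially: it re-raises them directly, whereas for other exceptions it calls future.set_exception. Because set_exception is never called for these two exceptions, the returned future is never completed, and future.result() never gets a result. The exception instead propagates out of loop.run_forever(), which ends the event loop and its thread.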
Here is a modified version of the original program:
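A sketch of such a modified program, reconstructed from the output that follows rather than copied from the original listing. Here f raises SystemExit, and future.result() is given a timeout so the program can report the state of the future, the loop, and the thread instead of hanging. The one-second timeout and the returned dictionary are assumptions.

```python
import asyncio
import concurrent.futures
import threading


async def f():
    print("Inside coroutine f()")
    raise SystemExit


def start_loop(loop):
    loop.run_forever()


def main(timeout=1.0):
    loop = asyncio.new_event_loop()
    t = threading.Thread(target=start_loop, args=(loop,))
    t.start()

    future = asyncio.run_coroutine_threadsafe(f(), loop)
    state = {}
    try:
        # Without a timeout this call would block forever.
        future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        print("Timeout")
        state = {
            "Future done?": future.done(),
            "Future cancelled?": future.cancelled(),
            "Loop alive?": loop.is_running(),
            "Thread alive?": t.is_alive(),
        }
        for label, value in state.items():
            print(label, value)
    finally:
        print("Stopping loop")
        # Harmless if the loop has already died; the loop is not closed.
        loop.call_soon_threadsafe(loop.stop)
        t.join()
    return state  # assumption: returned only so the sketch can be tested


if __name__ == "__main__":
    main()
```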
The output is:
Inside coroutine f()
Timeout
Future done? False
Future cancelled? False
Loop alive? False
Thread alive? False
Stopping loop
Task exception was never retrieved
future: <Task finished name='Task-1' coro=<f() done, defined at /home/mortalhappiness/test/test.py:5> exception=SystemExit()>
Traceback (most recent call last):
  File "/home/mortalhappiness/miniforge3/envs/test/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
    self.run()
  File "/home/mortalhappiness/miniforge3/envs/test/lib/python3.12/threading.py", line 1012, in run
    self._target(*self._args, **self._kwargs)
  File "/home/mortalhappiness/test/test.py", line 11, in start_loop
    loop.run_forever()
  File "/home/mortalhappiness/miniforge3/envs/test/lib/python3.12/asyncio/base_events.py", line 641, in run_forever
    self._run_once()
  File "/home/mortalhappiness/miniforge3/envs/test/lib/python3.12/asyncio/base_events.py", line 1986, in _run_once
    handle._run()
  File "/home/mortalhappiness/miniforge3/envs/test/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/home/mortalhappiness/test/test.py", line 7, in f
    raise SystemExit
SystemExit
This behavior is a bit tricky. Both future.done() and future.cancelled() return False, and calling future.result() or future.exception() without a timeout blocks indefinitely. However, if you do not call them, you will see a warning stating that a task exception was never retrieved. You can also see that both the event loop and its thread have died.
Conclusion
If you use asyncio.run_coroutine_threadsafe to execute a coroutine that raises SystemExit or KeyboardInterrupt, calling future.result() or future.exception() will never return; it blocks indefinitely unless you pass a timeout. Additionally, the event loop and its thread will terminate, but the main thread will remain alive.
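One possible way to avoid the hang, offered as a sketch rather than something the original debugging session used: wrap the coroutine so that SystemExit and KeyboardInterrupt are converted into an ordinary exception before they reach the task. The names guard and CoroutineExit are hypothetical. Because the converted exception derives from Exception, the task finishes normally, the outcome is copied into the returned future, and the loop thread keeps running.

```python
import asyncio
import threading


class CoroutineExit(Exception):
    """Hypothetical stand-in for SystemExit/KeyboardInterrupt raised in a coroutine."""


async def guard(coro):
    try:
        return await coro
    except (SystemExit, KeyboardInterrupt) as exc:
        # Convert to an ordinary Exception so the event loop is not torn down.
        raise CoroutineExit(repr(exc)) from exc


async def f():
    raise SystemExit


def main():
    loop = asyncio.new_event_loop()
    t = threading.Thread(target=loop.run_forever)
    t.start()
    future = asyncio.run_coroutine_threadsafe(guard(f()), loop)
    try:
        # The timeout is only a safety net for this sketch.
        future.result(timeout=5)
    except CoroutineExit as e:
        print(f"Caught: {e!r}")
        # Report the exception and whether the loop thread is still alive.
        return e, t.is_alive()
    finally:
        loop.call_soon_threadsafe(loop.stop)
        t.join()


if __name__ == "__main__":
    main()
```

The trade-off is that the original exit request no longer stops anything by itself; the caller has to inspect the wrapped exception (available as __cause__) and decide whether to exit.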