Today I saw an amazing tech talk on the much-debated topic of Python’s Global Interpreter Lock, a.k.a. the GIL. The GIL is probably one of the least understood parts of Python, but it is extremely important to know about if you are dealing with any form of multi-threaded Python code.
The talk debunks several GIL myths, explains the GIL and interpreter internals, and introduces the improved GIL implementation coming in Python 3.2. Apart from the talk being highly informative, I liked the speaker’s way of approaching a rather subtle topic in a manner that even novice programmers could relate to: showing interpreter internals where needed, running demos of how the GIL influences overall program performance when multiple threads and multi-core systems are involved, and, last but not least, the occasional geek humor.
A few takeaways from the talk:
- Python threads are native threads (POSIX threads on Unix), not green threads.
- Only one thread can run at any given time in a Python process, no matter how many threads you create and regardless of the OS’s multi-threading support, because of the way the GIL works. In other words, only one thread can hold the GIL at any given time.
- Splitting CPU-bound work across multiple threads won’t speed up execution. On the contrary, it might degrade overall performance, because the threads now compete for the GIL.
- If you run multiple threads on a multi-core system, performance might be even worse than on a single-core machine. The OS can now run multiple threads simultaneously, but the GIL still lets only one of them execute. The OS therefore keeps resuming suspended threads, which try to acquire the GIL and fail because another thread already holds it. In other words, there will be a lot of false alarms, which cause thrashing and degrade overall performance.
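To make the takeaways above concrete, here is a minimal sketch that times the same CPU-bound work run sequentially and then split across two threads. The names `count_down`, `time_sequential` and `time_threaded` and the loop size are my own choices for illustration, not code from the talk. On a CPython build with the GIL I would expect the threaded version to be no faster, and possibly slower, but the actual numbers depend on your Python version, OS and core count, so run it and see.

```python
import threading
import time


def count_down(n):
    # Pure-Python CPU-bound loop; a thread needs the GIL to execute it.
    while n > 0:
        n -= 1
    return n


def time_sequential(n, repeats=2):
    # Run the work `repeats` times back to back in the calling thread.
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, workers=2):
    # Run the same amount of work, one count_down call per thread.
    threads = [
        threading.Thread(target=count_down, args=(n,)) for _ in range(workers)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 5_000_000  # arbitrary size, chosen only so the loop takes a moment
    print(f"sequential: {time_sequential(n):.2f}s")
    print(f"threaded:   {time_threaded(n):.2f}s")
```

Both functions do the same total amount of counting, so any difference between the two timings comes from how the threads are scheduled rather than from the work itself.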
That said, don’t be too quick to blame the GIL when your multi-threaded code has performance issues; more often than not, the cause is badly written code. However, as a general rule of thumb, if you don’t need to synchronize threads or access to shared resources and are only concerned with reducing execution time, you are probably better off using multiple processes instead of multiple threads. See Python’s multiprocessing module for further information on how multiple processes can replace multiple threads in this case.
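As a rough sketch of that rule of thumb, the snippet below hands independent CPU-bound tasks to worker processes through `multiprocessing.Pool`. The helper names `count_down` and `run_in_processes` and the loop size are hypothetical, picked only for this example. Each worker is a separate process with its own interpreter and its own GIL, so the tasks can run on different cores, although starting processes and passing data between them has a cost of its own.

```python
import multiprocessing
import time


def count_down(n):
    # Pure-Python CPU-bound loop; each worker process runs its own copy.
    while n > 0:
        n -= 1
    return n


def run_in_processes(n, workers=2):
    # Give each worker process one count_down task and collect the results.
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(count_down, [n] * workers)


if __name__ == "__main__":
    n = 5_000_000  # arbitrary size, chosen only so the loop takes a moment
    start = time.perf_counter()
    results = run_in_processes(n)
    elapsed = time.perf_counter() - start
    print(f"{len(results)} processes finished in {elapsed:.2f}s")
```

The `if __name__ == "__main__":` guard matters here: on platforms that start workers by re-importing the main module, leaving it out would make every worker try to create a pool of its own.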