Learn Multithreading in Python Like A Pro Using These 30 Questions

by Michael, April 30th, 2024

Too Long; Didn't Read

Use these 30 questions to learn how to multithread like an expert in Python.

I used this to review Python multithreading for upcoming interviews; I hope it helps you. PS: It is repetitive to help with memorization — you can’t understand if you don’t remember.

Must-Know Terms

Thread — A sequence of instructions within a process that can be executed independently.
Process — An instance of a program running on a computer, encapsulating its own memory space and resources.
Interpreter — The software that reads, translates, and executes Python. One interpreter per Python process.
Global Interpreter Lock (GIL) — A mechanism that ensures only one thread executes Python code at a time within an interpreter, limiting parallelism in multi-threaded programs. One GIL per interpreter.
CPU bound — Operations that do most of their work through the CPU e.g. crunching numbers, transforming data, etc.
I/O bound — Operations that spend most of their time reading or writing data through input/output devices, e.g. making calls to an API, reading from disk, etc.
Multithreading — Executing multiple threads in one process to work on a task concurrently. Best used for I/O bound operations, because the GIL limits the process to one thread executing Python code at a time.
Multiprocessing — Using multiple processes to accomplish a task. Best used for CPU bound tasks, because each process has its own interpreter and GIL, so processes can run at the same time. It is hard to share data between processes.
Memory leak — The unintentional escape or misuse of memory i.e. forgetting to free memory after using it.
Memory corruption — Altering data in a fashion that makes it unreadable or harmful to read by the program.
Lock — An object used to control access to a shared resource, usually another object.
Semaphores — Locks with counters; they keep track of the number of calls made to acquire() and release().
Deadlock — A situation in which two or more threads (or processes) are unable to proceed because each one is waiting for the other to release a resource (call release() on a lock).
Race condition — An insidious bug where two or more threads make conflicting changes to the same data, but the order of those changes depends on how the threads happen to be scheduled, so the outcome becomes unpredictable. Locks prevent this.
Data corruption — Unlike memory corruption, there is nothing wrong with the data in memory, but it may be a value that is unexpected by the program, e.g., a thread emptying a list while another thread is using it.

Questions To Ask Yourself About Python Multithreading

What are the gotchas I should look out for?
What is a thread?
What is multithreading?
How does multithreading differ from multiprocessing?
When should I use threads vs processes?
What is the Global Interpreter Lock (GIL)?
What is the Global Interpreter Lock (GIL) impact on multithreading?
How do I create a new thread?
How do I pass arguments into a thread?
How do I start a thread? What are the three gotchas for starting threads?
What are the different types of threads?
What is a daemon thread?
How do I create a daemon thread?
How do I check if a thread has started? If it is running? Stopped?
How do I wait for a thread to finish? What is thread joining?
Can I timeout thread.join() — how do I wait X seconds for a thread to finish?
How do I stop a thread?
Can I restart a thread?
Does a thread exit when it runs into an unhandled exception?
What happens to unhandled exceptions in threads? Does it cause an error in the parent thread?
How do I handle exceptions thrown in threads?
What are the four limitations of threading.excepthook?
How do I exchange information between threads?
What is thread safety?
What happens if I don’t use thread-safe objects?
Why do I need thread-safe objects if Python has the Global Interpreter Lock (GIL)?
Do all objects passed into threads have to be thread-safe?
How do you synchronize threads to avoid race conditions? How do you tell one thread to wait for another?
How can I keep track of resources used by multiple threads? What are semaphores?
How can I schedule a thread to start at a particular time?
Where can I learn more?

What are the gotchas I should look out for?

Don’t use threading.Thread directly if you can avoid it. There are better alternatives that require less setup, have friendlier interfaces, and are more forgiving. Only use threading.Thread if the two options below are insufficient.

asyncio — runs many I/O bound tasks concurrently on a single thread, and can hand blocking work off to a thread pool or process pool when needed
concurrent.futures.ThreadPoolExecutor — handles creating, executing, and keeping track of threads (it uses threading.Thread underneath)
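
As a rough illustration of how little setup the thread pool option needs, here is a minimal sketch. The fetch function and its inputs are hypothetical stand-ins for a real I/O bound call; they are not from any particular library.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fetch(item_id):
    # Hypothetical stand-in for an I/O bound call such as an HTTP request.
    time.sleep(0.1)
    return item_id * 2


# The pool creates, reuses, and joins its worker threads for us.
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in the same order as the inputs.
    results = list(pool.map(fetch, [1, 2, 3, 4]))

print(results)  # [2, 4, 6, 8]
```

Leaving the with block waits for all submitted work to finish, so there is no manual start() or join() bookkeeping.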

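Threads do NOT execute Python code simultaneously, because the Global Interpreter Lock lets only one thread per Python process execute Python code at a time. Threads run concurrently, not in parallel. The GIL is released while a thread waits on blocking I/O, which is why threads still help with I/O bound work.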

Threads aren’t meant for everything.

If you are doing CPU bound tasks (math, data transformations, etc.), use multiprocessing, not multithreading — if you are doing I/O bound tasks (network calls, DB queries, disk reads/writes), use multithreading.
But if your tasks need to communicate a lot, use threads. Shared memory makes communication easier than the options available to processes: inter-process communication (IPC), network calls, third-party services, pickling, etc.
If the tasks need shared memory, i.e. they all work on one object, use threads.

You have to keep track of the thread’s state.

You can only call thread.start() once.
You can call thread.join() multiple times but…
It will error when calling thread.join() on itself, i.e., you can’t join a thread on itself because it will create a deadlock.
You can’t call thread.join() on a thread that has not started — causes a runtime error.

Stopping threads is tricky.

Threads can cause your program to freeze. Once you start a thread, it does NOT stop until it is completed or runs into an unhandled error — even if the process tries to exit (shutdown the program), the thread keeps running and prevents the exit (unless it is a daemon thread).
Daemon threads don’t freeze your program. When only daemon threads are left alive, Python exits and kills them abruptly, but this may leave resources in a bad state (DB connections, open files, etc., won’t be cleaned up).

Threads don’t start and stop when you want them to but when they can.

thread.start() does NOT run the function in the calling thread. It asks the operating system for a new thread and returns once that thread has started; the function then runs whenever the OS schedules the thread and it can get the GIL. If the OS cannot create another thread, start() raises a RuntimeError.
thread.is_alive() returns False both before start() is called and after the function has finished, so on its own it can’t tell you whether a thread has not started yet or is already done. A finished Thread object can’t be reused; create a new one to run another job.

You can’t rely on Python to gracefully handle errors in threads.

Unhandled errors thrown in child threads do NOT reach the parent thread. By default, threading.excepthook just prints the traceback to stderr, so they are easy to miss unless you catch them in the thread or set your own threading.excepthook.
threading.excepthook will not be called for SystemExit, e.g. when the thread calls sys.exit(); that exception is silently ignored.
Errors raised inside threading.excepthook are sent to sys.excepthook, so make sure to set that up if you want to do complex logging/alerting in excepthook.
Don’t keep the traceback and error objects from excepthook around after the hook is done; storing them can create a reference cycle and eventually a memory leak.
Start/Run/Join don’t propagate exceptions thrown within the thread.

Debugging threads can be hard; don’t treat errors in threads like regular errors.

Threads share the same memory but NOT the same callstack — error traces will be available but out of context in threading.excepthook. The error traces won’t show who started the thread, only what function the thread ran and where the error occurred in that function.
Once the thread throws an error, its function has already unwound by the time excepthook runs. The hook only receives the exception type, value, traceback, and the Thread object, not the live state of the thread, so log whatever context you need inside the thread itself.
Race conditions are a pain — this bug occurs when two threads try to access the same resource, and the order of their access is completely random, resulting in unpredictable states. Imagine a banking app that uses two separate threads for deposits and withdrawals. If a customer decides to deposit and withdraw at the same time but these two threads are not synchronized with each other (e.g., do deposits first always), the customer can end up with a negative balance. This is what locks are meant to solve, but it is very hard to detect race conditions since the outcome is so random — you have to wait until something goes wrong.

Forgetting that threads need thread-safe objects to avoid race conditions and data corruption.

It is easy to pass objects into threads and start changing them without realizing that other threads may depend on this object. Before you pass anything into a thread, remember to ask yourself what you will do to it.

Forgetting to release locks — deadlocks.

Every call to acquire() should have a mirror call to release(). This is very easy to forget. A good coding habit is to never write one of these methods alone. Always write them together, even if you don’t yet know where the release() call will end up.

What is a thread?

A thread is a semi-isolated stream of operations — a single worker at a factory is a thread. They can do work without supervision, but since they work in the factory, they have to share the same resources as everyone else in the factory. Threads share memory with the parent thread that created them and their sibling threads. The call stack, the series of functions executed by the thread, is unique and separate for each thread — this means you won’t be able to find where the thread was started using the error trace, only what function the thread executed.

What is multithreading?

Multithreading is an approach to solving problems by splitting them into multiple independent execution paths. Imagine a worker at a car factory; that worker can do everything to build the car, or we can have multiple workers, each building different parts of the car separately and then putting them together. The latter approach is more efficient under certain circumstances. Multithreading is just that, using more workers (threads) to do a task concurrently (together but not at the same time).

But don’t just use multithreading everywhere. It is more expensive resource-wise, and the code is more complex. Knowing when to use multithreading is 80% of the journey. Once you know, creating and managing threads is quite easy. Even synchronizing them can be simple with the correct mental model.

How does multithreading differ from multiprocessing?

Suppose a thread is a worker at a factory. A process is the factory. It is an isolated stream of operations and resources that contains at least one thread.

Multithreading

Belongs to one process.
Semi-isolated — Shares the same memory as other threads in the process.
Separate callstack/Isolated execution context — separate function callstack.
Semi-parallel — Limited by Global Interpreter Lock (GIL), all threads share the same Python interpreter, so only one thread can be executed by the CPU at a time.
Resilient — Errors can only take down the thread they occur in. The process continues and can create more threads.
Shared memory — Threads can share variables with each other, which makes communication between them easier. Only threads within the same process.

Multiprocessing

Autocephalous — Does not belong to another process.
Parallel — Each process has its own GIL, so processes can run in parallel on the CPU.
Error-prone — If an error is unhandled, it takes down the entire process.
Isolated — Has its own memory space and callstack.
Hard communication — Have to use IPC, HTTP, Pickling, created shared memory, or another mechanism to communicate between processes.

When should I use threads vs processes?

Use threads for:

I/O heavy tasks — reading and writing to/from disk, DB, network, etc. where there is a lot of waiting.
Error-prone tasks — it is cheap and easy to restart threads, and one thread failing won’t take down the entire process. If you have a task that fails unpredictably, run it in a separate thread.
Communication heavy tasks — if a task relies a lot on moving data from one worker to another, the latency saved by communicating via shared memory can outweigh the cost of the GIL.
Need many workers — if you need lots of workers, threads are usually the way to go. Threads are cheaper to create than processes, and although you can start a very large number of processes, the number that can actually do work at the same time is limited by the number of CPU cores.

Use processes for CPU heavy tasks, where the work needs to run in parallel.

But you often don’t need to use threads or processes directly; consider the asyncio module first. Whatever your problem is, you probably don’t need to implement a solution from scratch using threads or processes. asyncio turns your Python script into an event-driven script: instead of blocking while a network or DB call finishes, a coroutine awaits the call and the event loop runs other tasks in the meantime. The event loop itself runs in a single thread, and when you have blocking code that can’t be awaited, asyncio can hand it off to a thread pool or process pool for you. asyncio covers most I/O bound use cases. It is well-tested and documented. If that’s not enough, there are other libraries built on top of asyncio for specific tasks like HTTP communication.
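
To make the hand-off to threads concrete, here is a small sketch using asyncio.to_thread (available from Python 3.9). The blocking_io function is a hypothetical placeholder for any blocking call that has no async version.

```python
import asyncio
import time


def blocking_io(n):
    # Hypothetical blocking call, e.g. a library with no async support.
    time.sleep(0.1)
    return n * 10


async def main():
    # Each blocking call runs in a worker thread while the event loop
    # stays free to run other tasks. gather() keeps the input order.
    return await asyncio.gather(
        *(asyncio.to_thread(blocking_io, n) for n in range(3))
    )


results = asyncio.run(main())
print(results)  # [0, 10, 20]
```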

What is the Global Interpreter Lock (GIL)?

It is the toll gate at the heart of Python. For each Python process, there is exactly one Python interpreter (the thing that reads and executes your Python code). However, processes can have multiple threads running at the same time. What if two threads try to use the same location in memory, e.g., the same object at the same time? What if a thread tries to delete an object that is being used by another thread? All these behaviors are unpredictable and would cause errors. The GIL exists to solve these problems in the simplest possible way by making these scenarios impossible or at least less likely. It does this by limiting how many threads can be executed by the interpreter at a time. That’s why it is called a lock. It locks the interpreter from all other threads while a thread is executing. You don’t have to worry about memory corruption while multithreading in Python.

What is the Global Interpreter Lock (GIL) impact on multithreading?

The GIL allows only one thread per process (each process has one Python interpreter) to execute Python code at any moment. This prevents threads from executing conflicting code, e.g., deleting an object another thread is using. However, this limits parallelism; operations that use the CPU (math, data manipulation, etc.) are run one at a time when using threads. This means that Python threads are not run in parallel: the threads are not executing Python code at the same time.
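
A quick way to see why threads still help with I/O bound work is to time several waits running together. This is only a sketch: time.sleep stands in for a network or disk wait, and the exact timing will vary from machine to machine.

```python
import threading
import time


def wait_for_io():
    # Stands in for a network or disk wait; sleeping releases the GIL,
    # so other threads can run in the meantime.
    time.sleep(0.2)


start = time.perf_counter()
threads = [threading.Thread(target=wait_for_io) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Run one after another, the four waits would take about 0.8 seconds.
print(f"4 waits of 0.2s took {elapsed:.2f}s")
```

The same experiment with a CPU bound function (a tight arithmetic loop, for example) would not show this speed-up, because only one thread can execute Python code at a time.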

How do I create a new thread?

Import the threading module.
Create a new instance of threading.Thread class.
Pass it the function you want to execute on the thread.

import threading

# Define a function that will be executed in the new thread
def print_numbers():
  for i in range(1, 6):
    print("Number:", i)

# Create a new thread and specify the function to execute
thread = threading.Thread(target=print_numbers)

How do I pass arguments into a thread?

One argument

threading.Thread(target=print_numbers, args=[6])

args expects an iterable (tuple, list, etc.). To pass a single value, use (6,) or [6]; (6) is not a tuple, it is just 6.

Multiple arguments

threading.Thread(target=print_numbers, args=[ARG1, ARG2, ARG3, …, ARGN])

Keyword arguments

threading.Thread(target=print_numbers, kwargs={ 'key1': val1, 'key2': val2 })
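
Putting positional and keyword arguments together, a runnable sketch might look like this. The greet function and its parameters are made up for the example.

```python
import threading

received = []


def greet(name, punctuation="!"):
    # Hypothetical target function; it records what it was called with.
    received.append(f"Hello, {name}{punctuation}")


thread = threading.Thread(
    target=greet,
    args=("Ada",),                 # one positional argument, note the comma
    kwargs={"punctuation": "?"},   # keyword arguments go in a dict
)
thread.start()
thread.join()

print(received)  # ['Hello, Ada?']
```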

How do I start a thread? What are the three gotchas for starting threads?

Call start() on the thread object to start it:

import threading

def print_numbers(num):
  for i in range(1, num):
    print("Number:", i)

thread = threading.Thread(target=print_numbers, args=[6])
thread.start()

But beware:

You can only call start() ONCE, and it will error if you do it a second time. So, you need to keep track of which threads have started and stopped.
start() does NOT run the function in the current thread. It asks the operating system for a new thread, and the function runs once the OS schedules that thread; if the OS cannot create another thread, start() raises a RuntimeError.
Once you start the thread, it does NOT stop until it is completed or runs into an unhandled error — even if the process tries to exit (shutdown the program), the thread keeps running and prevents the exit (unless it is a daemon thread).
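
The first gotcha is easy to check for yourself. A minimal sketch, using a do-nothing target:

```python
import threading

thread = threading.Thread(target=lambda: None)
thread.start()
thread.join()

second_start_failed = False
try:
    # A second start() fails even though the thread has already finished.
    thread.start()
except RuntimeError as exc:
    second_start_failed = True
    print("Second start() failed:", exc)
```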

What are the different types of threads?

Regular threads — will keep the program alive until they are done
Daemonic threads — can be abruptly killed by the program

What is a daemon thread?

Daemons (pronounced like “demon” but not as nefarious, more like spirits) are threads meant for background jobs. They are excellent for tasks that can or need to be abruptly closed.

But daemonic threads come with caveats.

When the program exits, the daemonic threads are aborted — they don’t keep the program open.
If only daemonic threads are left, the program will automatically shut down.
This abrupt shutdown can prevent open files, connections, etc., from being released properly, and can leave other programs (the ones on the other end of the connections) hanging.
By default, daemonic threads only spawn daemonic threads, because a new thread inherits the daemon flag from the thread that creates it. Be careful when creating threads inside of daemonic threads; you may accidentally hit the cleanup problems described in the previous point.
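
The inheritance of the daemon flag can be observed with a small sketch like the one below. The function and variable names are only illustrative.

```python
import threading

observed = {}


def parent_job():
    # No daemon argument here: the child inherits the flag from the
    # thread that creates it, which is the daemonic parent.
    child = threading.Thread(target=lambda: None)
    observed["child_is_daemon"] = child.daemon
    child.start()
    child.join()


parent = threading.Thread(target=parent_job, daemon=True)
parent.start()
parent.join()

print(observed)  # {'child_is_daemon': True}
```

Passing daemon=False explicitly when creating the child is one way to opt out of the inherited value.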

How do I create a daemon thread?

Python 3.3 and later:

thread = threading.Thread(target=print_numbers, daemon=True)

In older versions of Python you have to set the flag after creating the thread, which is error-prone since you can forget to set it.

thread = threading.Thread(target=print_numbers)
thread.daemon = True

A common pattern in older versions of Python to avoid forgetting to set the flag:

class DaemonThread(threading.Thread):
  # kwargs defaults to None to avoid sharing one mutable dict between instances
  def __init__(self, target=None, args=(), kwargs=None):
    super(DaemonThread, self).__init__(target=target, args=args, kwargs=kwargs)
    self.daemon = True

How do I check if a thread has started? If it is running? Stopped?

Started, use thread.is_alive():

Returns True once thread.start() has been called and the thread is running. Before start() is called, thread.ident is None.

Running, use thread.is_alive(); per the docs, it returns True from just before the thread’s function starts until just after it finishes.

Stopped, use thread.is_alive(), it will return False

thread.is_alive() also returns False for a thread that has not been started yet, so check thread.ident (None until start() is called) if you need to tell "not started" apart from "finished". A finished Thread object can’t be reused; create a new one to run another job.
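
The three states can be told apart by combining thread.ident with thread.is_alive(). The sketch below uses an Event (named gate here, purely for illustration) to hold the thread open long enough to inspect it.

```python
import threading

gate = threading.Event()
thread = threading.Thread(target=gate.wait)

# Not started: no ident yet, not alive.
state_before = (thread.ident is None, thread.is_alive())

thread.start()
# Running: has an ident and is alive (it is blocked on the gate).
state_running = (thread.ident is None, thread.is_alive())

gate.set()
thread.join()
# Finished: keeps its ident but is no longer alive.
state_done = (thread.ident is None, thread.is_alive())

print(state_before, state_running, state_done)
# (True, False) (False, True) (False, False)
```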

How do I wait for a thread to finish? What is thread joining?

You can use join() to wait for a thread to finish — this is called joining.

thread_a.join() # go to sleep until thread_a is finished
print("Main thread continues…")

thread.join() pauses the current thread.
You can call join() multiple times.
BUT you can only call join() on a started thread. If you have not called thread.start(), thread.join() raises a RuntimeError. Joining a thread that has already finished is fine; join() returns immediately.
You can’t call thread.join() on the current thread; this raises a RuntimeError because the thread would wait on itself forever.
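
These join() rules can be checked with a short sketch:

```python
import threading

thread = threading.Thread(target=lambda: None)

join_before_start_failed = False
try:
    thread.join()  # the thread has not been started yet
except RuntimeError as exc:
    join_before_start_failed = True
    print("join() before start() failed:", exc)

thread.start()
thread.join()
thread.join()  # joining a finished thread again returns immediately
joined_twice = True
```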

Can I timeout thread.join() — how do I wait X seconds for a thread to finish?

You can pass thread.join() a timeout in seconds.

The docs describe the timeout as a floating point number of seconds (or fractions of a second), but an int works too. join() always returns None, so call thread.is_alive() afterwards to find out whether the thread finished or the timeout expired.

timeout_in_seconds = 2
# waits up to two seconds, then wakes up and executes the next line
thread.join(timeout_in_seconds)
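
Because join() returns None whether or not the timeout expired, a typical pattern is to check is_alive() right after it. A minimal sketch, with an Event (hypothetically named release) keeping the thread busy:

```python
import threading

release = threading.Event()
thread = threading.Thread(target=release.wait)
thread.start()

thread.join(timeout=0.2)
# The thread is still blocked on the event, so the join timed out.
timed_out = thread.is_alive()

release.set()
thread.join()
finished = not thread.is_alive()

print(timed_out, finished)  # True True
```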

How do I stop a thread?

There is no easy or direct way provided by the threading module or Thread class to stop a running thread. But you can:

Throw an error in the code being executed by the thread (and not catch that error).
Pass a flag into the thread from the parent thread. The thread checks the flag and returns early or raises an error once the parent sets it to True. This takes advantage of threads’ shared memory; they can share variables.
Create a subclass of Thread and add a stop method that stops the thread using a flag or threading.Event. This is the most common approach.
Set the thread to be a daemon thread (BEFORE starting it), and if you want to stop the thread, exit the program. Daemon threads, unlike regular threads, stop when the process exits.

An example of the event system:

import threading
from time import sleep

def worker(event):
  print('Thread started, waiting for start flag.')
  event.wait()
  print('Flag was set, thread awoken.')
  num = 0
  while event.is_set():
    # do work
    num += 1
    print('work done…', num)
  print('Thread stopped.')

event = threading.Event()

thread = threading.Thread(target=worker, args=[event])
# Start the thread but it waits on the event
thread.start()

# Set the flag to true
event.set()

# wait for the thread to do work
sleep(2)

# Stop the thread
event.clear()
thread.join()
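
The subclass approach from the list above could be sketched roughly like this. StoppableThread, stop(), and the iterations counter are names chosen for the example; they are not part of the threading module.

```python
import threading
import time


class StoppableThread(threading.Thread):
    """Hypothetical Thread subclass that can be asked to stop."""

    def __init__(self):
        super().__init__()
        self._stop_requested = threading.Event()
        self.iterations = 0

    def run(self):
        # wait() doubles as a short sleep that ends early once stop()
        # is called; it returns True when the event has been set.
        while not self._stop_requested.wait(timeout=0.01):
            self.iterations += 1  # do one unit of work

    def stop(self):
        self._stop_requested.set()


worker = StoppableThread()
worker.start()
time.sleep(0.1)   # let the worker do some work
worker.stop()     # ask it to finish
worker.join()     # wait until it has actually stopped

print("worker alive:", worker.is_alive())  # worker alive: False
```

The thread only stops at the points where it checks the event, so long-running steps inside the loop delay the shutdown.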

An excellent article with examples: https://www.geeksforgeeks.org/python-different-ways-to-kill-a-thread/

Can I restart a thread?

No, you can only call start() once per thread object. If you need to retry a failed thread, create a new Thread object and pass it the same function that the old thread had.

DON’T copy thread objects. This can cause memory problems — when the old thread is finished, its resources are freed, but the clone may try to access them, causing an error or preventing the old thread from being cleaned due to the copy referencing it, which may cause a memory leak.
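
A retry can be sketched as a loop that builds a fresh Thread for every attempt. The flaky_job function and the outcome dict are hypothetical; the job is rigged to fail twice before succeeding.

```python
import threading

attempts = []


def flaky_job(outcome):
    # Hypothetical job that fails on its first two attempts.
    attempts.append(len(attempts) + 1)
    try:
        if len(attempts) < 3:
            raise ValueError("transient failure")
        outcome["ok"] = True
    except ValueError:
        outcome["ok"] = False


outcome = {"ok": False}
while not outcome["ok"]:
    # A new Thread object per attempt; the old one cannot be restarted.
    thread = threading.Thread(target=flaky_job, args=(outcome,))
    thread.start()
    thread.join()

print(attempts)  # [1, 2, 3]
```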

Does a thread exit when it runs into an unhandled exception?

Yes, just like any Python script — a thread will exit when it encounters an error.

What happens to unhandled exceptions in threads? Does it cause an error in the parent thread?

The error is NOT rethrown in the parent thread. It is handled inside the failing thread by threading.excepthook, which by default prints the traceback to stderr. It is up to you to set your own threading.excepthook if you want to handle these errors differently.

How do I handle exceptions thrown in threads?

By using threading.excepthook:

import threading
import traceback

def worker():
  print('Thread started, throwing error.')
  raise Exception('An error occurred in the worker thread.')

def errorHandler(args):
  print('An error occurred in the worker thread.')
  print(f'Exception type: {args.exc_type}')
  print(f'Caught exception: {args.exc_value}')
  # error tracebacks DO NOT include callstack of parent thread
  traceback.print_tb(args.exc_traceback)

# setup the error handler
threading.excepthook = errorHandler

thread = threading.Thread(target=worker)
thread.start()
thread.join()

print('Thread has finished.')

Excepthook callback is given an object with these attributes:

exc_type: Exception type.
exc_value: Exception value, can be None.
exc_traceback: Exception traceback, can be None.
thread: Thread that raised the exception, can be None.

What are the four limitations on threading.excepthook?

If exc_type is SystemExit (e.g., the thread called sys.exit()), the exception is silently ignored; it does not go to excepthook.
If this excepthook raises an exception, sys.excepthook() is called to handle it.
Storing exc_value can create a reference cycle. When the exception is no longer needed, it should be cleared explicitly to break the reference cycle.
Don’t store the thread passed into excepthook after the hook completes; doing so can resurrect an object that is being finalized and cause memory leaks.

How do I exchange information between threads?

Use a thread-safe object like queue.Queue from Python’s queue module. You can safely store information on the queue and retrieve that information without worrying about synchronizing your threads (i.e., using locks). The queue will ensure only one thread updates the queue at a time.

Create the queue in the parent thread
Pass it into the child threads
Use the queue like any other queue in the child thread
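
The three steps above can be sketched as a producer and a consumer sharing one queue. The None sentinel used to signal "no more items" is a convention chosen for this example, not something the queue module requires.

```python
import queue
import threading


def producer(q):
    for n in range(3):
        q.put(n)
    q.put(None)  # sentinel: tells the consumer there is nothing more


def consumer(q, out):
    while True:
        item = q.get()  # blocks until an item is available
        if item is None:
            break
        out.append(item * item)


# 1. Create the queue in the parent thread.
q = queue.Queue()
squares = []

# 2. Pass it into the child threads.
threads = [
    threading.Thread(target=producer, args=(q,)),
    threading.Thread(target=consumer, args=(q, squares)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(squares)  # [0, 1, 4]
```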

What is thread safety?

Thread safety is a property of code or objects that keep working correctly when multiple threads use them at the same time. It is usually achieved with locks that limit which threads can access the object at any given time. Some objects in Python are built thread-safe, but most are not.

What happens if I don’t use thread-safe objects?

Using non-thread-safe objects in threads can lead to these problems:

Race conditions — when it is impossible to tell the order of updates to the shared object
Example: a banking app uses two threads, one for deposits and another for withdrawals, to update the same customer account object. If the object is not thread-safe and the deposit and withdrawal threads both run at the same time, this leads to race conditions that can cause logic bugs, like the customer’s withdrawal being declined because the withdrawal thread executed before the deposit thread, or one update overwriting the other. There is no way to tell which thread will execute first; the account object needs to be thread-safe so that each update is applied in full, and the threads need to be synchronized if deposits must always occur before withdrawals.
Data corruption — when the updates lead to a bad or missing value
Example: a food delivery app uses a pool of threads to distribute deliveries. The threads read the delivery data from a shared list and remove an entry once its delivery is complete. If the list is not thread-safe, an error can occur when it is being added to at the same time as it is being pruned. The delivery threads may remove a delivery that was not delivered or a delivery that belongs to another thread, which can cause that thread to try to read data that is no longer available.
Memory corruption — when an update leads to a value that does not fit in memory or is the wrong type or when the thread tries to read data that is no longer in memory
Thanks to the GIL, this is unlikely to happen in Python.
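
To illustrate what a thread-safe version of the account object might look like, here is a minimal sketch. The Account class is hypothetical; the lock makes each deposit a single, indivisible update.

```python
import threading


class Account:
    """Hypothetical account whose updates are guarded by a lock."""

    def __init__(self):
        self.balance = 0
        self._lock = threading.Lock()

    def deposit(self, amount):
        # Only one thread at a time can read, add, and write the balance.
        with self._lock:
            self.balance += amount


account = Account()


def many_deposits():
    for _ in range(10_000):
        account.deposit(1)


threads = [threading.Thread(target=many_deposits) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(account.balance)  # 40000
```

Note that the lock guarantees no deposit is lost; it does not decide which thread goes first.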

Why do I need thread-safe objects if Python has the Global Interpreter Lock (GIL)?

Because the GIL does not prevent race conditions or data corruption. Even with the GIL, Python needs thread-safe objects.

The GIL only prevents threads from executing code in parallel (at the same time), which ensures the built-in Python objects don’t cause memory corruption — writing to the same place in memory at the same time.
But the GIL does not stop race conditions or bad data — it does not stop one thread from editing an object that another thread depends on or is also writing to.
The GIL quickly switches which thread executes and does not track what those threads are actually doing; it is easy for two threads to make conflicting changes to the same object.

Do all objects passed into threads have to be thread-safe?

No — you can pass any object into a thread and use it in that thread. But if that object is also going to be used in another thread, it should be thread-safe.

If you are only doing read operations — reading the value of the object, the object does not have to be thread-safe
However, it is highly recommended that it be thread-safe because it is easy to accidentally update the object, which will cause logical errors. You can also wrap it in a class that prevents updates, e.g., throwing an error when you try to put an element in a non-thread-safe list.
If you are performing any write operations, such as changing the value of the object or its properties, then the object should be thread-safe.

How do you synchronize threads to avoid race conditions? How do you tell one thread to wait for another?

Use locks! Locks are special objects that we can pass into threads that can make threads sleep or wake. Imagine a busy factory with one bathroom (one toilet). All the workers want to use the toilet, and we don’t want them barging in on each other, so we put a lock on the bathroom and provide a key for it. Workers end up queuing in front of the bathroom, waiting for it to be unlocked instead of constantly opening the door and seeing if there’s someone in there. Thread locks work the same way.

When you have something that you need to control access to or simply have to orchestrate one task before another, you introduce a lock. Pass the lock to the threads that need it. In the threads, you acquire() the lock before you access the resource. This is like checking if the bathroom is unlocked. If the thread can’t get the lock, acquire() puts the thread to sleep — the bathroom is locked, wait for it. When the bathroom is free, acquire() wakes the thread and gives it the lock. After this thread is done, it is CRUCIAL to remember to release() the lock i.e. unlock the bathroom. If the release() is NOT called, the other threads will never wake up.

The pattern is:

Wait for the lock using acquire()
Go to sleep until the lock is free
When the thread holding the lock calls release(), wake up and do the work
Release the lock using release()

Everything starts with acquire() and ends with release() — this is the crucial cycle you need to follow with all forms of concurrent programming. To prevent bugs, it is common to count the number of times acquire() and release() have been called — the number should always be equal.

import threading

lock = threading.Lock()
# the main thread takes the lock first, so the worker has to wait
lock.acquire()

def worker():
  print('Getting lock.')
  lock.acquire()
  print('Have lock.')
  lock.release()
  print('Released lock.')

thread = threading.Thread(target=worker)
thread.start()

# thread is alive but HAS not finished because it is waiting for the lock to
# be released, hence the join() times out
timeout = 2
thread.join(timeout)

assert thread.is_alive(), 'The thread should still be alive'

# we release the lock, the thread can finish
lock.release()
thread.join()
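
A common way to keep acquire() and release() paired is to use the lock as a context manager. The sketch below shows that the lock is released even when the code holding it raises; risky_update is a made-up function for the example.

```python
import threading

lock = threading.Lock()


def risky_update():
    # "with lock" calls acquire() on entry and release() on exit,
    # even if the body raises an exception.
    with lock:
        raise ValueError("something went wrong while holding the lock")


try:
    risky_update()
except ValueError:
    pass

print(lock.locked())  # False
```

With bare acquire() and release() calls, the same guarantee needs a try/finally block around the work.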

How can I keep track of resources used by multiple threads? What are semaphores?

A semaphore is a lock with a counter. It can be used to keep track of how many times a shared resource is used. For example, if you have an object that allows your threads to connect to a server with only three connections at a time, you can use a semaphore with a counter set to three to limit the number of connections. Not doing this can lead to the threads failing because the server is unreachable.

This is how semaphores work:

When you create a semaphore, you give a number, which is the counter.
Every call to acquire() decreases the counter by one.
Every call to release() increases the counter by one.
When the counter reaches zero, a call to acquire() will cause the thread to go to sleep and wait for the counter to increase, i.e., for release() to be called.

import threading
import time

# Shared resource
shared_resource = {
  "connect": lambda: print("Connected to server."),
}
# Semaphore with a maximum of 3 permits
semaphore = threading.Semaphore(3)

# Function to access the shared resource
def access_shared_resource(thread_name):
  print(f"{thread_name} started, waiting for semaphore")
  semaphore.acquire()
  print(f"{thread_name} acquired semaphore")
  shared_resource["connect"]()
  time.sleep(1)  # Simulate some work
  print(f"{thread_name} released semaphore")
  semaphore.release()

# Create multiple threads to access the shared resource
threads = []
for i in range(5):
  thread = threading.Thread(target=access_shared_resource, args=(f"Thread-{i+1}",))
  thread.start()
  threads.append(thread)

# Wait for all threads to finish
for thread in threads:
  thread.join()


# Example output (the order varies between runs).
# Notice how the threads don't acquire
# the semaphore in the order they were created, nor
# do they execute in that order - this is because the
# interpreter and OS switch between threads unpredictably.
# Thread-1 started, waiting for semaphore
# Thread-1 acquired semaphore
# Thread-2 started, waiting for semaphore
# Connected to server.
# Thread-2 acquired semaphore
# Thread-4 started, waiting for semaphore
# Thread-5 started, waiting for semaphore
# Thread-3 started, waiting for semaphore
# Connected to server.
# Thread-4 acquired semaphore
# Connected to server.
# Thread-2 released semaphore
# Thread-4 released semaphore
# Thread-1 released semaphore
# Thread-5 acquired semaphore
# Thread-3 acquired semaphore
# Connected to server.
# Connected to server.
# Thread-5 released semaphore
# Thread-3 released semaphore
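
To check that a semaphore really caps concurrency, you can count how many threads are inside the guarded section at once. This is a sketch with invented names (use_connection, active, max_active); it uses BoundedSemaphore, which additionally raises an error if release() is called too many times.

```python
import threading
import time

semaphore = threading.BoundedSemaphore(3)
counter_lock = threading.Lock()
active = 0
max_active = 0


def use_connection():
    global active, max_active
    with semaphore:  # at most 3 threads get past this line at a time
        with counter_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.05)  # simulate using the connection
        with counter_lock:
            active -= 1


threads = [threading.Thread(target=use_connection) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("most connections in use at once:", max_active)
```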

How can I schedule a thread to start at a particular time?

Use threading.Timer. It takes an interval (the number of seconds to wait), the function to call, and optional args and kwargs for that function.

Things to remember:

Timer is a subclass of Thread — everything you can do in Thread you can do in Timer, and the same limitations apply
timer.start() starts a new thread that waits for the interval and then calls your function
The delay is not exact; the interval the timer actually waits may differ from the one you specified
If the system is busy, you may wait longer than the delay
You can cancel Timers with timer.cancel() BUT only while the timer is still waiting. If the thread is already executing your code, it is too late and timer.cancel() will do nothing
If you cancel a Timer, you can still join it — don’t have to worry about skipping joins on canceled Timers

import threading
import datetime

now = datetime.datetime.now()
def print_hello(now, seconds):
  print(f"Hello, world, {datetime.datetime.now()}!")
  assert datetime.datetime.now() > now + datetime.timedelta(seconds=seconds), 'The timeout should have passed'

seconds = 5
timer = threading.Timer(
  seconds, print_hello, args=(now, seconds)
)

print(f"Starting timer at {now}")

timer.start()
timer.join()

print(f"Finished {datetime.datetime.now()}")
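
Cancelling a Timer before it fires, and joining it afterwards, can be sketched like this. The calls list is only there to show that the function never ran.

```python
import threading

calls = []

# Would call calls.append("fired") after 5 seconds if left alone.
timer = threading.Timer(5, calls.append, args=("fired",))
timer.start()

timer.cancel()  # the timer is still waiting, so this stops it
timer.join()    # returns promptly; a cancelled Timer can still be joined

print(calls)  # []
```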