This post uses the Interactive Brokers Python TWS API to explain thread synchronization using the Event object.
Many applications need to pause until some external condition occurs: you may need to wait until another thread finishes, or until another callback has been processed. In these and many similar situations, you need a way to make your script wait. Three common ways to achieve this are:
- Trial and error with the time.sleep() function in Python. While this may work, it is always hard to determine how long we need to wait, and a fixed time.sleep() is not the right way to achieve the result.
- Use flags together with time.sleep(), sleeping in short intervals until a flag is set, so that the wait is a better approximation of the time actually needed.
- Use the Event object to achieve thread synchronization.
Here is my Python program.
It is a very simple program that retrieves 3 months of End of Day (EOD) data for a contract named TCS. These are the steps in the program:
- Establish Connection with TWS
- Define the contract
- Call reqHistoricalData
- Collect 'Date' and 'Close' information from the callback
- Process it and write to CSV
- Disconnect from TWS
Now pay attention to the time.sleep() function calls. The function is used twice in the program:
- After establishing a connection to TWS: time.sleep(3)
- After making a call to reqHistoricalData: time.sleep(7)
The run took approximately 10.1 seconds. The obvious question to ask is: does it really take that long to retrieve data for a single stock from TWS? The obvious answer is no.
First Round of Optimization:
Let's analyze the need for the first call, time.sleep(3), made after attempting a connection with TWS. This sleep is primarily there to ensure that the connection is established. The TWS API documentation states clearly: "You have to make sure the connection has been fully established before attempting to do any requests to the TWS. Failure to do so will result in the TWS closing the connection. Typically this can be done by waiting for a callback from an event and the end of the initial connection handshake, such as IBApi.EWrapper.nextValidId or IBApi.EWrapper.managedAccounts."
So it appears that we have a better alternative to waiting for 3 seconds. We can just check if the nextValidId callback has been completed. So this is how our code would be:
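The sketch below illustrates this check with only the standard library, so it runs without TWS. The callback name nextValidId and the attribute client.orderId come from this post. FakeClient, wait_for_connection, the poll interval and the simulated 0.2 second handshake delay are assumptions made for illustration, not part of the TWS API or of the original program.

```python
import threading
import time


class FakeClient:
    """Hypothetical stand-in for the client class; this is NOT the ibapi API."""

    def __init__(self):
        self.orderId = None  # filled in by the nextValidId callback

    def nextValidId(self, orderId):
        # With the real API, TWS triggers this callback once the
        # connection handshake has completed.
        self.orderId = orderId


def wait_for_connection(client, poll_interval=0.05, timeout=5.0):
    """Poll until client.orderId is set. Return True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while client.orderId is None:
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)  # short sleep instead of a fixed time.sleep(3)
    return True


client = FakeClient()
# Simulate TWS delivering nextValidId 0.2 s after the connection attempt.
threading.Timer(0.2, client.nextValidId, args=(1,)).start()

connected = wait_for_connection(client)
print("connected:", connected, "orderId:", client.orderId)
```

The timeout is a design choice of this sketch: without it, the loop would spin forever if the callback never arrived.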
While we have reduced the time spent in time.sleep() substantially, we still need to consider the CPU resources used in repeatedly putting the program to sleep and waking it up to check the condition.
Now let's take a look at the other call, time.sleep(7), which we invoked after calling reqHistoricalData. This could have been 3 seconds as well, but I have noticed that while retrieving NIFTY data, it can sometimes take as much as 1 minute to get 2 years of EOD data. A better alternative to guesswork is to set a flag in the historicalDataEnd callback.
Note that if reqHistoricalData was invoked with keepUpToDate = false, the IBApi.EWrapper.historicalDataEnd marker is sent once all candlesticks have been received. To rephrase this in programming terms: if the historicalDataEnd callback is triggered, all the data we need from the historicalData callback has been received, so the program can continue. To achieve this, I have created a flag, "reached_historicalDataEnd_callback", and use it to reduce the time.sleep() to a more appropriate period.
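As a rough illustration of the flag idea, here is a self-contained sketch in which a background thread plays the role of TWS. The flag name reached_historicalDataEnd_callback and the callback names are taken from this post. FakeHistoryClient, fake_tws_feed, the dictionary used for each bar and the two sample rows are made up for the example and do not come from TWS.

```python
import threading
import time


class FakeHistoryClient:
    """Hypothetical stand-in for the client class; this is NOT the ibapi API."""

    def __init__(self):
        self.bars = []
        self.reached_historicalDataEnd_callback = False

    def historicalData(self, reqId, bar):
        # Called once per bar; collect the 'Date' and 'Close' values.
        self.bars.append(bar)

    def historicalDataEnd(self, reqId, start, end):
        # Called once after the last bar; raise the flag.
        self.reached_historicalDataEnd_callback = True


def fake_tws_feed(client):
    """Simulate TWS sending two bars and then the end marker (made-up values)."""
    for day, close in [("20240102", 100.0), ("20240103", 101.5)]:
        time.sleep(0.05)
        client.historicalData(1, {"Date": day, "Close": close})
    client.historicalDataEnd(1, "20240102", "20240103")


client = FakeHistoryClient()
threading.Thread(target=fake_tws_feed, args=(client,), daemon=True).start()

# Sleep in short steps until the flag is set, instead of a fixed time.sleep(7).
deadline = time.monotonic() + 5.0  # safety net so the loop cannot hang
while not client.reached_historicalDataEnd_callback and time.monotonic() < deadline:
    time.sleep(0.05)

print(len(client.bars), "bars received")
```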
This is how the revised program would look:
This is the output when I run the program:
From 10.1 seconds, we are now down to 4.5 seconds. Isn't that some optimization? But here is the problem:
Although the sleep time in the while loop appears to be relatively small, Python is forced to repeatedly wake up and re-evaluate the condition (while client.orderId is None). The shorter the sleep, the more often this check runs, which burns CPU cycles on work that accomplishes nothing and can slow down everything else running on that CPU. The longer the sleep, the later the program notices that the callback has arrived.
This type of wait loop is often called a 'busy wait with sleep', and a CPU that is stuck doing a lot of work over nothing, as in this case, is said to be spinning. Avoid this, particularly if you plan to host the code on AWS or any other cloud-based system that charges you for CPU resources.
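To make the cost of such a loop visible, the following sketch counts how many times the condition is checked while waiting for a single simulated callback. The names state, fake_callback and checks, as well as the 0.3 second delay and 0.01 second sleep, are arbitrary choices for this illustration.

```python
import threading
import time

state = {"done": False}  # plain flag, set by the simulated callback


def fake_callback():
    state["done"] = True


# Simulate a callback that arrives 0.3 s after the request.
threading.Timer(0.3, fake_callback).start()

checks = 0
deadline = time.monotonic() + 5.0  # safety net so the loop cannot hang
while not state["done"] and time.monotonic() < deadline:
    checks += 1       # one wake-up spent only on re-checking the flag
    time.sleep(0.01)

print("condition checked", checks, "times for a single callback")
```

Every one of those checks is a wake-up that does no useful work. Lengthening the sleep reduces the count but delays the moment the program notices the callback.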
Second Round of Optimization:
Can we optimize it further? Check the program below.
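The sketch below shows the same flow with threading.Event from the standard library. It simulates the TWS callbacks with background threads so that it runs on its own. It is an illustration of the technique, not the original program. FakeEventClient, fake_handshake, fake_history, the attribute names connected and history_done, the dictionary bars and the sample values are all assumptions made for the example.

```python
import threading
import time


class FakeEventClient:
    """Hypothetical stand-in for the client class; this is NOT the ibapi API."""

    def __init__(self):
        self.orderId = None
        self.bars = []
        self.connected = threading.Event()     # set by nextValidId
        self.history_done = threading.Event()  # set by historicalDataEnd

    def nextValidId(self, orderId):
        self.orderId = orderId
        self.connected.set()  # wake up whoever waits for the handshake

    def historicalData(self, reqId, bar):
        self.bars.append(bar)

    def historicalDataEnd(self, reqId, start, end):
        self.history_done.set()  # wake up whoever waits for the data


def fake_handshake(client):
    time.sleep(0.1)  # simulated connection delay
    client.nextValidId(1)


def fake_history(client):
    for day, close in [("20240102", 100.0), ("20240103", 101.5)]:  # made-up bars
        time.sleep(0.05)
        client.historicalData(1, {"Date": day, "Close": close})
    client.historicalDataEnd(1, "20240102", "20240103")


client = FakeEventClient()

threading.Thread(target=fake_handshake, args=(client,), daemon=True).start()
if not client.connected.wait(timeout=5):  # blocks without polling
    raise TimeoutError("no nextValidId within 5 seconds")

threading.Thread(target=fake_history, args=(client,), daemon=True).start()
if not client.history_done.wait(timeout=5):
    raise TimeoutError("no historicalDataEnd within 5 seconds")

print("orderId:", client.orderId, "bars:", len(client.bars))
```

The main thread contains no sleep loop: each wait() returns as soon as the corresponding callback calls set(), and the timeout only guards against a callback that never arrives.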
Now let's take a look at the time taken to retrieve the data.
0.27 seconds instead of the 4.5 seconds above. Now isn't that some progress in "time optimization", which we have achieved by using the Event object for thread synchronization?
Event Object
The Event class provides a simple mechanism for communication between threads, where one thread signals an event while the other threads wait for it. When the signalling thread produces the signal, the waiting threads are activated.
The event object uses an internal flag, known as the event flag, which can be set to true using the set() method and reset to false using the clear() method. The wait() method blocks a thread until the event flag it is waiting for is set to true by another thread.
- is_set() method: Returns true if and only if the internal flag is true. (The camelCase spelling isSet() is a deprecated alias in current Python versions.)
- set() method: Sets the internal flag to true. As soon as set() is called, all threads waiting for the event are awakened.
- clear() method: Resets the internal flag to false. Subsequently, threads calling wait() on the event will block until set() is called to set the internal flag to true again.
- wait([timeout]) method: Blocks the calling thread until the internal flag is true. If the internal flag is true on entry, the call returns immediately. Otherwise, the thread is blocked until another thread calls set() to set the flag to true, or until the optional timeout occurs. The timeout argument specifies a timeout for the operation in seconds. The method returns true if the flag was set and false if the timeout expired first.
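The four methods above can be tried out in a few lines using threading.Event from the standard library. The variable names and the 0.1 second delays are arbitrary choices for this illustration.

```python
import threading

event = threading.Event()

initially_set = event.is_set()               # flag starts out false
print(initially_set)                         # False

timed_out_result = event.wait(timeout=0.1)   # nobody calls set(), so this times out
print(timed_out_result)                      # False

# Another thread (here a Timer) sets the flag 0.1 s from now.
threading.Timer(0.1, event.set).start()
signalled_result = event.wait(timeout=5)     # returns as soon as set() is called
print(signalled_result)                      # True

event.clear()                                # reset the flag for reuse
after_clear = event.is_set()
print(after_clear)                           # False
```

Checking the return value of wait() is what lets the caller tell a real signal apart from a timeout.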
The difference between the Event.wait() and Thread.join() methods is that the latter is pre-programmed to wait for a specific event, which is the end of a thread. The former is a general purpose event that can wait on anything.
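A small sketch of that difference, using only the standard library: the worker signals an event partway through its work, so Event.wait() can return while the thread is still running, whereas Thread.join() returns only once the thread has ended. The names ready, log and worker, and the 0.3 second delay, are made up for this example.

```python
import threading
import time

ready = threading.Event()
log = []


def worker():
    log.append("setup done")
    ready.set()        # signal a condition in the middle of the thread's life
    time.sleep(0.3)    # the thread keeps working after the signal
    log.append("worker finished")


t = threading.Thread(target=worker)
t.start()

ready.wait(timeout=5)              # returns once the worker calls set()
alive_after_event = t.is_alive()   # the worker is normally still running here
print("alive after Event.wait():", alive_after_event)

t.join()                           # returns only when the thread has ended
alive_after_join = t.is_alive()
print("alive after Thread.join():", alive_after_join)
```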
Conclusion:
I hope this article motivates you to think more carefully about how inefficient time.sleep() calls are in your multithreaded code. Using them for synchronization should be done away with completely in production code.