Friday, August 2, 2013

How to emulate Java "synchronized" keyword in C++

Objective


The objective of this article is to show how to provide the keyword 'synchronized' in C++ that works like the well-known 'synchronized' keyword in Java for locking and unlocking blocks of code.


Example:
Imagine you have a FIFO queue were you can add items from from different producer threads and there is another consumer thread that picks up the items for processing.


Listing 1:

01. public class ItemProcessor {
02.     private ArrayList<Item> queue = new ArrayList<Item>();
03.
04.     public void putItem(Item item) {
05.         synchronized(this) {
06.         queue.add(item);
07.         }
08.     }
09.
10.     public int processItem() {
11.         Item item = null;
12.         synchronized(this) {
13.             if (queue.isEmpty()) {
14.                 return 0;
15.             } else {
16.                 item = queue.(0);
17.                 queue.remove(0);
18.             }
19.         }
20.         return processItem(item);
21.     }
22.
23.     private int processItem(Item item) {
24.         int result = 0;
25.         // ... do something with item and set result
26.         return result;
27.     }
28. }

In Java you can clearly see which parts of the code are synchronized and which are not.

Motivation


During the years working as a professional Software Engineer I saw a lot of C++ code that uses spinlocks or mutexes that have to be locked and unlocked manually. Even in situations where the scope is very clear and stays within a single method or function this could be very error prune.

As most of you know, it's very important to have the lock and unlock calls balanced to prevent deadlocks or race-conditions.

Let's look especially at the more complicated method 'processItem' and how it would look like in C++ with the manual locking and unlocking.

Listing 2:

01. class ItemProcessor {
02.     private:
03.     Mutex mutex_;
04.     std::vector<Item*> queue_;
05.
06.     public:
07.     void putItem(Item* item) {
08.         mutex_.lock();
09.         queue_.push_back(item);
10.         mutex_.unlock();
11.     }
12.
13.     int processItem() {
14.         Item* item = NULL;
15.         mutex_.lock();
16.         if (queue_.empty()) {
17.             mutex_.unlock(); // If missing -> Candidate for a deadlock!
18.             return 0;
19.         } else {
20.             item = queue_[0];
21.             queue_.erase(queue_.begin());
22.         }
23.         mutex_.unlock();
24.         return this->processItem(item);
25.     }
26.
27.     private:
28.     int processItem(Item* item) {
29.         int result = 0;
30.         // ... do something with item and set result
31.         return result;
32.     }
33. }

As you can see, you need the, likely to be forgotten, unlock statement in line 17, to correctly balance your locks and unlocks.


As I am not only a C++ programmer, but also used Java very intensively in the past years, I was always very attracted to see how simple it is to work with synchronized blocks in Java compared to the inconvenient usage of manual locks and unlocks in C++  or Objective-C (Objective-C 2.0 has also "@synchronized" now).

Therefore I thought about a way how to extend C++ by a proper 'synchronized' keyword that is semantically equal to synchronized in Java. Furthermore I was curious if there is even a way to also achieve syntactical equality.

Evolvement of a solution


My first approach is to implement a template class called SynchronizedBlock, that takes a lock class L as a parameter during template initialization. The specific type of the lock class does not matter and it can be either a spinlock or a mutex as long as it provides the instance methods 'void lock()' and 'void unlock()'.

The template class looks like:

Listing 3:

01. template<typename L> class SynchronizedBlock {
02.     public:
03.         SynchronizedBlock(L& lock) : lock_(lock) {
04.             lock_.lock();
05.         };
05.         ~SynchronizedBlock() {
06.             lock_.unlock();
07.         }
08.     private:
09.         lock_;
10. };

If you modify the C++ implementation of the code in Listing 2 and use the new template class SynchronizedBlock as a helper for locking and unlocking from Listing 3, your doing nothing else than following the famous Resource Acquisition is initialization (RAII) pattern invented by B. Stroustrup.

Listing 4:

01. //...
02. int processItem() {
03.     Item* item = NULL;
04.     {
05.         SynchronizedBlock block(mutex_);
06.         if (queue_.empty()) {
07.             return 0;
08.         } else {
09.             item = queue_[0];
11.             queue_.erase(queue_.begin());
12.         }
13.     }
14.     return this->processItem(item);
15. }
16. //...

As you can see in Listing 4, the error prune line 17 as shown in Listing 2 is gone. Manual unlocking is not necessary, because the lock for the mutex of our local variable block is called within the constructor and unlock is called during destruction of block. In this case block is destructed either in line 7 or in line 13 (when the closing blue curly bracket is reached). To make this automatism happen, it is important to create the variable block on the stack and not on the heap (as you can see, there is no 'SynchronizedBlock* block = new SynchronizedBlock(mutex_);' statement). The second important thing you'll notice is that it's necessary to introduce the blue scope brackets to ensure the correct lifetime for our block variable. The destructor must be called before line 14 to ensure semantical equality to the original implementation in Listing 2.

Needing the blue curly brackets is the point that was still bugging me. In contrary to the Java 'synchronized' I have to define the lifetime scope of my synchronized block manually before I can create the block variable. Something, which makes the usage still a bit inconvenient.

Improvement of the syntax


To get rid of this syntactical flaw that is needed to ensure semantical correctness I remembered that C++ allows to declare variables within a for-loop followed by curly brackets for the loop body. I thought, could use this fact to my advantage e.g.

Listing 5:

01. //...
02. for (int i=0, c=5; i<c; ++1) {
03.     // do something
04. }
05. //...

The lifetime of the variables "i" and "c" are exactly until line 4 of Listing 5.

To achieve my goal to have a convenient syntax for synchronized block in C++ that should look like

Listing 6:

01. synchronized(mutex) {
02.     // do something 
03. }

it needed another helper template class called 'SynchronizeGuard' with the following implementation:

Listing 7:

01. template<typename L> class cSynchronizeGuard {
02.     public:
03.         cSynchronizeGuard(L& lock);
04.         ~cSynchronizeGuard();
05.         bool isLocked() const;
06.         void lock();
07.         void unlock();
08.     private:
09.         L& lock_;
10.         volatile bool state_;
11. };
12.
13. inline cSynchronizeGuard::cSynchronizeGuard(L& lock)
14. : lock_(lock), state_(false) {
15.     this->lock();
16. }
17.
18. inline cSynchronizeGuard::~cSynchronizeGuard() {
19.     if(state_)
20.         lock_.unlock();
21. }
22.
23. inline bool cSynchronizeGuard::isLocked() const {
24.     return state_;
25. }
26.
27. inline void cSynchronizeGuard::lock() {
28.     lock_.lock();
29.     state_ = true;
30. }
31.
32. inline void cSynchronizeGuard::unlock() {
33.     lock_.unlock();
34.     state_ = false;
35. }

With the help of the template class SynchronizeGuard and the knowledge about the scope of variables in a for-loop you can express the scope for the block that should be synchronized as follows (again taking the method 'processItem' from Listing 4 as an example):

Listing 8:

01. //...
02. int processItem() {
03.     Item* item = NULL;
04.     for (SynchronizeGuard guard(mutex_); guard.isLocked(); guard.unlock())
05.     {
06.         if (queue_.empty()) {
07.             return 0;
08.         } else {
09.            item = queue_[0];
10.            queue_.erase(queue_.begin());
11.         }
12.     }
13.     return this->processItem(item);
14. }
15. //...

Now, besides the fact, that the synchronization code and the blue curly brackets now have the correct order, this statement seems to be much more inconvenient to write than the code in Listing 4.

Hold on, even not recommended to be overused in C++, we have still the Preprocessor. Let's put that nasty line 4 from Listing 8 into a macro:

Listing 9:

01. #define synchronized(lock) \
02.     if(false) {} \
03.     else \
04.     for (SynchronizeGuard guard(lock); guard.isLocked(); guard.unlock())

Et voilà! Putting all together, the beautified implementation from Listing 2 can be expressed in C++ very similar to the code written in Java (Listing 1):

Listing 10:

01. class ItemProcessor {
02.     private:
03.     Mutex mutex_;
04.     std::vector<Item*> queue_;
05.
06.     public:
07.     void putItem(Item* item) {
08.         synchronized(mutex_) {
09.         queue_.push_back(item);
10.      }
11.     }
12.
13.     int processItem() {
14.         Item* item = NULL;
15.         synchronized(mutex_) {
16.             if (queue_.empty()) {
17.                 return 0;
18.             } else {
19.                 item = queue_[0];
20.                 queue_.erase(queue_.begin());
21.             }
22.         }
23.         return this->processItem(item);
24.     }
25.     // ...
26. };

5 comments:

  1. I didn't put much effort in configuring the layout! And, yes ... in my opinion: "Content is King!"

    ReplyDelete
  2. For the record (since I just came across this): C++11 has more or less exactly this functionality using RAII without using processor directives:

    std::mutex m;
    [...]
    {
    std::lock_guard l(m);
    // lock is automatically released after end of block because object is deconstructed
    }

    ReplyDelete
    Replies
    1. Yes, that's true. At the time I implemented this solution I was working with MinGW and a GCC 3.x or similar. If you take a closer look, I used the preprocessor directive only to mimic syntax. With the "#define synchronize" you can instantiate the guard-object before the curly brackets that enclose the synchronized code section. This syntactical sugar also can AFAIK only be added via preprocessor to your code. The implementation of the SynchronizeGuard on the other hand is also based on RAII. So the only difference between my solution above and your code is that you use a standard lock-guard instead of a custom implementation. Thanks for your comment!

      Delete
  3. Why the 'if (false) {} else' in the macro?

    ReplyDelete
    Replies
    1. The problem is that there was a bug in an older Visual Studio compiler, which allowed to have the scope of SynchronizedGuard even after the for loop and therefore the Destructor was never called.

      See also: What is the possible use for “#define for if (false) {} else for”?

      Delete