Sunday, 3 July 2016

Multithreaded programming - it's harder than you think


There are two often-quoted rules about optimization, sometimes told as a joke and sometimes taken seriously:

1. Don't do it.
2. (Only for experts) Don't do it yet.

While multithreaded programming is fairly common these days, I think the rules above apply perfectly to multithreaded (or multiprocessor, multicore, multicomputer or multiwhatever) programming.

The tools available today are great. They help make sure that everything works well - but if and only if they are used properly. The problem is that people are often lazy, take shortcuts in the name of speed, and sometimes just don't have any idea what they are doing. For a while, things seem to be fine. Then your load grows and bad things start to happen. Sometimes those bad things are masked by, say, a DBA component where the database hides the complex stuff. But at some point some very weird things start to happen.

Multithreaded programming is hard. Very, very hard. Even with good tools it is easy to make mistakes, or to make optimizations that will bite you later on.

I first worked with multithreaded programs back in 2000 or so. Most of the complexity was actually hidden from us (programmers, that is) by the tools we used, so everything mostly worked nicely. Until, of course, it didn't. Mysterious random crashes everywhere and whatnot. Back then the CPUs we used had only a single core, so it wasn't "true" concurrent processing, but OS task switching still provided plenty of fun times to curse at when things went bad again.

Slowly I learned about the system I was working on, the details of the complexity the tools hid from us, and multitasking in general. Multiple processes running concurrently, each running several threads of its own, with complex asynchronous communication between them.

And then, enter C library calls that affect the state of the entire process. Like, say, setlocale. Never, ever use those in a multithreaded program. Many C facilities, such as the aforementioned setlocale (or errno on older implementations where it is not thread-local), touch global, process-wide state that is shared by all threads. Picture one background thread using the "C" locale to parse text-based network data while the UI thread uses the user's locale to display the same data. That damn problem took me weeks to find (a very, very rare race condition - fortunately most of that time was hands-off, so I could get other things done) and some more to fix. For some reason I can't really recall (possibly a bad C++ standard library implementation), the usually preferred solutions were out of my reach or didn't work properly, so I had to resort to less-than-pretty workarounds.
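To make the setlocale problem a bit more concrete, here is a minimal sketch of the kind of approach that is usually preferred in C++: give each stream its own locale instead of changing the process-wide one. It is not the workaround I ended up with back then. The helper names parse_wire_double and format_for_display are made up for this illustration, and the sketch assumes a C++17 compiler with a working standard library.

```cpp
#include <locale>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: parse a decimal number from text-based network data
// using the classic "C" locale, without touching the process-wide locale.
std::optional<double> parse_wire_double(const std::string& text) {
    std::istringstream in(text);
    in.imbue(std::locale::classic());  // affects only this stream object
    double value = 0.0;
    if (!(in >> value)) {
        return std::nullopt;  // input is not a number in "C" locale syntax
    }
    return value;
}

// Hypothetical helper: format a number for display in a caller-supplied
// locale, again per stream rather than via setlocale.
std::string format_for_display(double value, const std::locale& loc) {
    std::ostringstream out;
    out.imbue(loc);  // the UI thread can pass the user's locale here
    out << value;
    return out.str();
}
```

The background thread would call parse_wire_double, and the UI thread would call format_for_display with whatever locale object it keeps for the user. Neither call changes state that the other thread depends on.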

Lesson learned. These days, whenever I have to deal with multiple threads or processes, I pay very close attention to synchronization, always erring on the side of caution. Someone might say I err on the side of paranoia, at the expense of execution speed, and it is true that properly done synchronization is expensive. But lock-free programming (which is even more difficult than typical multithreaded programming) is in most cases also safety-free: some corner case will come back to bite you at some point, unless you really, really know what you are doing. The vast majority of people don't, me included, so I don't even try to do lock-free.
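As a rough illustration of what erring on the side of caution looks like, here is a small sketch in which every access to shared data goes through a mutex, reads included. The names SafeCounter and count_in_parallel are invented for this example. It trades some speed for being obviously correct.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared counter: every access, even a plain read,
// takes the mutex. Slower than an unguarded int, but free of data races.
class SafeCounter {
public:
    void increment() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++value_;
    }

    long get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

private:
    mutable std::mutex mutex_;  // mutable so that get() can lock it
    long value_ = 0;
};

// Starts `threads` workers that each increment the counter `per_thread`
// times, waits for all of them, and returns the final count.
long count_in_parallel(int threads, int per_thread) {
    SafeCounter counter;
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
        workers.emplace_back([&counter, per_thread] {
            for (int j = 0; j < per_thread; ++j) {
                counter.increment();
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return counter.get();
}
```

Without the lock, the increments from different threads could interleave and some would be lost, which is exactly the kind of bug that only shows up once the load grows.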

So again. Somewhat adapted rules for multithreaded programming:
1. Don't.
2. (Experts only) Don't try to cut corners.


