Do locks slow down your code?

This is an episode of Thoughts on Functional Programming, a podcast by Eric Normand.

Subscribe: RSSApple PodcastsGoogle PlayOvercast

Yes. Locks slow down your code. But they enable your code to be correct! It's a tradeoff, but who would ever trade correctness for a little speed? In this episode, we look at the tradeoff, how to make locks less of a speed tradeoff, and some alternatives.

Transcript

Eric Normand: Locks slow down your code. By the end of this episode, I hope to give an engineer's perspective on the time trade-off of locks.

My name is Eric Normand. I help people thrive with functional programming. This is an important topic because locks are one way, an important way to achieve concurrency. That means sharing of resources between different threads. We need to understand their trade-offs.

This idea, discussion topic was something that was brought up by someone I was having a phone conversation with.

I was talking about how locks are an important concurrency mechanism and he brought up, "Yeah, but they slow down your code." It struck me as funny to bring that up because I did learn that in school. I remember specifically in class when we learned about locks at university that that was a primary thing that was of concern like, "Remember, these are going to slow down your code."

I don't know why people talk about that, because after years of actually building systems using locks and other concurrency primitives, it is just so clear how much more important correctness is than that kind of speed.

It might be an old-fashioned idea because maybe they weren't so fast. Computers weren't as fast back then, and maybe locks were considered pretty expensive, and so that university classes often lagged behind the technology a little bit. Professors are just teaching what they learned in school, and this person is older than I am so he probably learned that even before. His professors were even older.

It's one of those things that I think we have to get over. That there's nothing you want to trade-off for correctness.

If your program doesn't do what it's supposed to do, then it doesn't matter how fast it is. It just doesn't make any sense. That's why it struck me as funny. I just wanted to bring that up and re-emphasize it. Now, that's not to say that they don't slow down your code. They definitely do.

There's a few things that we have to talk about with that. Locks definitely are slower than letting your threads run without locks. There's no question about that. The problem is without locks or some other kind of concurrency primitive, you run the risk of what are called race conditions.

I have a whole episode on race conditions. Race conditions generally, in a nutshell, mean your threads are not sharing nicely. For example, they could be using the same variable to store stuff in, or the same mutable data structure. They're reading and writing at the same time over each other.

It's like you're trying to all color on the same paper at the same time. You're bumping into each other, and you're like, "Hey, I wanted that to be blue, now you just colored it red." You need some kind of way to share. I use this simplistic example it's kind of a kid example because it's like learning to share as a kid.

Like it's my paper right now, I'm going to color it. When it's your turn, you can color it. Obviously, that's going to slow down the kids. They're going to have to wait. At least you get some sense of order to the drawing. In some sense, it's going to be more correct.

There's one thing that you could do in this scenario to reduce that trade-off, reduce that cost of time, which is to reduce the amount of time you spend inside the lock. One thing that we do in Clojure is, because we're dealing with immutable data structures, you can actually calculate the value that you will store in the locked variable ahead of time.

Normally what you would do if you wanted to operate on some data structure, let's say it's immutable data structure, but you want to do it safely. You want to do it correctly with multiple threads.

You would create a lock that all threads they have to acquire the lock before they can modify that data structure. One thread is going to have the lock at a time. You try to get the lock, if it's already locked, you have to block, you just going to wait.

If it's not logged, you get it. Now you can modify the data structure. When you're done modifying it, you release. You might do 5, 10, or 100 operations on that data structure. The longer you go doing operations inside that lock, the more other threads are going to have to wait.

What you want to do is reduce the amount of time that you spend inside that lock. One thing that you could do is, instead of doing 10 operations inside the lock, you could, for instance, make a copy of the data structure without getting a lock or maybe you get a lock just long enough to make the copy.

Then you operate on that copy. It's your copy. You can do whatever you want. You don't need to lock. It's not a shared resource. Then you grab the lock and swap out that data structure for the one that's in there because you have the lock, you're allowed to do that. You're allowed to make modifications to it.

Now we do this in Clojure because we have immutable data structures that have built-in copy-on-write semantics. We don't even think about the copy. We are making a copy. We can also guarantee that nothing else is modifying it while we're reading it. Once you have a pointer to it, you'll know it's an immutable thing.

We have this built into the language. It's one of the things that makes Clojure really nice. Now, here's the thing, when you get that lock, you have to make sure that it hasn't changed since you read it. You made this copy. You had to read the data structure to make the copy. Has the data structure changed in another thread since you read it?

This is what's called compare-and-swap. You make this copy, you modify it. Then when you grab the lock, you have to check, "Hey has it changed since I read it last? Has some other thread changed it?" If it has, you got to start over. You're like, "OK, I'll read it again make a new copy and make modifications to this new thing."

If it hasn't changed, you can just set it right away. What this does is it lets you have a much smaller window of locked code, a much smaller mutual exclusion section. It's another word, mutual exclusion and lock are pretty much the same. It just makes sure that you are spending much less time in that locked state.

There's another trade-off here which is that you're going to be doing work on your threads. This is instead of the threads simply blocking and not doing anything. It's a trade-off that doesn't matter. Except there will be a little bit more heat your threads are working, so they are taking up CPU from potentially other threads working, so there is some trade-off there.

I also want to talk about how locks are error-prone. There's a lot of possibilities there for say, forgetting to acquire the lock before operating on it, before operating on this shared resource. There's stuff like how do you make sure that you release the lock when you're done?

You could forget to do that. If you've got multiple locks, you've got to lock them in the same order and every single thread. It becomes actually a pretty hard challenge and a lot of bugs in.

It's one of the reasons why even with locks, people think multi-threaded code is very difficult. It's because you've got these challenges now with reasoning about, "If I have this lock, what can I do? I need two locks. What order should I get them in?"

It's actually a pretty hard thing to reason about once you've got a real sizable system. In Clojure, we don't generally use locks themselves. We use primitives that are often built on top of locks.

The locking has been solved once and for all, and we can think at a higher level. We have something called an atom. It's probably the most commonly used Clojure concurrency primitive, and it does that compare-and-swap.

It's built on a Java class. I think the class is called atomic compare-and-swap. All it does is it stores a single immutable value and gives you an interface for modifying that value with retries if something else has modified it while you were modifying it.

It's probably got locks down...I haven't looked at the Java implementation, but it probably has locks down at the bottom so that you can do this compare-and-swap. Meaning has this value changed or has this pointer changed since I read it and made a copy?

If it has, I'm going to retry. It means I'm going to read it again and make a new thing. All of this is done with...You don't have to deal with the locks. It's a higher level of working.

Another common concurrency primitive is the queue. Instead of racing to grab the lock, and then whoever acquires lock first gets to go first, you could put the work into a queue and have a single thread operating on that shared resource at a time.

This is another way of sharing. It's taking turns, or you'd line up. This works a lot better when there's a lot more contention. Just imagine a crowd of people all trying to use the same...

I like to think of it like kids because kids don't know the rules sometimes or they break the rules. They're trying to share this toy. A few kids can work through the turn-taking. Whoever grabs it first, they get to play with it until they're done. Then they pass it on to the next person, and then that person gets to play with it.

Somehow, they manage. Once it gets to 10, 20 kids, some kid is going to be like, "I haven't played with it in like an hour. Please, can I?" This whole grabbing it, whoever gets it first isn't going to work anymore.

Now, you have to line up. You get in line. Whoever's at the front of the line, you get to play with it as long as you want. Then when you're done, you want to play with it again you've got to go back at the end of the line.

It has a fairness to it that you can ensure that things happen in a certain order, and no one goes without playing with the toy. This queue, it changes the nature of the shared resource.

That toy, yes, it's being shared, but only by one thread at a time. One thread runs that shared resource. Often, the way it's implemented in software is you have one thread that's called the worker thread.

Its tasks get put into the queue that operate on that shared resource in the worker thread. By putting something into the queue, you're given a promise or a future that will have the result of your work being done.

The thread can continue doing some other stuff. Then when the worker thread is done, it will put the value into the feature. Then the original thread can continue working on it.

Really, only one thing is accessing that resource at a time. You could say it's not sharing that resource anymore. It's only one thread using it. What becomes shared is the queue.

You get smart people. They sit down. They harden this queue. They make sure it works concurrently, and all the locks are in place. That queue becomes the shared resource because the threads are going to be adding tasks to that queue so that they're all adding them at the same time in parallel.

They're sharing. There's some discipline that's going on with the locks for how things get put in in order. I just wanted to go over these different ways of getting at a higher level than locks, because locks are just very hard to reason about.

When we're doing concurrency, it's all about shared resources. Locks are...Imagine it's like sharing a bathroom. If you have a couple of roommates, a little lock on the door of your bathroom can really help you share this bathroom safely.

You try to open the door, "Oh, it's locked. I'll come back in a little bit and try again. Someone must be in there." If you had 20 people sharing a bathroom with locks, everyone is constantly knocking on the door.

What if someone takes a long time in there? It becomes a mess. It's not enough, so you need some other system like a queue. Put your name on the board.

When someone leaves the bathroom, they're going to call the person's name on the board and whoever that is can come. You don't have to just wait and stand in line. It's also got an order to it so that things work out.

Recap. Locks do trade time for correctness, but correctness is something that you need. It's not like you're going to say, "Yeah, we'll just be wrong half the time, but at least it's fast." You just never do that.

You do need to understand that once you go concurrent, meaning you have multiple threads sharing this resource, you are going to trade-off a little time. There's no way around it.

Just like if you were sharing a bathroom, there are going to be times when you need to use the bathroom, but someone's in it, so you have to wait. That's just going to happen.

However, there is an efficiency there, because a lot of times the bathroom isn't being used. Why have a bathroom per person? It doesn't make sense. It also lets you scale up faster. Because the bathroom isn't being used most of the time, you can add more people and — though there might be some bottleneck times — most of the time it will work out, if you've got a good system. It's good for scaling.

The main thing you want to do with locks is reduce the amount of time inside the lock. That's the main thing. If you can get people to go to the bathroom faster, for instance, you've got your toilet in there, then after you use the toilet, you go to the sink, you wash your hands, then you dry your hands, then you open the door.

You could say, "Well, we're going to move the sink outside of the bathroom. Doesn't need to be that private." Now, we're doing less with the locked door.

People can be using this shared resource more effectively. We're doing less inside of the lock. I also want to say, because locks are so error-prone, they're the main reason why people think concurrent programming is difficult. You should look into other primitives that are probably built on top of locks but have a better interface.

They're less error-prone. They're actually, I would say, less abstract and more specific. Locks are basically a general purpose tool. Just like in your house, a lock is a general purpose way to keep a door closed from one side. What you want is something much more specific to your bathroom and how to share a bathroom, or how to share a kitchen, etc.

When you get more specific, there's less you have to think about. There's less that you have to do as a programmer to make sure it works, less discipline. You can encode the discipline in code. That's what these concurrency primitives do. I'd mentioned to you compare-and-swap. In Clojure, we call that an atom and queues.

Do yourself a favor. Go, research some concurrency primitives in your language. If you happen to be into Clojure, I have a big list of them for Clojure. Just search for Clojure concurrency. I am in the top, I'm not the number one right now, but I am up there in the top rankings. It's a purely functional.tv article.

Do me a favor, please. If you found this valuable, you should subscribe because then you'll get the next valuable episode that I'm doing.

I also like to get into discussions with people. If you have question, a comment, disagreement, agreement, I appreciate all of it. I read all of it. Email me at eric@lispcast.com. You can also message me on Twitter, just at mention me, I am at @ericnormand with a D.

I am also getting into LinkedIn. Find me on LinkedIn, Eric Normand, with a D.

Awesome. See you next time.