What are concurrency and parallelism?

What are concurrency and parallelism? What's the difference? Concurrency is functional programming's killer app. As we write more and more distributed systems on the web and on mobile, the sharing of resources becomes a major source of complexity in our software. We must learn to share these resources safely and efficiently. That's what concurrency is all about.

Transcript

Eric Normand: What do we mean when we use the terms parallel and concurrent? What's the difference, and how are they related? Hi. My name is Eric Normand. These are my thoughts on functional programming. In a recent episode, I talked about race conditions and the conditions under which you would get one.

You have two threads, or two timelines, that are sharing a resource. It's that sharing of the resource that can make things problematic. You could have a race condition. Concurrency is all about sharing resources.

A resource could be anything. It could be a global variable. It could be a database. It could be a network connection. It could be the screen you're printing to in your terminal. All these things are resources shared among different threads.

Concurrency is all about efficiently and safely sharing those resources. You could also say your CPU is a resource. You can have a task scheduler that allows you to share that one CPU among different processes. That's one way to look at it, but it all comes back to sharing these resources.

Parallelism is about increasing the number of things that are sharing the resources. That's increasing the number of threads. That's increasing the number of nodes in a distributed system. This is making something more parallel. Now notice it's very hard to make things parallel if they're not sharing safely and efficiently.
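
To make that dial concrete, here's a minimal sketch using Python's ThreadPoolExecutor. The fetch function and the URLs are made up for illustration; the pool's internal work queue handles the safe sharing, and max_workers is the degree of parallelism you're turning up.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Stand-in for real work that waits on a shared resource (network, disk, ...).
    return f"fetched {url}"

# Made-up URLs for the example.
urls = [f"https://example.com/page/{i}" for i in range(8)]

# The "degree of parallelism" is just the number of workers sharing the work queue.
# Turning max_workers up from 1 to 4 makes the same program more parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch, urls))

print(results)
```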

They are intertwined. Concurrency is about how we share things, and parallelism is about how many things are sharing them. The CPU is a little hard to classify because it's both the thing being shared and the thing everything is running on.

You could look at it like you're increasing the number of CPUs. That increases the number of threads that can actually run at the same time, without having to be switched out, because now you have real simultaneous execution. You're increasing the number of things being shared, but also the number of things running at the same time.

It's a big corner case when you're sharing the CPU, because that's also what you're running on. The main point is that concurrency is about sharing resources safely and efficiently. Safely, meaning without bad race conditions, the kind that could lead to the wrong result. Parallelism is about, once you have that safe sharing, being able to increase the number of things doing the sharing.

Different concurrency mechanisms exist. Most of them have parallels in the real world. A simple example is if you want to share a bathroom with people. You have six people living in the house. There's one bathroom. How do you share that safely and efficiently? Well, you could put a lock on the door. That keeps other people out while one person is using it. That's the safety.

It's still private. It's pretty efficient, too, because you can tell when the door is unlocked. You can go in and use the bathroom, then unlock it on the way out. It's pretty clear what the rules are. Now imagine if you had something like 12, or 18, or 25 people in that house. Maybe a lock is not going to work anymore.

Maybe you're going to start seeing, for instance, someone who has had to go to the bathroom for two hours because they just keep missing their turn. You're probably going to want something more robust to ensure that everyone can share that resource, not just the fastest people who can get to the bathroom.

Another mechanism we use all the time is a queue, lining up. When you want to order food at a restaurant, you get in line. Orders are processed in the order people got in line. That means everyone has a fair chance. That resource, the person who's taking orders, is going to be calling people up one at a time. It's fair. We intuit that it's a fair system.

This is a concurrency mechanism that's used all the time. Another one is something like a schedule. Instead of getting in line, you could put your name on a list. I know restaurants are doing this now in my city where you go up, and you put your name on a list. When the table is available, they'll text you. It's asynchronous. That means I can go take a walk while my table is being used.

Then, once it's free, I'm notified. Then I can go back. I don't have to just wait in line. It's a little bit more efficient. It probably lets you handle more people, I imagine. Especially since if it's going to be two hours before I can sit down, I'll probably go home and come back later when it's closer to the time that they're estimating.

These are all concurrency mechanisms that you can find in a computer. In the first case, there's a thing called a lock in programming. It's also called a mutex, which is short for mutual exclusion; a semaphore is a closely related primitive. It's a way of making sure that only one thread is accessing that block of code at a time.
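
A minimal sketch of that, using Python's threading.Lock and a made-up shared counter:

```python
import threading

# One shared resource: a counter that several threads update.
counter = 0
counter_lock = threading.Lock()  # the "lock on the bathroom door"

def increment_many(times):
    global counter
    for _ in range(times):
        # Only one thread at a time may run this block.
        with counter_lock:
            counter += 1

# Four threads and 100,000 increments each are arbitrary numbers for the example.
threads = [threading.Thread(target=increment_many, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 every time; without the lock it could come up short
```

The `with counter_lock:` block is the door lock: entering it keeps everyone else out, and leaving it unlocks the door on the way out.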

Then there are queues. There are multiple implementations, but the idea is that you put a value in the queue that signals the work you want done. Then, when the CPU is free to do that work, it pulls the next thing off the queue and processes items one at a time. There are also schedules and callbacks, things like that, like the text message when my table is ready.
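
In code, that might look something like this minimal sketch with Python's standard queue.Queue; the job names and the single worker thread are made up for illustration.

```python
import queue
import threading

work = queue.Queue()  # thread-safe FIFO: first in line gets served first

def worker():
    while True:
        job = work.get()          # blocks until something is in line
        if job is None:           # a sentinel value tells the worker to stop
            break
        print(f"processing {job}")
        work.task_done()

t = threading.Thread(target=worker)
t.start()

# Several "customers" put their orders in the queue;
# they are handled one at a time, in the order they arrived.
for job in ["order-1", "order-2", "order-3"]:
    work.put(job)

work.join()     # wait until every queued job has been processed
work.put(None)  # tell the worker to shut down
t.join()
```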

These concurrency primitives are really important. I have to say I'm not the only one who believes this, but concurrent programming, distributed systems programming, and parallel programming are really the killer app for functional programming. People were talking about having hundreds or thousands of cores on our machines, and how we'd have to start programming for them. That didn't really happen.

What did happen is that we're now programming distributed systems all the time. Your cell phone has an app. That app is talking to a server. The server is talking to a database and to three third-party APIs. It's all distributed. Not every piece is distributed, but most of them are.

Functional programming does really well with this because you can have distributed concurrency constructs. I won't go into those right now. That's why I bring up concurrency and parallelism: that's really what functional programming does best. It uses immutable values, which can be shared between different threads with no problems, no race conditions.

If they're immutable, it means they never change, which means we can both share the same copy because it's never going to change. What's the problem? We can both read it. My thread can read it, your thread can read it, and it's totally safe. The problem comes when you can modify it.
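
Here's a small sketch of that safe, read-only sharing, assuming Python threads and a made-up configuration tuple:

```python
import threading

# An immutable value: a tuple never changes, so any number of threads can read it freely.
config = ("localhost", 8080, "https")

def reader(name):
    # Safe: nothing can modify the tuple out from under us.
    host, port, scheme = config
    print(f"{name} sees {scheme}://{host}:{port}")

readers = [threading.Thread(target=reader, args=(f"reader-{i}",)) for i in range(3)]
for t in readers:
    t.start()
for t in readers:
    t.join()
```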

Then you start to get into questions like, "Well, if you're modifying it while I'm reading it, what's going to happen?" That's parallelism and concurrency. That's pretty much all I have to say on it.

If you want to get in touch with me, if you have a question, or you want to disagree with me, or give me some praise, or tell me I'm wrong, you can email me at eric@lispcast.com. You can also find me on Twitter, I'm @ericnormand with a D. Also, you could find me on LinkedIn and connect there. Don't forget to hit subscribe. I'll see you later. Bye.