What is associativity and why is it useful in parallel programming?
This is an episode of Thoughts on Functional Programming, a podcast by Eric Normand.
Associativity is an algebraic property that enables us to easily break up a job into smaller jobs, do the jobs, then recombine the results. Associativity is the essence of composition. In this video, we go over what associativity is, why we want to use it, when to use it, and 3 keys for making an operation associative.
Eric Normand: What is associativity? Why is it so useful in parallel systems? By the end of this video, you should have three keys to make an operation associative. My name is Eric Normand. I help people thrive with functional programming.
Associativity is an important idea. It lets you easily break up a job, do the work separately in different threads, and then recombine the answers without any trouble. I like to think of associativity as the essence of composition.
Let's get into it. What is associativity? It's an algebraic property, meaning it's a characteristic of certain operations or functions. What it means is grouping doesn't matter. Grouping, you can imagine as the parentheses in your mathematical expression. I could have something like a + b + c with no parentheses. It's ambiguous what the grouping is.
If I put parentheses, I could put parentheses around a + b or I could put parentheses around b + c. We know with plus it doesn't matter how a group it, which is why we leave out the parentheses because it doesn't matter. That doesn't matter is what associativity is. That's what the property that that is naming.
Another operation that programmers use all the time is string concatenation. Take two strings. Join them together into one big string. This is an associative operation. Imagine you had three strings. Just to make it easy to explain, let's say you have the string with the letter a, a string with the letter b, and a string with the letter c.
I'm going to concatenate them all together. If I put parentheses around a concatenated with b, and then I concatenate c onto that, versus taking and putting parentheses around b concatenated with c, and concatenate a on the front of that, you see I'm always going to get the same string at the end of the day. The string that I'm going to get is a, b, c. That's why it's associative.
Why am I even talking about this? It's a good question. Isn't associativity one of those math teacher things? I'm being a math teacher and teaching this thing that's unimportant in everyday life.
Well, it turns out that associativity is something we use all the time. We just take advantage of it, and we don't even think about it. It's something we learned as kids, as babies, operating in the real-world. It's just everywhere.
We learned it. It's become second nature. We don't think about it, we just take it for granted.
Associativity, why it lets you break it up, break up a problem? Let's say the problem is too big. You want to break it down into smaller jobs that you can get help with. You call some friends over, and you're all working on the same subproblems or different subproblems. It lets you recombine them later.
Let's say you had a long document that you needed to proofread. Let's say you needed to read the document and cross out any words that you don't want to be in there. It's a simple operation, but it's 2,000 pages long. It's due in two days. You just don't have time to do it.
You get some friends, and you take the stack of papers. You divide them roughly — let's say you got three friends. Now you divide it in four. It's three friends plus you. You divide it into four roughly. It doesn't matter.
You just remember what order. You had the first stack. You had the second stack. You had the third stack. I'll take the last stack. Now you just work independently, crossing out those words. Then you put them back together in the same order. The work is done.
It's associativity that lets you put the things back together, just like a string. I have a string with 1,000 characters in it. I could break it up into multiple strings. Then later, put them back together. That's what allows us to do that.
The order might still matter. In the case of this 1,000 page document, yes, the order still matters. You don't want the pages out of order, but we can all work independently. When we put the answers back together, we should have the correct result.
It doesn't matter how I break it up. I could give the first person...We said it's 1,000 pages among four people. I can break it up into exactly 250 pages each, four people 1,000 pages total so that's 250 each. I could be rough and give the first person 245 and the next person 255. It's not worth counting them. It takes just as long to count them as it does to do the work.
Because it's arbitrary, I can break it up anywhere. I know I can recombine it. I know this seems like a silly property to go on and on about. Imagine if you had this property in some work that you had to do. Imagine you had some big task to do. It might be too big to do on one machine or too slow to do on one machine so you want to break it up.
You want to use all your cores. You don't want to do it the naive way of just starting on page one, and going to page two, and then page three, and just one at a time. You want to break it up. Not all things can be broken up and put back together.
Having an associate of operation that lets you recombine the answers together is a requirement. It's not guaranteed that you have that. You have to make it that way. I like to think of associativity as the essential idea behind composition where you take two things and you put them together, and you make a new thing of the same type.
This is composition at its most primitive essential form. Just like you have two stacks of paper; you just put one on top of the other. You still have a stack of paper, but now it's combined. It's the same with two strings. Take two strings and combine them together, new string. It's the essence of composition. I promised I'd give you three keys for making an operation associative.
I hope I've made the case for why associative operations are really interesting. The "why" I gave is for parallel work. It also turns out that because they're the essence of composition, they're really nice for design. It means you can mentally, not just on your threads, but mentally break up the problem. That means you can work on it recursively.
It means you can encapsulate and say, "I'm just going to focus on this part because it's actually hard to think of how to solve the whole problem. I can break it down into something small enough to reason about in my head." It's nice for design. It turns out that a lot of things are much nicer if you have an associate of operation.
How do you do it? The three keys, really. The first thing you have to remember is that it is a binary operation. An associative operation is necessarily binary. If you have one that takes multiple arguments like in Clojure and most Lisps, a plus will take zero, one, two, three, four, etc. an infinite number of arguments, and plus is associative. Why is that? It's because the grouping doesn't matter.
You could write out the grouping tree, the parenthesis tree and nest them and stuff. It doesn't matter so we just put them all in one level. That's a syntactic shortcut. Plus is a binary operation. Addition takes two numbers and returns a number.
When it's binary, the two arguments have to be of the same type. Like I said, plus takes two numbers. String concatenation takes two strings. Another one is merging hash maps takes two hash maps. These are all associative. They take to arguments of the same type, and then they also return the same type.
That's what lets you nest them arbitrarily. That's what lets you group them arbitrarily. That a + b is going to give you the same kind of argument as c, so that you can combine it again using the same plus operation. That's the first key. Two arguments, same type, and the return type is the same type.
Number two, one way you could do it easily to make an operation associative is to use an existing associative operation on a data structure. You use the data structure that already has an associative operation. I've talked about, for instance, hash map merge. That's an associative operation. Try it yourself. You'll see that you can merge three things and group them arbitrarily.
If you build your data structure out of hash maps and you use merge when you're defining your function, your operation, it will be naturally associative. Number three, this is a big one and I hope I don't get too abstract with it. When you're giving work to your friends, let me give a little preface.
When you're giving work to your friends or I wanted to say someone that who's working under you, so you're delegating work. What you want, the ideal is to give them the entire job so that they can do it totally independently and not have to come ask you a million questions. That is, "Oh, do you have this thing? Do you have that thing?"
You want to be able to give them the work so that you can go do something else at the same time. You don't want to have to sit around and answer the questions. That same concept applies when you're dealing with associative operations. You want the string, so you break the string up into two so that you can work on it separately.
Each string, both of them, they contain everything that you need for that work. If you're counting the letter a in this string and it's too big to do in one pass, you break it up into two, and now one thread is doing counting a's in the first string and one is counting a's in the second string. That string has everything it needs. It knows its own link. It knows all the characters in it.
It's silly to think about like of course it has everything it needs, but do your data structures, do your things have everything they need. Here's an example of a time when you don't have everything you need. If you're a calculating an average, often what people do is they have a sum. Average is a sum over the count.
They have a sum and they loop through the array and add numbers into that sum, and then they get the count from the array and then then they divide it. Notice, you are doing these two things separately. You are summing separately from counting. What you want to do is have each value, each number in that array be a self-contained value, self-contained average.
This array of numbers, think about it this way. If you took that array of numbers and you broke it up arbitrarily and said, "OK, you average this part, I'll average this part, and then we'll combine them together." How can you do that? Because if the answer you get is an average of seven numbers in the operation and the average I get is an average of eight numbers, it's not apparent how to combine those.
If you know that it's seven and eight, then you can. If you don't know, if you just know the number, you cannot safely combine those two averages. What you need to do is instead of dividing it out and getting the average, you bundle the sum and the count into say, a new data structure. Let's call it just a tuple, because it's just a pair of numbers. It's a quotient.
Eventually, you're going to divide them, but before you do that you're going to leave it like this, as two separate numbers. How do you do this? Instead of having an array of numbers, you map over that array and turn each one into an average. It's an average of one number, so it's the number and a one. It is a tuple of the number and one of all of those numbers in that array.
Now you can safely split up that array, hand one half over to your friend, you do the other half. What do you do? You take two of these averages and you add that, you add the sum part, and you add the count part. Now you have a new tuple. It's a combining operation. It takes two tuples and returns a tuple. That's done. Now it's an associative operation.
That means you can split it up over multiple threads safely, recombine them arbitrarily. There you go. That's something that I like doing a lot. It's like you squinting and thinking hard, like, what are all the pieces of information I need in this value so that I can do this without having a variable over here that's got some magic or some partially completed calculation that I'm going to use during the calculation?
It's how we usually do it, but it's much better to move to the associative side. Those are our three keys. I'm just going to recap the whole thing. Associativity is an algebraic property where grouping doesn't matter. It lets you break up work, solve, get an answer, and then recombine the answers. I like to think of it as the essential nature of composition because of that idea. It's combining two things into a bigger thing.
The three keys of how to do this. Know that it's a binary operation, the two arguments are the same type and the return is the same type as the arguments.
You can use existing data structures with associative operations to build a new associative operation and bundle the values together, to make the work complete by itself. I want you to do yourself a favor to learn this idea. Go find existing associative operations and the data structures they belong to. I'll give you some examples. Just play with them in your own code to test whether they are associative.
It's very easy to test if they're associative. If you can't do (a + b) + c = a + (b + c), then it's not associative. You got to find some way of making it not associative. String, array or a list concatenation. These are all associative.
Addition and multiplication are associative, but subtraction and division are not. Map merge, merging two maps together. Function composition, if you've got that in your language, give it a try. Function composition, if you compose three functions, it doesn't matter how you group them. AND and OR with on Boolean, those are associative.
Set union is associative and set intersection is associative. Give that a try. Test those out in your languages. If you liked this video, if you liked what you've learned, if you found it useful, I would very much appreciate if other people could find this information useful, too.
Like, Share, Comment and certainly subscribe because then you'll get the next one which I guarantee will be close in usefulness to this video or this episode, so you'll get that one too.
If you have any questions, if you want to discuss anything with me, if you want to suggest a topic for the next episode, please email me, firstname.lastname@example.org. You can also find me on Twitter. I love getting in discussions on Twitter, @EricNormand, with a D.
Look me up on LinkedIn, and we'll connect up. Awesome. See you later.