How do we represent relationships in functional programming?

This is an episode of Thoughts on Functional Programming, a podcast by Eric Normand.

Subscribe: RSSApple PodcastsGoogle PlayOvercast

Functional programmers tend to model important relationships using data, while OO programmers tend to represent them with references.

Transcript

Eric Normand: How do we represent relationships in functional programming? Hi, my name is Eric Normand. These are my thoughts on functional programming. There's certain relationships in the real world that we would like to capture into our information system so that we can process them, derive other information from them, etc.

One of those things is, for instance, parent-child relationship. If we're making a genealogy database, we want to capture that parent-child relationship that is an important factor in genealogy. That's the central factor.

In an object-oriented language, an object-oriented style, typically what you see is relationships are represented with pointers, with references. This object has a field that refers to another object. By having that reference, you can send messages and things. That reference very often represents some relationship like, "This is my child."

The main problem with using a pointer is that it is unidirectional. Parent/child goes both ways. If I'm your parent, you're my child. The child needs a pointer back to the parent to be able to maintain the relationship. You have two things. If they're mutable, this is the problem. You have two things that you have to keep in sync.

In functional programming, what we tend to do is to reify the relationship into some data. It's not like you couldn't do this in object-oriented programming. I'm just saying that typically you don't see this. What you do is you reify it into a piece of data.

Take a tuple, let's say a two tuple, also known as a pair. You have a pair of pointers. It's immutable, just to make it easier to reason about. The first element of the pair is the parent. The second is the child. Let's say you're not even using pointers. You're using some other identifier that's more stable, like the parent's ID and the child's ID.

You have this relationship and notice it's got both ways. You can ask for who's the parent and who's the child. You can also do all sorts of other stuff with it, which is pretty cool, too. If you have a bunch of them, you can now consider this a relation in a relational algebra.

You can join and recursively find grandchildren, stuff like that. You could do this in object-oriented programming. Typically, what happens is in object-oriented programming, we tend to just focus on the one class and say, "This is what it needs to know."

We're not elevating the relationship to its own object. We're talking about a person has a parent and a person might have children. You have these pointers and that's it.

Let me give another example that shows the mindset difference. This is an example that I remember from school. I bet other people have had this in school, too. The assignment is to develop a class model for a student registration system. Students need to register for multiple classes, and those classes need to have a roster of the students enrolled.

You noticed it's got both directions. That students need to remember their classes, and the classes need to remember their students. The challenge is how do you create a message protocol between these two classes such that it maintains both of the pointers because you have some recursions?

If you say, "The students registers for the class." The student has a register method you pass in the class. It sets its own pointer. Then it calls a method on the class. [laughs] The course, let's call it. The course class. It tells it, "Now, put me on your roster." You have this double message pass thing going.

Then that one will set the courses, will add to the course's list of students. That way, you maintain the relationship. If you unenroll, you have to do the opposite. You have to remove it and then tell the course, "Remove me." I remember doing this assignment and thinking, "This is very convoluted."

At that point in my experience level, I felt like, "Well, this is just the way it's gotta be. There's going to be some convoluted stuff in programming." When I started learning functional programming, what became very apparent was that this was just looking at the problem inside out.

What the important thing that you're trying to capture that is in the assignment is the relationship. It's not the course and the student. The important part is the relationship.

For instance, if you were to look at how you would implement this in the real world without a computer, let's say with just pen and paper, there would be no course you could talk to. There's no course to send a message to. What there is is a book somewhere. Somebody in an office is sitting there with a book.

Now, I'm imagining how you could implement this. A student comes in and says, "I want to take English 101." They come into the room. The secretary or whoever manages this book, flips the book open to the page for English 101, takes down your student ID, and that's it. You're in the course. You're on the list.

Then maybe if you want to see your schedule, they could maintain a schedule for you where you give them a card. It's not about the student having some power to send a message to a course. It's about the data in this book. That is the canonical source of that relationship.

What we do in functional programming is instead of simulating a person and a course, we simulate the book. Maybe even enhance it. Now, we've got infinite space and what have you.

How do we do it? Even in an object-oriented language, if you were to take this approach, what you would do is you would make a class called many-to-many relationship. You could just make a class called course registration manager or something like that. [laughs]

If you notice, the general case is this is a many-to-many relationship. A student has many courses. A course has many students. It's a many-to-many relationship. You want a set of rules and operations that follow those rules to be able to add people and add courses in such a way that it maintains this relationship.

It would probably just be pointers in the same way. We're still in the computer. We're still dealing with pointers, but maintained in such a way that they are put together, like in a tuple. You can see that this is the course. This is the student. You could query the class.

The many-to-many class would have a method for give me all of the courses for the student, and give me all the students for this course. How do you represent a student in a course?

Here's the thing. In that book, they don't create a student. They don't draw a picture of a student or something like that. They just take down your ID. That's enough. That's all you need. That will get you back to the student when you need to.

That's what I would put in that many-to-many relationship. It'd be the student ID and the course ID. That's all you need. You're maintaining that relationship and you don't have this convoluted system where you have to communicate between objects to maintain this double-sided relationship. [deep breath]

That to me is one big difference between a functional mindset and an objectorial mindset. We are trying to represent that as data and turn it into something that you can query, you can analyze, you can combine with...if you said, "Well, I want this." Just think about it.

https://twitter.com/ericnormand/status/1079045176511918080

You don't want to have just one book because only one person...you have a long line when it's time to register. You want different people to be able to register people. Then you deal with the problems later like if the class is too full, there's three extra students because they weren't coordinating.

Well, you can just look at the last ones and you say, "Hey. We're sorry. You thought you were registered but you weren't, right?" That's just a way you have to deal with distributed systems, but you could make it so that you can have an operation for combining different many-to-many classes into one.

This one maintains some of them on this node in the network, on this node in the network for maintaining other ones and then you combine them into one. That's your canonical source. [deep breath]

OK. Maintains relationships not with pointers meaning references to other objects that you have to...you're obviously using pointers, right? Because it's in the computer and that's just how we do things with computers. It's point to memory, but you are maintaining the relationship as a thing itself, not as an implicit effect of calling certain methods. Does that make sense?

In the student and...am I being clear? In the student and the course example, when you make a student class and a course class, there's a point where only the student's pointer is set and the course's pointer hasn't been set yet and it's still waiting to send that message and they have everything work out.

It's not atomic, for sure, but it's only implicit based on the order that thing's happening, right? It's not guaranteed by the structure of the data that you're using. It's guaranteed over time based on just the order that thing's happening.

OK. Please let me know what you think. This is one of those examples that when I realized what happened, what I was thought and how it doesn't really correspond well with the real world.

It's one of those examples that was really good for me to rethink and revisit my classes in college when I was thought this stuff and root out that idea that you want to have a class for everything and that you should maintain pointers as relationships.

This was really one of those example ones that I had to solve and I did caught a thing where I just iterated on it. I iterated on the problem in a functional way over and over until I rooted it out, all of that objectorial thinking.

Maybe you want to do that too. Maybe you want to see how you would implement this. Maybe naturally, like the way you are now and based on your training. Then also really go deep and try to get it as functional as possible.

All right. My name is Eric Normand. Please let me know on Twitter what you think. You can also email me at eric@lispcast.com. I would love to some reviews on iTunes if and on comments on YouTube, wherever. Just subscribe and you'll be notified of new episodes. I'll see you later.