What is a total function?

This is an episode of The Eric Normand Podcast.

Subscribe:

Total functions are functions that give you a valid return value for every combination of valid arguments. They never throw errors and they don't require checks of the return value. Total functions help you write more robust code and simpler code, free from defensive checks and errors.

Transcript

Eric Normand: What is a total function? By the end of this episode, you will know how this idea helps you write more robust systems.

Hi, my name is Eric Normand and I help people thrive with functional programming.

This is another important idea for math that we can borrow in our programs. It's used often in functional programming. We're not talking about totals functions. We're talking about mathematical, pure functions. These are calculations, they don't have side effects.

However, it does apply to actions, to side effects, if you want. You can take this concept, it's a flexible idea, and apply it to those things. We'll just be talking about it for pure functions, mathematical functions, in this episode.

All right, so let's get started. What is a total function? Quite simply, a total function has an answer for every combination of valid arguments.

That kind of asks a new question — what is a valid argument? We're going to say that in type languages it is very clear. If you've got static-type...

[audio skip]

Eric: ...arguments to a function are all values that belong to the argument types.

If it says int — whatever that means, in whatever language you use — then every possible int, if it's an int, then it's a valid argument to that function. If it say string, that means every possible string. Any way you can make a string, then that's a valid argument.

For untyped languages, it's a little bit more complicated. Usually when we're programming in untyped languages, the types are in our heads, and we have some unwritten, implicit, informal idea of what we mean as valid arguments.

That's what I mean, these informal rules about what are the valid arguments. That's what you've got in an untyped language.

If you can make them more explicit, if you can have some kind of checking on your arguments, some kind of preconditions, contracts, some kind of assertions, whatever you want to put in there to claim this is what's valid, all those things will be helpful for making this more useful.

That said, I myself, for my own code, when it's just me, I kind of know what the valid arguments are. I tend not to use too many assertions and things. Sometimes, I do. They're helpful sometimes. I just want to put that out there. We'll discuss this a little bit more, later in the episode. It's a hairy discussion.

Total function has an opposite. Just to clarify again, just to repeat, for every valid argument. If it says it takes ints, I can pass any int and get an answer. That's a total function. The opposite is a partial function. Maybe one of those ints doesn't work.

A classic example is division. They type of division, let's say, it takes two numbers and returns a number. Divides the two numbers and then returns a number, but you can't divide by zero, so it throws an error or it does something. Returns some infinity, or not a number, or something like that. Returns nil, null, whatever.

All those things are kind of like the function line. The function said I take two integers and I return an integer, but it didn't. It returned something else or threw an exception. It didn't even return. It broke when you passed a zero for the second argument, for the...

[audio skip]

Eric: We can see how this creates problems. First of all, robustness. If you have a divide by zero error in your program, it's going to fail in production and it's not going to be as robust. It's is not correct.

If you do want to make it correct, what do you do? You put an if statement before you divide, just to check. That makes more conditionals.

This is what the type system is supposed to be handling for us. We don't want them check that the arguments are valid. We want to just run division, and we want the compiler to find these problems. This is in a typed language.

Another example of people doing this — I see this occasionally and it really bugs me, because I really try to do total functions as much as possible — let's say you have a person class, and there are two subclasses. There's employee, and there's volunteer. Employees get a salary. Volunteers don't.

On person, there is a method called salary. That means you can call salary on employees and volunteers. When you call salary on employee, it gives you a number. It's like \$10,000 a year or whatever. Then on volunteer, it throws an exception. The message says something like, "Volunteers don't get paid." That's the friendly error message.

To make this total, you would need the salary method to return a number in every single possible case, instead of throwing an exception. That means that the volunteers' salary should return zero. That's the mathematical definition. I know there might be some reason why you didn't return zero there.

You threw an exception, because you're probably doing something wrong somewhere else in your code, and you want to signal, "Hey, we're trying to get the salary of some volunteer. That's probably an error." That means like, "We don't know if they're a volunteer or an employee. We've lost track."

You know what? I think that you're modeling it wrong. If only employees have a salary, the salary method shouldn't be on person, it should be on employee and have your compiler check that.

What a subclass is supposed to mean is that you can forget what the subclass is. You just care about the main class that it's all subclassing from, the superclass.

If you don't know that it's a volunteer, that's supposed to be OK. Just saying that this is an issue that can come up, even in something like an object-oriented model, where you have different methods that you think should be called. Maybe it's an indication that your method is in the wrong place.

That's one way to handle it, is to change the definition of what valid means for the arguments like we did with this person. It's not valid to call salary on just any person. It has to be an employee.

Another way that you can handle this is just make sure that you handle all the cases. Make volunteers' salary method return zero. That's it, just make it happen. Don't use nils and exceptions. Find a value that makes sense.

Zero happens to work in this case because their salary actually is zero. Sometimes, people will return a null instead of an empty array when there's no answers.

It normally returns an array of something, an array of answers. If there's no answer, it returns null. Why don't you return the empty array? In that case, that's still correct. Now, that function is total, and it works in a wider range of cases.

You don't have to check the return value. You can call length on it and know that there's zero things in there. Likewise, you can replace all those exceptions, trying to find some value that works within the already assigned return type.

That's not always possible. Sometimes, you're going to have to augment the return type. I guess that's number three — augment the return type. What does this mean? You could make the empty case explicit. Change your type so that it has an empty case.

This is in an untyped language where you're dynamically typed. You're a lot more flexible. Make nil explicitly accepted. I do that in Clojure. I say it's either a number or a nil. This is what numbers mean, and this is what nil means.

I don't want to overload nil with too much meaning. In this specific case, this specific type that I'm returning, number means this. Nil means this. I don't do that all the time, but sometimes I do.

Now, if I'm returning something like a variant, a tagged union, I might add a case to that variant. That's like no answer. You could use a maybe or option type, instead of using null. If you're going to throw an exception, change the return value to a Maybe Int.

If you had division, you could make division total by saying, "Division doesn't return a number. Division returns a maybe number." If you could pass zero in for the denominator, I'm going to return nothing. You're going to have to check that.

Optionally, if there's errors and things, you can make those part of the type. In Haskell, you would use an either, but somehow fold the errors into it. The way I imagine this is if you have a HTTP client, usually you can have it configured where it never throws an exception if the HTTP request doesn't go through. You can have it actually return the error as a response.

If the client got in touch with the server and the server had an error like a 404 or a 500, that is a valid response. It's an error, but it's a valid response. The error modes have been baked into the response type. There's codes, status messages, and things like that.

Even some clients — the good ones — will turn a time-out, which means they don't even talk to the server sometimes. It'll turn those into the same kind of error type.

The status 700, which isn't a real HTTP type of status type, but we'll call it status 700 time-out. Something like that. Then you can handle that in the same way with the same kinds of logic as you would handle even a valid response, a correct response.

The last way to handle these partial functions and make them total is to augment the argument type. In our case of division, instead of saying it takes two integers or two numbers and returns a number, we say it takes a number and a number that's not zero. That's a type — number that's not zero — and it returns a number.

Now, we have to make this new type called number that's not zero. Somehow, somewhere else, we will make one of those. When we make it, we check to make sure it's not zero.

Now, this function knows that it trusts this type. It's not going to be zero. It can unbox it and divide. It would call whatever internal division that's unsafe partial function inside.

Here's a warning. These last two things where you're augmenting the return type and augmenting argument type, they still need to check somewhere. They're just deferring the if statements. Before, you were doing unsafe division. You weren't checking if it's zero. You're going to have to catch that error.

That's like a conditional. Somewhere it's like, "Oh, if it throws an error, do this. Otherwise, you get an answer, do this." It's a branch. You have a branch. Now, if you turn it into a maybe — the return value becomes a maybe number — you still have to branch. You still have to say, "Is it a nothing or is it a just number?" You still have to do that branch.

It's not reducing the complexity, really. It's deferring it somewhere else. You still have all these other ways of dealing with maybe. For example, it's a functor. It's a monad. You can use it in other ways. It fits in with what you're doing elsewhere in a nice way maybe. It's a little bit better.

Likewise, if you augment the argument type, you still have to make one of these things somewhere else. You probably don't have one when you need it. You have a number, and you need to make a number that's not zero, however how you make one of those things.

When you make one of those, something's got to check if it's zero. What if it's zero? It's going to return a nothing. There's a maybe on the other side, too. You still have the branch. It's still in there. It's just now you have the type system on your side.

You can push all these if statements out to the edges. That's the metaphor that people like to use — that there's edges. When input comes in, the user types something in, check it for all these things — the zeros, etc.

Now, in the happy path where it's not zero and you pass it through, there's no checks. It's all good. When it comes out, you got an answer. You're pushing it to the edges. All the way, as close as possible to where the user input comes in or to where the user output goes out. Those are called the edges.

You're deferring it, pushing it out, so you're able to create a space in the middle where your functions are all total, where everybody's happy. Where you don't have to check things before you call methods on them. You don't have to worry about error cases, etc.

Of course, this is a design decision. It is up to you to decide how far you want to push stuff to the edges. Whether it's worth it in particular cases, etc. A lot of people push it really far to the extreme.

Some people don't mind checking if a number is zero right before they divide. Some people don't even check and they just let it throw there, and say, "Well, we're just going to have errors." That's fine. It depends on your software and what you need to do with it.

I do want to come back to this idea that types do help with totality. I'm going to use Haskell as an example. Haskell has several built-in functions that are partial. I talked about division. It is partial in the base Haskell language. Also, head, which gives you the first element of a list. If the list is empty, that throws an error. That's partial, this is the built-in function called head.

In general, Haskellers, they either look at that as a mistake or a wart on the language because they want total functions. They might rewrite head to turn it into like it returns a maybe. Something like that. They might call it maybe head or something like that.

Type systems do help in a number of ways.

One is they give you this way of defining what is valid so that the compiler checks it. We talked about how difficult it was in a dynamically typed language to even talk about what valid arguments means. Type system gives a definition of what a valid argument is.

It also will, in many cases, be able to check whether you've handled every case. If you have what's called a tagged union, or a variant, or you have multiple cases, different constructors for the same type, there's a flag you can set on the compiler that says, "Make it an error if I don't handle one of those cases."

That's the compiler being on your side, helping you make total functions.

Types can help with these total functions. The language can help as well, for instance in Haskell. Another design decision of the Haskell has nothing to do with the type system, but there's no null pointers in Haskell. That's really nice.

Null pointers were a mistake. [laughs] The inventor of null pointers admitted this. They didn't make the same mistake in Haskell. They learned from that, so there's no null pointers.

Nulls, especially in languages like JavaScript or Java where you have a type system...Like in Java you have a type system, but you can always put a null basically anywhere. The type system won't help you with that at all. That sucks.

As far as dynamically-typed languages go, I do want to say that this is one of those sore spots in dynamic languages where there's one way to look at it, which is that every function accepts any kind of argument.

It's true. [laughs] In a certain way of looking at it, I can go into my Clojure REPL and type plus, string and null. It will try to add them and it'll say, "Ah, these aren't numbers. Bleh."

It's true. I could run it. I did run it. I got an exception, so in some way every function is partial because you can always find some value of some type that wasn't expected that will throw an exception. It's true.

Except, that most...This is the same of object-oriented languages that are dynamic, you can send any message. There's no checks on what messages you're allowed to send. Very often, you'll get something...in JavaScript, you get undefined is not a function because it couldn't find that method in the object. Or you'll get method not found or whatever it's called in your language.

That's another ding where it says, "You can pass any message to anything." Everything's partial. Every method is partial because you can find some object that doesn't answer that message. Yes.

But there's another way to look at it which is that there is an intended type for every argument. Sometimes that type is really complicated, and sometimes it's not specified. Sometimes, it is not explicit, but there is an intended type for every argument.

Does that mean that they're all total? No, not saying that either. It's a lot more total [laughs] than with the other perspective. These are things we're used to dealing with in dynamic programming, dynamic-type languages. We're used to thinking the compiler's not going to help me. I got to remember the types.

Well, that's all I want to say. That is the extent that we have a good definition of valid in dynamically typed languages. There is an intended type. It doesn't mean it works everything.

For instance, I think in JavaScript, in Clojure, divide still takes two arguments, two numbers that still the intended type, and it sometimes is an error. It's still undefined for divide by zero. It's just the way it is. It's the same in Haskell.

Yeah, all right. Let me recap. It's been total functions. It's very useful for robustness if your functions are total. You have to do fewer checks. You're not going to get the errors.

If you're using only total functions, no errors, none. It's not possible. Maybe you get something else. Something like out of memory or stack overflow, something like that, but you're not going to get error because you chose the wrong arguments.

Total means you have an answer for every combination of valid arguments. The opposite is a partial function. Usually it means it throws an exception, or it returns null or nil, or some other error thing it does.

Why? It gives you robustness. It also improves your simplicity. You just have fewer conditionals within your center, inside the edges. A lot of times you're just pushing conditionals to the edge, but inside you've got this nice total, no errors, no conditionals. Everything works right inside.

I went over four ways that you can increase the totality of your function. The first one was that you could just move stuff around because things are on the wrong types, or classes, or you just designed it poorly, not using the constraints of the language properly.

The second one was sometimes you just mistakenly forget to handle a case. You thought it should be an exception, but really, within the stated type or the intended type, there is a valid response for that. Salary on a volunteer should be zero or it shouldn't even be on there.

The third one is you can augment the return type. If it turns out that there isn't a good response, you've got to change the type. You've got to make it a maybe. Maybe you can add cases to your type so that you're explicitly calling out the empty cases.

You could fold the error conditions, the failure modes, into the type, just like HTTP does. It uses the same response format for 200 responses as it does for 400 responses. Still got headers, still got a status code, still got a body. It's all the same.

The fourth one is to augment the argument type. You could say this is a number that's not zero. It's somehow somewhere else you're making sure that those are getting created correctly, but this function doesn't have to worry about that and that's nice.

When you do this, you can push these conditionals, checks, and things out to the edges. If you're checking stuff from the arguments it's going closer to user input and you're pushing the return type checks further out into where it's getting output.

If you enjoyed this episode, if you learned something, there are a lot of other episodes. You can find all the past ones at lispcast.com/podcast. All the past episodes have audio, video, and text transcripts. Whatever medium you're into, we've got it.

You'll find links to subscribe, including the video, the RSS for the text, also the RSS or iTunes link for the podcast audio. You can also find social media links at that same address lispcast.com/podcast.

Please get in touch. I love discussing this stuff with people that's why I'm doing it, broadcasting, trying to find people who I can relate to out there. It's a lonely world. Got to find people that like the same stuff you like, have a lot in common.

My name is Eric Normand. This has been my thought on functional programming. Thank you for listening and rock on.