Why are actions hard to test by definition?

Functional programming divides the world into actions, calculations, and data. Actions are hard to test by definition, and we explore why.

Transcript

Eric Normand: Why are actions hard to test by definition? Hi, my name is Eric Normand. These are my thoughts on functional programming.

I was thinking the other day that people often say that functional code is easier to test, especially when you're talking about pure functions. I totally agree. Pure functions are way easier to test than side effecting code.

I thought, because of the new perspective I'm bringing to functional programming where I've got these three groups, these three domains of things...I thought it would be good to see what that new perspective brings to the discussion.

When I was thinking about it, lo and behold, the definition of actions is exactly what makes them hard to test. Let's go over the definition.

https://twitter.com/ericnormand/status/1066723652828307456

Actions are anything that is bound up in time. This is usually talked about in the functional world as effects or side effects. These are things that are not part of the input or the output of a function, meaning its arguments and its return value. The way I like to think about it is things that depend on when they are run or how many times they're run.

I have a little mnemonic. I say time or times. The time it is run or the number of times it is run. If it depends on when it is run, it's hard to test because it's not being run at the time that you want to test it.

You want to test it before it's time to actually run the thing. You want to test it, even on a different machine. You are testing it on your continuous integration server or on your development machine.

It's definitely not the time that you want to be running that. As an example, if you want to test that your thing is going to send the right email at the right time, that's really hard to do. You have to fudge time. You don't actually want to send the email to your actual users when you're testing. You want to send it to a fake email address, which is hard to set up.

Even harder to set up: we want this thing to go out on the third of the month. It's not the third of the month today. How do you test that on the third of the month it will go out without waiting for the third of the month? You want to test it now.

You have to set it up. You have to fake time. You have to pretend to your software that it's a different day. That is one part of being an action. What makes it hard to test is that it depends on when it's run.
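A minimal sketch of what faking time can look like, in Python. The names `send_monthly_email`, `send_email`, and `today` are made up for illustration and are not from the episode. The assumption is that the action asks a replaceable `today` function for the date, so a test can pretend it is the third and collect emails in a list instead of sending them.

```python
from datetime import date

def send_monthly_email(send_email, today=date.today):
    """Action: asks `today()` for the date, so the outcome depends on when it runs."""
    if today().day == 3:
        send_email("monthly report")
        return True
    return False

# In a test, pretend it is the third of the month and use a list as a fake outbox.
outbox = []
sent = send_monthly_email(outbox.append, today=lambda: date(2018, 12, 3))
print(sent, outbox)  # True ['monthly report']
```

Even in this small case, the test needs a fake clock and a fake outbox before it can check anything.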

If you're talking about something more subtle, like using global mutable state...Sending an email, that is obviously an effect, but mutable state, let's say, reading from a global mutable variable, that is also an action that depends on when it is run.

You're going to have to set up that mutable state in whatever situation, whatever configurations, whatever state it needs to be in to simulate enough of the possible states that you feel like you have a good test. That's hard, too. You have to look at all the code and say, "What states could this be in?"

If you're not thinking about it that way, you're probably going to miss states. You're probably going to only test the easy cases and not consider that if other things are writing to this at different times then we're going to have a problem.

You're also probably not going to test...If you're not doing this right, a thing that you could easily mess up on is the multithreaded case where the thing is being changed as you're reading it. If you're in a multithreaded world and you're using global mutable state...

Let's say you have two variables you're reading from, two global variables. Those things are being written to by other threads while you're running. That could be a problem.

In your test, you would have to test for that. You would have to run enough tests to be sure, meaning enough iterations of your test, to be sure that you're sussing out all of the race conditions so that you can test it in your build server instead of in production.

https://twitter.com/ericnormand/status/1054749912154472454?ref_src=twsrc%5Etfw

That's time. From the definition, I'm literally trying to go for what the definition is telling us. The other thing is an action could depend on how many times it is run.

When I say that, you should be thinking, "Wait a second. I need to run my stuff a lot to be able to test it. I need to be able to run this for different test cases. I need to be able to run it on my development machine as many times as I want. Every time I make a change in the code, I want to rerun my tests. Every time I deploy something I need to rerun the tests."

If it depends on how many times you run it, that's a big conflict there. There are some functions that depend more on the number of times than others. That's one of the things that we talk about when we're talking about, "What does a functional programmer do with all these actions that depend on how many times?"

There's an analysis you can do to figure out, "Hey, does this thing really have a different effect every single time it's run? Is it maybe idempotent, where the first time you call it has an effect but the second time doesn't?"
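To make the difference concrete, here is a sketch in Python. The `Mailbox` class and both functions are hypothetical stand-ins for an external system, not a real email API. One action has a new effect on every call; the other is idempotent.

```python
class Mailbox:
    """Hypothetical stand-in for an external system, used only for illustration."""
    def __init__(self):
        self.sent = []
        self.subscribed = set()

def send_welcome(mailbox, address):
    """Every call has a new effect: three calls mean three emails."""
    mailbox.sent.append(address)

def subscribe(mailbox, address):
    """Idempotent: the first call has an effect, repeating it changes nothing."""
    mailbox.subscribed.add(address)

mailbox = Mailbox()
for _ in range(3):
    send_welcome(mailbox, "test@example.com")
    subscribe(mailbox, "test@example.com")
print(len(mailbox.sent), len(mailbox.subscribed))  # 3 1
```

Rerunning a test suite against `subscribe` is harmless, while rerunning it against `send_welcome` keeps piling up emails.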

There's a bunch of stuff we can do, but still, that argument remains. One of the reasons why non-functional code is hard to test is because it's full of actions, basically.

Those actions depend on when and how many times they're run. If something depends on how many times it's run, it's really hard to test, because in your tests you want to have multiple test cases.

If you're doing example-based testing, you want at least a few for each thing, for each unit. If you're doing property-based testing, then of course you're going to want to generate hundreds, if not thousands, of example cases.

You're constrained here. It's going to make it a lot harder to test. What a lot of people do is they'll set up a little mock environment or something. All of that is a lot of work compared to if you had to test calculations, which, remember, don't depend on when and they don't depend on how many times they are run.

You can run them as many times as you want with different arguments. These are pure functions, basically. You run them as many times as you want with different arguments. No one's going to care. You can test the return value. That's all there is to test. That's nice.
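As a rough illustration in Python, here is a calculation being run as often as a test likes. The function `email_subject` is a made-up example, not something from the episode.

```python
def email_subject(name, day_of_month):
    """Calculation: same arguments, same return value, no effects."""
    return f"Hi {name}, your report for day {day_of_month}"

# Run it a thousand times: nothing is sent and the answer never changes.
for _ in range(1000):
    assert email_subject("Ada", 3) == "Hi Ada, your report for day 3"

# Run it with many different arguments: only the return value needs checking.
for day in range(1, 32):
    assert email_subject("Ada", day).endswith(f"day {day}")
```

There is no clock to fake and no outbox to set up here, because the return value is all there is to check.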

You don't have to have it send an email and then check if you receive the email. How long do you wait because maybe email is slow? Sometimes it doesn't get delivered right away.

Anyway, the other thing is when it is run. It doesn't matter when they're run. That means they could be run on a different machine. They can be run during development. They should always give the same answer.

All right, that was me riffing on this idea that by definition, actions are hard to test. Maybe later we're going to talk about what we can do about that.

I've probably already talked about it in other episodes but not as explicitly in that context of testing. I'm @ericnormand on Twitter. You can also email me at eric@lispcast.com. Thanks so much for watching. I'll see you later.