How can you work with a JSON value if you know nothing about it?

I have talked about the difficulty of typing certain JSON values coming from some APIs. The JSON is just very complicated. When I do that, I often get this question: "How can you work with a JSON value if you know nothing about it?" The question is rhetorical. Of course you can't do anything if you know nothing about it. But we do know a ton! We just can't (or it's very difficult to) encode what we know as a type.

Transcript

Eric Normand: How can you work with a JSON value if you know nothing about it?

Hi, my name is Eric Normand and these are my thoughts on functional programming. I wrote this article back in October of 2017, I think it was. Yep, end of October 2017. The article is called Clojure Vs. the Static Typing World. It was a response to one of Rich Hickey's keynotes.

I talked about my experience with Haskell and working with JSON APIs and how difficult it was to do that from Haskell. You can read the article. I'm not going to go into that here. I got a lot of readers of this article. It went viral because it kind of had a trolly nature. I think people saw some trolling in it. I wasn't trying to troll, I was just trying to share my experience.

JSON APIs are difficult to type

One thing I talked about was how difficult it was to type some JSON APIs because the people who are producing that JSON are not static typers. You could be doing a JSON API from a system written in Ruby, a system written in JavaScript, a system written in any number of dynamic languages that don't follow the rules that Haskell wants you to follow.

You could be getting back JSON that has this really complex structure to it. There are rules you have to know that aren't even written down anywhere, or they're kind of implicit in the documentation but you have to read between the lines. Mostly you have to get a response, a piece of JSON, look at it with your human eyes and see what you can do with it.

I'm talking about some values being optional, meaning they are either there or they're not. Some values having different types, sometimes it will be a string or a list of strings. The string will represent the one case. The list of strings is multiple. Maybe even it's missing because it had zero.

That's three different types, really. Sometimes the value of one of the key values in there will determine what else will be in there.

Let's say if we're returning an add response then you'll see a key for this. If you're returning a subtract response, you'll see a key called this other thing. Then that one might have a complex type too that has three possible cases in it, and it's all based on the type.
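A minimal Python sketch of that kind of rule, where the value of one key decides which other key to read. The key names `type`, `sum`, and `difference` are made up for illustration and do not come from any real API.

```python
import json

def result_of(response):
    """Pick the payload key based on the value of the "type" key.

    The keys "type", "sum", and "difference" are hypothetical.
    """
    kind = response.get("type")
    if kind == "add":
        return response.get("sum")
    if kind == "subtract":
        return response.get("difference")
    # Unknown or missing type: nothing here we know how to read.
    return None

add_response = json.loads('{"type": "add", "sum": 7}')
subtract_response = json.loads('{"type": "subtract", "difference": 3}')

print(result_of(add_response))       # 7
print(result_of(subtract_response))  # 3
```

The rule is easy to state as an if statement. Stating it as a type means one case per response kind, and each case may itself have several shapes.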

When you look at the combination of all of these things with deeply nested JSON structures, typing that is just an enormous task, especially since it's not even well documented. There's also the fact that it can change all the time, so you have to deal with the type changing over time. Your client has to change, but you also want to be able to deal with the old JSON that you got.

It's a big task

I'm not saying it's not possible to type it, but the question is how much effort is it going to be and are you going to get it right? You're just continuously complicating this type to make it match the actual JSON that you're getting.

That is something, just from personal experience, that happens. I did not know how to type these things without spending all my time on them.

The type system is very good when you can apply a lot of rationality to the thing. You say, "Well, we're not going to deal with this case because if we make this type so complex, it will be hard to type, so I'm going to keep it simple. Then look, my solution gets simple."

What we're talking about is stuff that might even start very simple and easy to type. Over time, the JSON API just gets added to and it changes. Then they go through a couple of rushed sprints and the thing gets messy. It's like they have some library that makes it easy to deal with in their language, but then it doesn't make it easy to type.

There are just all these things going on on the other side of the API, on the server side, that you're not in control of when you're writing this client, and they don't help when you're trying to type it. Just personal experience.

In JSON everything can be missing

I've found that I would rather just have the raw JSON and be able to write a bunch of if statements. Just ignore the parts that I don't care about, dig into the parts that I do and do my best with them. Otherwise, I would just be typing all the time, writing types, modifying types.
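Here is a rough sketch of that "raw JSON plus if statements" style in Python, using the earlier case of a value that can be a string, a list of strings, or missing. The key name `tags` and the sample data are assumptions for illustration.

```python
import json

def tags_of(item):
    """Return the tags as a list, whatever shape they arrived in."""
    value = item.get("tags")      # "tags" is a hypothetical key
    if value is None:             # key missing (or null): zero tags
        return []
    if isinstance(value, str):    # a bare string: exactly one tag
        return [value]
    if isinstance(value, list):   # a list: keep only the strings
        return [v for v in value if isinstance(v, str)]
    return []                     # any other shape: ignore it

raw = json.loads(
    '[{"tags": "a"}, {"tags": ["a", "b"]}, {"id": 3, "extra": {"x": 1}}]'
)
print([tags_of(item) for item in raw])  # [['a'], ['a', 'b'], []]
```

Everything else in each object, such as `id` and `extra` here, is simply never looked at.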

I made this point. I don't think I made it as clearly as this because I felt like it was kind of tangential. One of the responses to my post was...well, yes. Let me back up. One of the responses was, "Well, isn't that what something like a Maybe is for, where you can have a missing value?"

The thing is, in JSON everything can be missing. You don't know what they consider optional. Do you put a Maybe everywhere? It's not as easy a solution as saying, "That's what Maybe is for."

The solution requires modeling this other API and its behavior, using the tools you have, and Maybe might be one of them. You have to decide: this is something that might be missing, and this is something that is always going to be there, so I'm going to put a Maybe here, and I'm going to leave this as a non-Maybe type.
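To make that decision concrete, here is a small sketch in Python, where `Optional` stands in for Haskell's Maybe. The `User` record and its fields are hypothetical. The code itself is short; the work is in deciding which fields get the `Optional` and which do not.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class User:
    id: int                      # decision: always present
    name: str                    # decision: always present
    email: Optional[str] = None  # decision: may be missing

def parse_user(obj):
    """Build a User from a parsed JSON object.

    A missing required key raises KeyError; a missing optional key becomes None.
    """
    return User(id=obj["id"], name=obj["name"], email=obj.get("email"))

print(parse_user({"id": 1, "name": "Ada"}))
```

If the API one day omits `name`, this parser fails on every such response, and the only fix is to revisit the decision and change the type.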

That is the hard part. It's not how do I represent something that's missing. That's easy. Then I get this other thing. This is a really key point because I'm going to answer the question. I get this question all the time and it really speaks to the kind of thinking that I think is what I'm trying to point at as the problem.

Static typing is the future

Well, let me say this before I get into this. I think static typing is the future. I think it's going to be very important in the future. We should use as much processing power as we can to guarantee our code is correct. I like static typing. I'm very happy that I learned Haskell because I learned static typing in a very good, supportive environment.

https://twitter.com/ericnormand/status/1081279865675046914

I'm thankful all the time because it helps my Clojure programming. One day, I will use static typing more than I do now. I like static typing. I want to heal this divide so we can talk, right?

I feel like we on the Clojure side have something to contribute and we need to work together. We need to learn more about static types, and the static-type folks need to learn more about what we're doing on the dynamic side and why we choose dynamic types for some things.

https://twitter.com/ericnormand/status/1044602850435846144?ref_src=twsrc%5Etfw

I know a lot about the JSON value

Here's one of the questions that I get a lot that really shows the issue. I'm going to address it directly. How can you work with a JSON value if you know nothing about it? I was saying, "It's hard to type these JSON values coming from random APIs." They say, "Well, how do you even work with something if you don't know anything about it?"

https://twitter.com/ericnormand/status/1070347478556385280

Here's the answer. I know a lot about it. I just said I know that this thing might be missing. I know that when this key has this value, I have to look at this other key, and that when it has some other value, I have to look somewhere else. The problem is, that's not easily typeable.

It's simply not easily typeable. I know a lot about it. It's just that the type system that I have in Haskell doesn't let me express those things very well. That is the issue. This question of, "I know nothing about it," this is the problem. You see types as all or nothing. "If I can't type it, I know nothing about it." That's not true.

If you can't type it, there might be a ton you know about it that's just not easily encoded in your type system. This kind of rhetoric of, "If you can't type it, then it's not worth anything," is not helping you as a static typer. I'm addressing the person who [laughs] asked this directly. This is not helping you.

Clojure actually provides a model

There are things that you know that cannot be expressed in the type system. There are things that are not easily expressible, or maybe they are expressible but the type would be 14 pages long. These are real concerns. We can't just throw something out. We can't just say, "I'm not going to use that API because I can't figure out how to type it." You have to work with it.

This is one of the things I was trying to get at in my article, that Clojure actually provides a model. That model is simply, "Just give me the JSON. I'll deal with it. I just need to be able to parse it as JSON and I'll do my best with it." That's really the model that Clojure gives you. It's also pervasive in the community. When another Clojure programmer does something, that's what they do.

https://twitter.com/ericnormand/status/1040677583443111936?ref_src=twsrc%5Etfw

This could've changed since I've worked in Haskell, but when I was there, my experience was people spent a lot of time writing out these complex types. That was the answer. It's great. It's like they're excited about having a type system now where they can actually express these things. That's what it was. They saw the positive in it.

JSON is typeable

This is what I was trying to get at in my article. I think that it's a little ambitious to use the type system to represent an important part of the JSON API ecosystem out there. Some of it is going to be easily typeable.

Then I continued on, saying what happens once you have it just as JSON, because JSON itself is typeable. It's an ADT. It has all these cases: it's a string, or a number, or a null, or a Boolean, etc. Once you have it like that, Haskell doesn't make it easy to work with these things, like digging into it and pulling out a value at a certain keypath.
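As an illustration of both halves of that point, here is a Python sketch: the cases of a JSON value written as a type alias, and a small helper that pulls out a value at a keypath. The names `case_of` and `get_in` and the sample document are assumptions for this example, not part of any library.

```python
import json
from typing import Dict, List, Union

# The cases of a JSON value, written as a recursive type alias.
JSON = Union[None, bool, int, float, str, List["JSON"], Dict[str, "JSON"]]

def case_of(value):
    """Name which JSON case a parsed value falls into."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check before number: bool is an int in Python
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError("not a JSON value")

def get_in(value, path, default=None):
    """Follow a keypath through nested objects and arrays."""
    for key in path:
        if isinstance(value, dict) and key in value:
            value = value[key]
        elif isinstance(value, list) and isinstance(key, int) and 0 <= key < len(value):
            value = value[key]
        else:
            return default  # some step of the path was not there
    return value

doc = json.loads('{"user": {"emails": ["a@example.com"], "active": true}}')
print(case_of(get_in(doc, ["user", "active"])))  # boolean
print(get_in(doc, ["user", "emails", 0]))        # a@example.com
print(get_in(doc, ["user", "phone"]))            # None
```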

Now, of course, there's lenses. I know. Those were very, very new when I was working in Haskell, and we did not adopt them in our code base. What I've seen is that they could work. I don't know. I don't have enough experience with them.

I don't really like the error messages they give you when you've got something wrong, but it seems like, in all likelihood, they've really solved the problem. I need to look into them a lot more before I make a judgment on them.

Dealing with JSON APIs is a challenge for Haskell

At the time, it was very difficult working with this because each keypath returned a Maybe. Each step on the path returned a Maybe, and then you had to deal with the question, "Well, what was missing?" I'd just get a Nothing back.
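To show the difference between "nothing came back" and "here is what was missing", here is a hypothetical Python helper that reports which step of the keypath failed. This is only a sketch of the idea, not a description of what any Haskell library does.

```python
def get_in_or_explain(value, path):
    """Return (value, None) on success, or (None, message) naming the failed step."""
    for i, key in enumerate(path):
        if isinstance(value, dict) and key in value:
            value = value[key]
        else:
            walked = path[:i]  # the part of the path that did exist
            return None, f"no key {key!r} after path {walked!r}"
    return value, None

doc = {"order": {"customer": {"name": "Ada"}}}

print(get_in_or_explain(doc, ["order", "customer", "name"]))
# ('Ada', None)
print(get_in_or_explain(doc, ["order", "shipping", "city"]))
# (None, "no key 'shipping' after path ['order']")
```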

It just was not as easy as working with it in Clojure, where I have this JSON that I know some stuff about, and I'm just going to try that stuff on it. I didn't mean to create any kind of rift between static and dynamic. It was simply a challenge: Haskell needs to be able to deal with these JSON APIs, from a language standpoint and also from a community standpoint.

It's been almost a year since that article came out, and maybe Haskell has even changed since then. I don't know. I should actually go and explore some more, I guess.

Please, if you know that Haskell is better than it was then for this kind of thing, let me know. I'm @ericnormand on Twitter. Thank you so much.