What makes some APIs become DSLs?

What causes an API to cross the line into becoming a DSL? Is it really an 'I'll know it when I see it' situation? I've been searching for an answer for years, and I think I found it in a paper I read recently for this podcast: Lisp: A Language for Stratified Design. In this episode, we go over the main factor that makes an API a DSL: the closure property.

Transcript

What makes some APIs become DSLs? By the end of this episode, we'll have a good explanation of the power of DSLs and the thin line that divides them from ordinary APIs. Hello, my name is Eric Normand, and I help people thrive with functional programming.

I was reading the stratified design paper a few episodes ago, and I've also asked myself this for a long time: where does the power of a domain-specific language come from? Why does it seem to have some kind of transcendent property? It gives you this huge amount of leverage. You go from a low level to a high level very quickly. There's a lot of discussion online, meaning blog posts and essays, from people trying to come to grips with the difference between a normal API and a domain-specific language, or DSL. After reading this paper, I think I have a really good answer. I finally figured it out. So I'm going to share my idea here, and I think it's a good one. Of course, in all humility, I know that other people have thought a lot more about this than I have, and a lot smarter people, so please point me to other explanations of this. But I think this is it.

Okay, so let's imagine a normal API, and let's say it's for drawing graphics. This API might have a function or method for setting the color of a pixel, or drawing a square, or drawing a circle, or changing the color, those kinds of operations. You could look at the docs and feel like, okay, this has a logic to it, and I can kind of figure out how to draw whatever I need to draw. But it doesn't feel like a domain-specific language. It just feels like I have these commands I can run, and that's it. I just have to figure out how to turn them into a sequence of commands that will draw what I want to draw. But a DSL feels a lot different.
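
To make the "sequence of commands" style concrete, here is a minimal sketch of what such a drawing API might look like. The `Canvas` class and its method names are hypothetical, invented only for illustration; they are not taken from the episode or from any real library. The thing to notice is that every command mutates the canvas and returns nothing, so the only way to build a bigger drawing is to add more lines.

```python
class Canvas:
    """Hypothetical command-style drawing API: each command mutates the grid and returns None."""

    def __init__(self, width: int, height: int) -> None:
        self.width = width
        self.height = height
        self.color = "#"  # current drawing "color" is just a character here
        self.pixels = [[" "] * width for _ in range(height)]

    def set_color(self, color: str) -> None:
        self.color = color

    def set_pixel(self, x: int, y: int) -> None:
        # Ignore pixels that fall outside the canvas.
        if 0 <= x < self.width and 0 <= y < self.height:
            self.pixels[y][x] = self.color

    def draw_square(self, x: int, y: int, size: int) -> None:
        for dy in range(size):
            for dx in range(size):
                self.set_pixel(x + dx, y + dy)

    def render(self) -> str:
        return "\n".join("".join(row) for row in self.pixels)


# The "program" is a flat sequence of commands, one after another.
canvas = Canvas(6, 3)
canvas.set_color("#")
canvas.draw_square(0, 0, 2)
canvas.set_color("o")
canvas.draw_square(3, 1, 2)
print(canvas.render())
```

Because `draw_square` returns `None`, there is nothing to pass into another command. The commands can be sequenced, but not nested.
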
A DSL would be something that lets you describe the picture you want drawn, not as a sequence of actions but in some other way that feels much more expressive. I don't want to use the word declarative. I have a whole episode on why I don't like that word. But you just feel like it's much more expressive.

Now, in the paper, Lisp: A Language for Stratified Design, the authors talk about two things that I think are very important for what causes this difference. This DSL is an API. It is a way to interface with this code, but it somehow feels more expressive. Why is that? There's a line that you cross at some point. So I'm going to back up and describe these two properties of stratified design languages that cause an API to cross that line. People have talked about this line as some intuitive line that you cross, an 'I know it when I see it' kind of thing. This paper does a much better job of giving the actual principles behind it.

Okay, so we have to look at what a language is. In Structure and Interpretation of Computer Programs, and it is repeated in this paper, the authors define a language as consisting of three parts. There are the primitives: this is the stuff that the language gives you. There are the means of combination: that's taking things in the language and putting them together to form new things. This might be writing a function or building up something else, like a class. And then there's naming, which is where you give something a name so you can refer to it later. Naming can also give it a higher meaning.

They define stratified design using these three terms. Let's say you have three layers. Usually you have more, but let's just focus on three layers.
So you have layer A, and it defines some things. It doesn't matter what; there are some things defined in that layer. Then on top of that you've got layer B, and it uses the things that are defined in A, right below it, as primitives. So that's one of the three things that a language needs: it's got the primitives. Then it defines new means of combination of those primitives, so that's number two. Then it names them, so that's three. And then layer C can use stuff from layer B as primitives. So each layer is actually a whole language. That's what the paper is trying to say. Each layer is a language built on top of the layer below it. It's like a stack of languages.

Another thing they talk about in the paper is what's called the closure property. That's closure with an S. I know I talk about Clojure with a J a lot, the programming language. This is closure with an S, the actual word closure. What this means is that you have a function that returns the same type as it takes as an argument. For instance, addition takes two numbers and returns a number. Multiplication takes two numbers and returns a number. String concatenation takes two strings and returns a string. All of these operations have the closure property.

What that means is that you can arbitrarily nest the expressions. You can put a plus inside of another plus inside of another plus inside of a times, because times also participates in this numbers-to-numbers thing. The reason addition and multiplication feel so expressive, like I can do anything with these and express any combination I want, is the closure property. It allows them to be nested, and so you can have recursive expressions. Okay.
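
The layering and the closure property described above can be sketched with a toy picture language. This is only an illustration under stated assumptions: a "picture" is assumed to be a list of text rows, and the names `rect`, `beside`, `above`, `row_of`, and `grid` are hypothetical, chosen for this sketch. The combinators take pictures and return pictures, which is what lets them nest to any depth.

```python
from typing import List

Picture = List[str]  # assumption: a picture is a list of equal-width text rows


def _width(p: Picture) -> int:
    return len(p[0]) if p else 0


def _pad(p: Picture, width: int, height: int) -> Picture:
    """Pad a picture with spaces to the given width and height."""
    rows = [row.ljust(width) for row in p]
    rows += [" " * width] * (height - len(rows))
    return rows


# Layer A: a primitive.
def rect(width: int, height: int, ch: str = "#") -> Picture:
    return [ch * width for _ in range(height)]


# Layer B: means of combination. Pictures in, picture out (closure property).
def beside(left: Picture, right: Picture) -> Picture:
    height = max(len(left), len(right))
    l = _pad(left, _width(left), height)
    r = _pad(right, _width(right), height)
    return [a + b for a, b in zip(l, r)]


def above(top: Picture, bottom: Picture) -> Picture:
    width = max(_width(top), _width(bottom))
    return _pad(top, width, len(top)) + _pad(bottom, width, len(bottom))


# Layer C: named combinations that use layer B as their primitives.
def row_of(p: Picture, n: int) -> Picture:
    result = p
    for _ in range(n - 1):
        result = beside(result, p)
    return result


def grid(p: Picture, cols: int, rows: int) -> Picture:
    line = row_of(p, cols)
    result = line
    for _ in range(rows - 1):
        result = above(result, line)
    return result


# Nesting: combinators inside combinators, because every result is a Picture.
tile = above(beside(rect(2, 1, "#"), rect(2, 1, ".")),
             beside(rect(2, 1, "."), rect(2, 1, "#")))
print("\n".join(grid(tile, 2, 1)))
```

Each layer only uses what the layer below it defines, and since `tile` and the result of `grid` are themselves pictures, they can be fed straight back into `beside` or `above`.
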
What this means is that you have an infinite number of possible expressions. Once you have recursion, your expressions don't have to just get longer; they can get deeper to express what you want. And this is where I think the line is. When you have a DSL, you've actually crossed this line of having recursive expressions.

Some people might say that the standard string API has crossed that line. It's an API where you can concatenate strings, get substrings, search for a string within another string given a regular expression, repeat strings, or trim them. Notice that all of these take a string and return a string. So one thing you see is that these can be chained arbitrarily, and that's very expressive. You can nest them arbitrarily, so I can take this weird expression of strings and this really complicated expression of strings, concatenate them together, and get a new string. The reason it has crossed the line is that you're always dealing with strings. Once you have a string, you have an infinite number of things you can say. This is different from a case where, say, it didn't return the string, it just modified the string. Then the only thing you could do is sequence the operations. You could get a longer and longer sequence of commands, but you couldn't get that infinite nesting, that recursive property, which allows for these really complex and intricate ways of combining them together.

Well, I think that's it. I think that's the line right there. I don't know what else to say about it. I can just give a bunch of other examples. An example of something that we chain a lot in functional programming is sequence operations: map, filter, reduce, etc.
Notice they all take a sequence and return a new sequence, except for reduce. Reduce kind of ends it. At the end of the chain you might do a reduce to sum the things up or average them or what have you, but then you're done. There's no more to do. So reduce does not have that closure property. But map, filter, and all the other ones, like the opposite of filter, which is remove, or something like repeat or cycle, all these sequence operations that take a sequence and return a new sequence have this closure property. That's what allows them to be richly nested and chained in basically infinite ways, and it's what makes it feel like a DSL. It feels very expressive.

This paper was written so long ago. It's crazy that in all these online discussions this never comes up. People just talk about it like 'I know it when I see it,' but I really think this is what it is. It's all about the closure property and nesting.

Well, I think I'm going to end it there, so let me give a quick summary. I've heard that people really like my summaries. I've been thinking about this question for a long time: what is the difference between an API and a DSL? I think the paper, Lisp: A Language for Stratified Design, which I read and reread carefully for this podcast, finally gave me this insight into what it is. Obviously it's not my idea, but I'm applying it to this question that was in my head, and this is the answer.
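
The point above about map, filter, and reduce can be sketched in a few lines. This uses Python's built-in `map` and `filter` and `functools.reduce` as stand-ins for the sequence operations mentioned in the episode; the data and the lambdas are made up for illustration.

```python
from functools import reduce

numbers = range(1, 11)

# map and filter each take a sequence (an iterable) and return one,
# so they can be nested in any order and to any depth.
evens_squared = map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers))
small_ones = filter(lambda n: n < 50, evens_squared)

# Here reduce takes a sequence and returns a single number,
# so nothing sequence-shaped is left to chain onto: it ends the chain.
total = reduce(lambda acc, n: acc + n, small_ones, 0)
print(total)  # 4 + 16 + 36 = 56
```

Strictly speaking, a reduce can return a sequence if the combining function builds one. The episode's point is about the common use, such as summing or averaging, where the result is no longer a sequence.
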
So in the paper there are three things that define a language. There are the primitives. There are the means of combining things in the language, meaning you can combine primitives, but you can also combine the new things you've created by combining the primitives. And then there's naming. So that's three things: primitives, means of combining, and naming. Naming just lets you put a name on something, give it a little human-level meaning, and refer to it later by name. Stratified design is where you build a language in each layer on top of the layer below it. And then the closure property is where a function takes a certain type of argument and returns that same type. If you have a collection of these, they can all participate in nesting with each other, and if you get enough of them, you get this nice combinatorial explosion of expressivity.

One thing that I didn't mention: in a previous episode I talked about what complexity is. In my response to Out of the Tar Pit, I concluded that complexity is the non-linearity of a system. It's not that complexity is non-linear; the complexity is the non-linearity. Take something like the number of possible interleavings of two threads. The longer the threads are, the more possible interleavings there are. It's factorial, so it's non-linear, and it just explodes, and it makes it much harder to understand what the program might do.

Notice we've got an opposite thing here. We've got a complexity, meaning a non-linear way of combining these things. We don't just add to the end; that would be linear. The non-linearity is that we can nest deeply, and now we've got this combinatorial explosion of the kinds of expressions we can write. Because of the closure property, every new function you add into that ecosystem, every operation that takes a string and returns a string,
multiplies the different ways that you can combine them. So this is a positive complexity. It's a positive non-linearity. It opens up doors that the programmer can use without adding that much difficulty to the system. Notice you add linearly, you add one more operation, but it is multiplying, maybe even factorially, the number of possible expressions that you can write. So you've got this linear increase in code size, one more function, but this huge multiplication of the number of possible expressions.

I'm kind of compiling this up into a theory of complexity: good types of complexity and bad types of complexity, complexities that help the programmer and complexities that hurt the programmer. I'm not really sure where all of this is going. But if you have a complex domain that seems super open, such as graphics, then perhaps this is the way to tap into it. There's an infinite number of paintings. There's no limit to the number of paintings. Somehow, by defining a language, a DSL, instead of just an API, you are better able to tap into that infinity. You're never going to be able to express everything with any particular DSL. But imagine having to use an API where the only way you could get to another type of drawing would be to add to the end of the program, or add somewhere in there, so you just increase the length. There's no nesting. You have this linear increase in the number of lines of code, which gives you access to linearly more paintings, because you can choose what that line of code does. It just gives you a little bit more choice. Versus when you can arbitrarily nest them, this gives you a much richer set of choices. A linear increase in your code size will open up a huge array of the types of paintings you can do. Obviously, you can't get all
of them, because there are going to be certain assumptions baked in. For instance, you might only be able to draw rectangles, and the paintings you want might not even have rectangles. There are paintings with zero rectangles in them, and you can't paint those, but you can paint an infinite number of paintings with just rectangles. You would have to come up with a different DSL to let you paint something else. But you are better able to spread out and have a wider range of possibilities with only that linear increase in length.

Okay, this is pretty exciting. I think that there's something there, and I want to keep following this vein. If you want to listen in as I ideate, as I figure this stuff out, you can go to lispcast.com/podcast, and you'll find links to subscribe and reach me on social media. Part of the reason I'm doing this is that I like to be part of the discussion and reach more people, so that if this does strike a nerve with you, you send me an email or something and we get into a real discussion. It's still me. It's still people and people. That's what social media means: it's a media made out of people. I don't like to look at it as broadcast only. It's definitely good to broadcast, to get the idea out there and have contact with a lot of people. But the reason you have contact with a lot of people is to make one-on-one connections easier and more efficient. I can't contact a million people one at a time. That's linear: a linear increase in my time expenditure gives me a linear increase in the number of people I can contact, and they might not be interested in this topic, so it's a big waste of time. But with a broadcast, it's a linear increase in the time to record this and post it, and then a potentially exponential increase in the surface area of the people I touch and can then initiate conversations with. So please
look at it that way. This is a call to you, dear listener. If you are hearing this and you like this topic, if you are curious, please reach out. You'll find links to email me or get in touch with me on Twitter right on that site, lispcast.com/podcast. Okay, my name is Eric Normand, and this has been my thought on functional programming. Thanks for listening, and rock on.