PurelyFunctional.tv Newsletter 422: Don't write your own language
Design Tip 💡
Don't write your own language
A few weeks ago, I argued that DeepMind should have written their own programming language to do machine learning. There are many advantages to writing your own language for solving your specific problem. And writing a language is easier than it has ever been. However, as programming has become more social over the decades, the advantages you gain are dwarfed by the benefits of a large community and its combined effort. In this essay, we will explore the downsides to writing a custom language.
I often remind myself of how much was done by a small team working on the Dynabook project at Xerox PARC. They bootstrapped a new language, a new programming paradigm, and a new UI paradigm. Over ten years, they iterated on their platform and built a huge system with a small amount of code. Could they have done that using C? I doubt it, and so does Alan Kay. He claims they needed to build the entire stack for any of it to work. Somehow, by starting at a lower level (machine code), but building intermediate layers (Smalltalk) and their own development tooling, they could leapfrog over the current state of the art.
There are many platforms and tools available for all stages of language implementation, including parsing, compiling, and running safely. And machines are bigger than ever; compilation is no longer a memory-intensive task. The barriers to writing a language are lower than ever. So should we write our own language for our product?
I think the answer is "no." Most companies will run out of money before they reach profitability. They don't have the resources to iterate on a language while building a product to sell. They need to sell to customers as quickly as possible. So they should not write their own language. They should choose an existing solution that is as good as they can afford and start building and selling.
However, that leaves the question of whether they should build their own language once they do achieve profitability. The real danger is that, in this fast-moving industry, a company could have a huge advantage just for starting later. The old joke was "the easiest way to make your software twice as fast is to wait 18 months." Well, nowadays it seems like "the easiest way to make your software easier to write is to wait 18 months." The tooling, services, and platforms will be that much easier to build your software with. There is probably some curve of productivity you could chase where the cost of re-writing is dwarfed by the extra speed you get. Maybe you should re-write, just so new companies don't eat your lunch.
The math is clear. Most systems are written in many more lines of code than they really need. We think of the difficulty of writing the tooling for our new language ourselves. But then there are cases like this from Alan Kay:
The [Smalltalk-76] system consisted of about 50 classes described in about 180 pages of source code. This included all of the OS functions, files, printing and other Ethernet services, the window interface, editors, graphics and painting systems, and two new contributions by Larry Tesler, the famous browsers for static methods in the inheritance hierarchy and dynamic contexts for debugging in the runtime environment.
Alan Kay's main research thesis for the past 15 years has been that even Smalltalk was more code than it needed to be. My own intuition says that the industry lacks the skill (through lack of practice) to build the lower layers of the stack that could make the higher layers easier to write. If we had more practice, it wouldn't seem as daunting.
Many people will argue that writing your own language will make it harder to hire. People worry that the cost of finding programmers willing to learn a new language and then teaching them that language will be prohibitively expensive. I don't think that's a valid reason. Every company I've worked for has had so many pieces in their stacks that it would be impossible to find someone who was familiar with all of them.
The real problem is not that the language is unfamiliar to new hires. The reason to worry about using an unknown language is selling. Bootcamps and universities are churning out Python and Java programmers as fast as they can. Those programmers may use your libraries and products. It is the ecosystem of training materials, discussion boards, and educational facilities that is nearly impossible to replicate. It is not so much about the market of potential hires as it is about the market of potential customers. The hiring pool of AI experts is not limited by which language you use. But the pool of non-experts you wish to sell to is.
When Alan Kay and his small team developed Smalltalk, the advantages of starting from scratch outweighed any advantages from the existing communities. This was before the web, before open-source code repositories dominated the industry, and before the giant network of online services existed. They could build it all themselves because it was a walled garden. They didn't need to interoperate, so they vertically integrated the entire stack, from network to hardware to software.
Now, DeepMind could have done a deep, vertical integration. They are building their own hardware and large, complex libraries. But they chose to write in Python and C++. They want their products to be usable by lots of programmers immediately. An interested programmer could buy a Python book and start using DeepMind's libraries. A new language would not have a plethora of books available.
State of Clojure Community Survey 📋
The survey results are in and analyzed! This time by Fogus. Check them out [here](https://clojure.org/news/2021/04/06/state-of-clojure-2021).
Currently recording 🎥
I am building a course on how to build a Clojure web stack from scratch.
To build a course, I often start by writing out a very complete guide to the topic. That guide will be published for free on my site. Newsletter subscribers (that means you!) see it first as an exclusive. I often get a lot of great comments and critiques about it, and problems are easy to fix in text. Once those critiques stop rolling in, I set about prepping for a video recording.
Well, I'm still writing the guide. It's over 6,000 words right now, and I'm not even halfway done. It might make a nice, slim book.
It might be just about ready to start sharing. Stay tuned.
Unexpected book 📘
A few weeks ago I got a signed copy of Learn ClojureScript, by Andrew Meredith, in the mail. I haven't read it yet, but I'm happy that one more book exists.
Quarantine update 😷
I know a lot of people are going through tougher times than I am. If you, for any reason, can't afford my courses, and you think the courses will help you, please hit reply and I will set you up. It's a small gesture I can make, but it might help.
I don't want to shame you or anybody that we should be using this time to work on our skills. The number one priority is your health and safety. I know I haven't been able to work very much, let alone learn some new skill. But if learning Clojure is important to you, and you can't afford it, just hit reply and I'll set you up. Keeping busy can keep us sane.
Stay healthy. Wash your hands. Wear a mask. Take care of loved ones.
Clojure Challenge 🤔
Last issue's challenge
Please do participate in the discussion at the submission links above. It's active and it's a great way to get comments on your code.
This week's challenge
Write a function that takes a chess position as a tuple (such as
[:A 4]). If there were a knight there, where would it be able to move?
Have the function return a collection of all possible moves for that
knight. Assume there are no other pieces on the board. Be sure the
knight doesn't leave the board.
(knight-move [:D 3]) ;=> [[:B 2] [:C 1] [:B 4] [:C 5] [:E 5] [:F 4] [:E 1] [:F 2]]
(knight-move [:A 1]) ;=> [[:B 3] [:C 2]]
Please submit your design process as comments to this gist. Discussion is welcome.
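If you want a starting point, here is one possible sketch of the challenge, not a reference solution. It assumes a standard 8x8 board with files :A through :H and ranks 1 through 8. The helper names `files`, `file->index`, and `knight-offsets` are my own inventions for illustration. The idea is to turn the file keyword into a number, add each of the eight knight offsets, and keep only the squares that stay on the board.

```clojure
;; Files of the board, left to right. Assumes a standard 8x8 board.
(def files [:A :B :C :D :E :F :G :H])

;; Lookup from file keyword to zero-based index, e.g. :A -> 0, :H -> 7.
(def file->index (zipmap files (range)))

;; The eight [file-delta rank-delta] jumps a knight can make.
(def knight-offsets
  [[-2 -1] [-2 1] [-1 -2] [-1 2] [1 -2] [1 2] [2 -1] [2 1]])

(defn knight-move
  "Returns a vector of the squares a knight on [file rank] can reach
  on an otherwise empty board."
  [[file rank]]
  (let [file-idx (file->index file)]
    (vec
      (for [[df dr] knight-offsets
            :let [f (+ file-idx df)
                  r (+ rank dr)]
            ;; Keep only squares that are still on the board.
            :when (and (<= 0 f 7) (<= 1 r 8))]
        [(nth files f) r]))))
```

The moves come back in the order of `knight-offsets`, which may differ from the order in the examples above, so compare results as sets if you are checking against them. This sketch also does no validation of its input; a position that is off the board to begin with is not handled.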