PurelyFunctional.tv Newsletter 354: Tip: beware how many threads you start
Your friendly reminder that if you aren't reading Eric's newsletter, you are missing out…
Lots of great content in the latest newsletter! Really glad I subscribed. Thanks, Eric, for your work.
Eric's newsletter is so simply great. Love it!
Issue 354 - December 02, 2019
Property-Based Testing Course Launch
Folks, if you are reading this, there might just be time to get 50% off of the three Property-Based Testing courses before the price goes up forever.
I was planning to end the sale today, but I didn't have great descriptions for the courses before (and holidays got in the way), so I've decided to extend it for a couple more days. Wednesday is the absolute end of this sale.
The three courses for Property-Based Testing with test.check are:
- Beginning
  - learn to let the computer write thousands of tests for you
- Intermediate
  - gain more confidence in stateful code
- Advanced
  - learn to test the untestable (parallel and distributed systems)
Each course builds on the last one. Along the way, you learn to build custom generators, test at four different times in development, and integrate with Clojure Spec. Wednesday morning I will wake up and double the prices of these courses. Buy now or forever pay more.
Clojure Tip
beware how many threads you start
It's really easy in Clojure to start new threads. The JVM uses OS threads, so each one you create is about as efficient as a thread could be. However, how many threads can your OS handle?
A common way to start too many threads is to start a new thread per server request. For example, every time an HTTP request comes in, you spawn a thread to calculate part of the answer. If you get thousands of requests in a short time period, you'll create thousands of threads all running at the same time.
Consequences
The least problematic consequence of creating too many threads is that the JVM throws an OutOfMemoryError. This is meant to be an error that you can't recover from.
I've also had systems that did worse than this. If you continue to create more threads despite the OutOfMemoryError, you can crash your entire operating system, forcing a restart. That has happened to me before, and in fact happened while I was writing this newsletter.
;; do not run this
(dotimes [_ 400]
  (let [p (promise)
        f (reduce (fn [f _]
                    (future
                      @f
                      (Thread/sleep 100) ;;; @A
                      @f))
                  p (range 100))]
    (deliver p 1)))
In the code above, we're creating a chain of 100 futures, each waiting on the one before it. And we run that 400 times in a loop. Without the (Thread/sleep 100) on the line marked with @A, it runs fine and only a couple hundred threads get created. Normally, futures run in threads from a thread pool, so the threads get reused. But if they take time (like 100 ms), they can't be reused fast enough, and thousands of threads get created.
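Here is a much smaller sketch that should be safe to run. It starts only 50 futures, has each one block for 200 ms, and then asks the pool behind `future` how many threads it holds. It assumes the default executor (`clojure.lang.Agent/soloExecutor`) is a `java.util.concurrent.ThreadPoolExecutor`, which holds for a stock Clojure setup but is not guaranteed if the executor has been swapped out. The names `future-pool-size`, `blocked`, and `size-while-blocked` are made up for this example.

```clojure
(import '(java.util.concurrent ThreadPoolExecutor))

(defn future-pool-size
  "Number of threads currently in the pool that backs `future`.
  Assumption: the default executor is a ThreadPoolExecutor."
  []
  (let [^ThreadPoolExecutor pool clojure.lang.Agent/soloExecutor]
    (.getPoolSize pool)))

;; 50 futures that each hold on to a thread for 200 ms.
(def blocked
  (mapv (fn [_] (future (Thread/sleep 200) :done))
        (range 50)))

;; Measured while the futures are still sleeping, so no thread is free for reuse.
(def size-while-blocked (future-pool-size))

;; Wait for all of them to finish before printing.
(run! deref blocked)
(println "threads in the future pool while blocked:" size-while-blocked)
```

Because none of the futures finishes before the next one starts, the pool cannot reuse a thread and should report at least 50. In a standalone script, call `(shutdown-agents)` at the end so the JVM exits promptly.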
What's more, the threads are getting created inside of other futures, so the OutOfMemoryError is not stopping the outer loop. Threads keep getting created, and the OS decides to clean house. Forced reboot.
I have also seen cases where the threads do get created, but there are so many that most of the time is spent on scheduling. As far as my code was concerned, nothing was getting done. The threads were blocking and unblocking on each other, and all of that overhead dominated the small amount of work I asked them to do.
What to use instead
Just to be clear, futures are okay to use for very short tasks, as long as they aren't chained together much. If they're chained, you're blocking one thread while it waits for another.
If you need longer-running things to run in parallel, you should use an ExecutorService. ExecutorServices manage a thread pool and a queue of tasks to feed to the pool. They handle problems like threads dying (by restarting them) and other things you would normally have to handle on your own.
It's beyond the scope of this email to explain them, but I have a short tutorial here.
By using a fixed-size thread pool, you cap the number of threads: extra tasks wait in the queue instead of starting new threads, so the work gets done about as fast as your hardware allows without creating unbounded threads.
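As a minimal sketch of that idea (not a replacement for the tutorial), here is one way to run a batch of tasks on a fixed-size pool through Java interop. `run-bounded` and `slow-square` are hypothetical names for this example, and the pool size of 4 is arbitrary. Pick a size that suits your hardware and workload.

```clojure
(import '(java.util.concurrent Callable Executors ExecutorService Future))

;; Hypothetical stand-in for real work: pauses briefly, then squares n.
(defn slow-square [n]
  (Thread/sleep 10)
  (* n n))

(defn run-bounded
  "Runs (f x) for every x in xs on a pool of exactly n-threads threads.
  Returns the results in input order."
  [n-threads f xs]
  (let [^ExecutorService pool (Executors/newFixedThreadPool n-threads)]
    (try
      ;; Every task is queued right away, but at most n-threads run at once.
      (let [futs (mapv (fn [x]
                         (let [^Callable task (fn [] (f x))]
                           (.submit pool task)))
                       xs)]
        ;; .get blocks until that task's result is ready.
        (mapv (fn [^Future fut] (.get fut)) futs))
      (finally
        ;; Lets already-queued tasks finish, then releases the threads.
        (.shutdown pool)))))

(run-bounded 4 slow-square (range 8))
;; => [0 1 4 9 16 25 36 49]
```

However many items are in `xs`, this never uses more than `n-threads` threads. The rest of the tasks simply wait in the pool's queue. In a long-running server you would more likely create the pool once and share it rather than build one per call.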
Awesome book
Elixir in Action (affiliate link)
It's important to learn other languages, and it has been a while since I learned something new. So I read this book about Elixir. The book is excellent and gives a good example of what Elixir and OTP give you to make systems more reliable. I was impressed by what the Erlang VM offers as well as by the presentation in the book. It takes a single, simple example and, chapter by chapter, makes it more and more robust.
First Annual PurelyFunctional.tv Survey!
The first annual PurelyFunctional.tv Survey is still open. Your answer to this quick survey will help me understand how to improve my videos and help you master Clojure faster and more deeply.
There are only four questions. If you've watched any of my video content, please take a few minutes to fill this out. I appreciate any answer you can give. And a big "thank you" to everyone who has already submitted an answer.
The survey will run until the end of the week.
Clojure Tool
Emacs requires a bit of setup before it is really competitive as an editor in today's world. This starter package is minimal but gives you what you need to get started editing Clojure with Emacs. It also has nice installation instructions.
Book update
I have approved the proofs for Grokking Simplicity Chapter 5. Expect it out any day now!
I'm currently working on Chapter 7, which is all about Stratified Design.
You can buy the book and use the coupon code TSSIMPLICITY for 50% off.
Clojure Challenge
Last week's challenge
The challenge in Issue 353 was to try to improve your editing experience in some way. I didn't get any submissions, but I hope people did find some improvements.
This week's challenge
Levenshtein distance
The Levenshtein distance measures the edit distance between two strings. That is, how many single-character insertions, deletions, or substitutions you need to turn one string into the other. For example, the distance between "kitten" and "sitting" is 3. The algorithm has a nice recursive definition on Wikipedia, which makes it easy to write in Clojure.
Your goal is to implement the Levenshtein distance as a function of two strings. It should return the edit distance.
Bonus: use memoization to make it more efficient, or use an iterative method.
As usual, please reply to this email and let me know what you tried. I'll collect them up and share them in the next issue. If you don't want me to share your submission, let me know.
Rock on!
Eric Normand