Virtual Threads in Clojure

I gave a 20-minute lightning talk at the Madison Clojure Meetup. It was also broadcast online. I talked about the new virtual threads feature on the JVM and how we can access it in Clojure.

Download slides

Transcript

Hello, everyone. I don't know anything about Clojure, so I'm going to talk about something I do know about instead. My name is Sainte. I'm currently a developer at Singlewire, but before that I was briefly an active contributor to the accessibility module of Chromium. Something I want to talk about is the ways that CSS can actually affect the behavior of screen readers and their auditory output. But first, let's make sure we're all on the same page: what is a screen reader? The way I would define it is that it's a type of assistive technology that blind or low-vision people can use to interact with digital interfaces, right? That might be a computer, a phone, maybe a digital kiosk at a hospital, things like that. But it can't just be a naive reading out of text; you need to somehow communicate the semantics. For example, here I have an image of the WebAIM home page. If I took a screen reader over this, it wouldn't just read out the text. It would say things like: hey, we have a navigation bar, we have a search input field, we have a heading that says "web accessibility in mind," so on, so forth. And if you do some googling to figure out how this works at a very, very high level, you might see a diagram that looks something like this, at least in the context of a browser. A browser takes HTML, generates a DOM tree, and it implements something called the accessibility API. To do that, it creates a data structure called the accessibility tree. And if you look at what the accessibility tree looks like, you might find a diagram like this, where it almost looks like a version of the DOM with additional metadata on it. So here, for example, we have a main element that contains a form element that contains some form controls. And maybe they have things like: what is the form control's name, i.e. what should I announce it as? What is its state?
And maybe what kind of programmatic actions you can take upon it, right? And so my claim today is that CSS properties, such as the color of text, or CSS transforms like skewing or translation, affect the behavior of this data structure. If you've been in the accessibility community for a while, you probably already know some examples of how CSS affects accessibility. One easy example is hiding elements with CSS display: none or visibility: hidden. The idea is that if something is hidden from sighted users, then screen reader users probably shouldn't be able to interact with it either. It would probably be a nightmare if you could interact with everything hidden on the screen, right? Another example is that CSS pseudo-element content is, quote unquote, "considered" (whatever that means) when generating this data structure. And if you don't know what a pseudo-element is, it's basically a fancy way of styling an element. So for example, let's say we have a paragraph that just says "This is a paragraph." Then you can add a style that says: I'm going to add a ::before pseudo-element, it's going to have content "Hello World," and I'm going to style it red. A browser that parses CSS (so, not text-only browsers), if it's conforming, will output something like "Hello World This is a paragraph," where "Hello World" is red. You can probably quickly see why a screen reader should be able to read content like this; otherwise there's just a bunch of text that users cannot interact with, which is potentially problematic. But it gets a little weirder than this. The things I mentioned are actually in the specifications, but there's a lot that is unspecified. For example, there's a problem on the internet where lots and lots of web pages use table elements for layout, presumably because they predate things like CSS grid and Flexbox. So now we have a problem.
We have a bunch of web pages that use HTML table elements for things like landing pages, things that aren't really tabular at all. If we exposed these as tables in the accessibility tree, that could become very annoying very quickly. So browsers have to implement some kind of heuristic to try and filter out, quote unquote, "fake tables." What kind of heuristics do browsers use? First, let's look at an example. Say we have a simple HTML table, five rows, two columns, and each table cell has the word "test" in it. Any conforming browser will render just that: a table of five rows and two columns, each cell containing "test." And if you inspect the accessibility tree, it turns out no browser considers this a real table, the idea being that it doesn't look very tabular. If we do something very simple, like making sure at least half of the table cells have a border, all browsers will consider this a real table. If we do something a little sillier, where half of the table cells have a border only on the right side of the cell, Chrome and even Safari will say this is a real table, but Firefox will not. In case you're curious, there are more heuristics than just these, and Chrome and Safari agree on a lot of them because the code for this was originally written in 2008, and Blink was forked from WebKit after that. A lot of the comments are exactly the same, interestingly. So it's a scary piece of code to touch. But there's more than just this. WebKit also tries to distinguish real lists from fake lists. How does it do this? Real lists have, quote unquote, "at least one real list item." What this basically means is that a list item is real if it has some kind of bullet marker to the left of it, something that visually shows that it's a list item.
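The border heuristic described above can be sketched as a toy function. This is a plain-Python illustration of the "at least half the cells have a border" rule, not the actual Chromium or WebKit code, and the cell representation is made up for the example:

```python
# Toy sketch of a "fake table" heuristic, loosely modeled on the talk's
# description; NOT the actual Chromium/WebKit code. A cell is a dict
# like {"text": "test", "has_border": True}.

def looks_like_real_table(cells):
    """Treat a table as 'real' if at least half of its cells have a border."""
    if not cells:
        return False
    bordered = sum(1 for cell in cells if cell.get("has_border"))
    return bordered * 2 >= len(cells)

plain = [{"text": "test"} for _ in range(10)]
half_bordered = ([{"text": "test", "has_border": True} for _ in range(5)]
                 + [{"text": "test"} for _ in range(5)])

print(looks_like_real_table(plain))          # False: filtered out as "fake"
print(looks_like_real_table(half_bordered))  # True: half the cells have borders
```

The real heuristics are much more involved (and disagree between engines, as the right-side-border example shows), but the shape is the same: inspect computed style, then decide whether to expose table semantics.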
For example, say we have an HTML list with three items, Apple, Orange, Banana, and we add some styling to remove the bullet points: list-style: none. Any conformant browser will just render the list, Apple, Orange, Banana, without the bullet points, because the author wanted to remove them. And it turns out that WebKit will not expose this as a list; there are no list semantics here. This is not something Chromium and Firefox do; only WebKit does it. And again, the heuristics are a little more complicated than this. For example, if you add a pseudo-element that tries to act as a list marker, then the list semantics will be restored. But that's the idea: there are no bullet points, so it's not a list. I was a little curious why this happened. James Craig, who's an active contributor to the ARIA specification, had this to say on Twitter: basically, it turns out that a lot of web developers were so keen on semantic markup that they tried to use it for everything, including when it was inappropriate. And so screen reader users were like, damn, I'm hearing way too many lists, and I don't want to hear it, it's so annoying. So Safari, or WebKit rather, decided to do right by its users and filter out these excessive lists. Another example: font attributes. Screen readers care about things like: is text bolded? Is text italicized? What is the font family? What is the font color? At least on desktop platforms; I'm not 100% sure this happens on mobile platforms, but it 100% happens on desktop platforms. So why is this needed? I did a little googling, and it turns out that when this was first being implemented, around 2008, another developer had the same question I did.
If we look at a discussion on the Mozilla bug tracker from when Firefox was first implementing this, Joanmarie Diggs, who has been a big contributor to Linux accessibility since the mid '90s, had this to say: maybe you're a screen reader user and you care about the way screen-read text is formatted. Maybe you're working with a WYSIWYG (what-you-see-is-what-you-get) editor and you want to make sure an article looks the way you want it; in that case you care about the formatting. But formatting can also be a very good way to navigate a long document that doesn't have perfect semantic HTML. For example, if a heading isn't semantically a heading in the HTML, you can instead say: OK, I'm going to look for text with a much larger font size than the other text, maybe bolded, and those are probably headings. That way I can understand the structure of the page in an alternative way. One final example: bounding boxes. People at Singlewire have probably already noticed this, but when I defined the term screen reader, I mentioned that people with low vision use screen readers too, not only blind people. People who have low vision might want some way to orient themselves on the page, to see where their cursor or their focus is. One way you can do that is to draw a very obvious bounding box around whatever the active element is. And obviously, if we want to draw a bounding box around the active element, we need to account for things like width, height, and so on. This also needs to account for CSS transformations. For example, if we have a bunch of rotated, skewed, or scaled inputs, the bounding box around them needs to account for this.
We can't just naively put it where the element would be without CSS transformations, right? While I was looking into the code for this, I saw some code in WebKit where, for SVGs that are focused, they would try to return a polygon of points instead of just a rectangle, so you could have a more accurate bounding box. But I wasn't actually able to trigger that code, so I don't really know what it's for; I don't know the ecosystem well enough. Currently the bounding boxes only take four points, so we can't really make complicated shapes out of them. And yeah, that's all I've got. There are a lot more examples I could talk about. For example, Chromium tries to check whether pseudo-elements come from the clearfix hack, because the clearfix hack apparently causes a lot of whitespace on the web. There are a lot of weird heuristics, and they're not specified at all. People have attempted to put this stuff into specifications, but I guess there wasn't really any interest in doing it. So, yeah.
- (muffled audience question about which screen readers exist)
- It really depends on the platform you're on. Mac, for example, has its own built-in screen reader called VoiceOver. Chrome OS, for Chromebooks, has its own screen reader called ChromeVox, which is written in JavaScript, by the way. Windows has NVDA and JAWS. And Linux has one popular screen reader called Orca, and I believe most distributions have it built in by default, though maybe not all of them. Joanmarie, whom I mentioned earlier, is actually the maintainer of Orca.
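Going back to the bounding-box point: the transform-aware box can be sketched in a few lines. This is a minimal plain-Python illustration (not WebKit code) of the idea that you transform each corner of the element's rectangle and then take the axis-aligned box enclosing all of them:

```python
import math

# Minimal sketch of a transform-aware bounding box (not WebKit code):
# transform each corner of the element's rectangle, then take the
# axis-aligned box that encloses all transformed corners.

def rotated_bounding_box(x, y, w, h, degrees):
    """Axis-aligned bounding box of a rectangle rotated about the origin."""
    rad = math.radians(degrees)
    cos_a, sin_a = math.cos(rad), math.sin(rad)
    corners = [(x, y), (x + w, y), (x, y + h), (x + w, y + h)]
    rotated = [(cx * cos_a - cy * sin_a, cx * sin_a + cy * cos_a)
               for cx, cy in corners]
    xs = [p[0] for p in rotated]
    ys = [p[1] for p in rotated]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)

# A 100x50 input rotated 90 degrees: width and height swap.
print(rotated_bounding_box(0, 0, 100, 50, 90))
```

Because the result is still just four numbers (one rectangle), a heavily skewed or rotated element gets a loose box around it, which is exactly the "can't make complicated shapes" limitation from the talk; the polygon-of-points code path would fix that.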
So yeah, there was an interesting project recently, I think a couple of months before I stopped contributing, where somebody was trying to make Chromium accessibility take up less memory. Before the project, every tab would keep its accessibility tree in memory. Now the idea is: if a tab hasn't been opened in a while, we don't need to keep a cached version of the accessibility tree, we just delete it. And then, what heuristics do we use to figure out when to cache an accessibility tree and when not to? So, but yeah. (muffled speaking) Yes. (muffled speaking) Right, right, right. That's just the story of the web, right? The web is very, very forgiving of non-ideal code, but that's normal, I suppose. So, yeah. Yeah? (muffled speaking) Oh, that's a good point, I forgot to mention this. As you might imagine, querying the bounding box of an element in an application is very useful; you can imagine it being used in a lot of non-accessibility situations. For example, Grammarly on Windows, the native application, will query accessibility APIs to figure out the bounding boxes of things. I don't know exactly what it's being used for, but you can imagine it querying the bounding box of an input field to show the Grammarly logo next to it. On Windows and Linux, you can also query the bounding box of individual characters in a text field or a rich text box. So maybe Grammarly queries the API to figure out which words are spelled incorrectly, then gets the bounding boxes of those so it can draw a red squiggly line or something like that. Apparently this is a problem, because accessibility can add a performance cost to the user's machine.
Normally you don't want it turned on all the time, but because so many applications like to query bounding boxes, a lot of people have accessibility turned on without realizing it. For example, your mobile apps, maybe password managers, request accessibility permissions. I don't really know what accessibility API calls they're making on mobile, but we can guess: probably they want the bounding box of an input field so they can show a password prompt below it. Yeah, yeah. The accessibility API? It's supposed to be a generic way for any kind of assistive technology to hook in. The concept of an accessibility API is not just a browser thing; any native application can implement it, and then assistive technology can hook into it generically. And interestingly, for some historical reasons, the Windows and Linux accessibility APIs are very similar, with an asterisk. But Mac is very different, so, you know.
- (muffled speaking)
- Yeah, no problem. Are we out of time for questions?
- Yes.
- What's the history of this? Who serves it, the application or the operating system?
- The accessibility API?
- Yes.
- It's just something a client can implement. It has nothing to really do with the operating system, as far as I know. It's a client-server relationship: you ask the application for accessibility information, and it knows how to answer that.
- I guess it depends on the operating system, right?
So, for example, on Windows, you can send a message to an application saying, give me the root accessible object, and it'll give you that, and then you can call navigation methods on that root element to get other elements. You can say, "give me your children," things like that. There was something interesting: some people tried to make the Windows accessibility API also usable on Linux. The idea was to have one single accessibility API, so we didn't have so many disparate ones across operating systems. But it turns out people on Linux didn't want to do that, because they were like, "oh, we already have one implemented, I don't want to do the work to implement this other one." So, yeah. In case you're curious... OK, well, I don't know how much time I have. I have two minutes. Windows has two competing accessibility APIs. One is called UI Automation, and that's owned by Microsoft. The other is called IAccessible2, and it's basically made by people in the Linux Foundation, and that's the reason the Windows and Linux APIs are so similar: people who worked on IAccessible2 also worked on the Linux accessibility APIs. The idea was that people didn't like that UI Automation was closed, so they made their own. But it turns out lots of people are trying to migrate to UI Automation anyway, although some accessibility developers I talked with, mainly the accessibility lead of Firefox, don't like UI Automation and want to stay with IAccessible2. So I don't really know what's gonna happen there, but yeah.
- Alrighty. Yeah, so I'm David. I work at Permantea Solutions. We do crazy crypto, blockchain, storage stuff. Talk to me about that sometime if you're interested; it's wild. I'm gonna talk to you about Clojure and the theory of constraints.
First off, who here has heard of the theory of constraints? Anyone? A couple people, OK. Surprising. Alright. So, first off, I probably need to explain a little about what the theory of constraints is. There's this book by Goldratt (I'm not gonna say his first name, 'cause I would not say it well) called The Goal. It's basically a business-process-improvement novel. It's a little hard to explain, but The Phoenix Project, which I imagine more of you have heard of, is actually patterned on this book. And it's a cool book; I recommend it if you wanna read it or listen to it. The audiobook is really funny, 'cause it's 80s style, sort of like a radio production. Very entertaining. Anyway, that's where this comes from. I don't actually know whether TOC itself, the theory of constraints, precedes the book or the book came first; I haven't looked into that deeply. But at its heart, the theory of constraints is really about these five focusing steps. I should say, too, it's sort of a process improvement methodology, but maybe broader; it's just an improvement methodology. It came out of the manufacturing world, so it's thinking about things like systems of queues, basically, and how things flow through them. The first step for improving whatever your process is: you identify the constraint. What is the bottleneck in your system? Then you exploit the constraint. I don't love the wording here; I'm just using the official, blessed wording. Exploiting the constraint basically means: your constraint is probably not fully utilized, so you throw all the resources you can at the existing constraint to make sure you maximize its throughput. Then you subordinate everything else to the constraint. Thinking about this in a manufacturing context, you may have overproduction, right?
If you have a bunch of inputs to this constraint, and you're producing too much of them, that's waste; that's stuff you don't want to do. So you should stop doing that: subordinate it to the constraint. And the last thing, before the repeat step at the end, is that you elevate the performance of the constraint. That might mean you automate it somehow; there's any number of things you could do. Basically, you're trying to increase the throughput of that constraint beyond just throwing resources at it. And if you do all that successfully, and it's no longer your constraint, you go look for the next one. The underlying assumption here, which I think is not strictly true, is that every system has basically one constraint. I don't think that's actually fully true, but it's true enough that you can use it as a heuristic. So that is the high-level view of the theory of constraints. Now, when you start getting into the theory of constraints (I should have started a clock; I guess I didn't see the time there), there's a bunch of tools in this methodology: diagram tools, thinking tools. I can't get into all of them, but I want to talk about one, which I think is a good example. It's a good example because it's fairly intuitive; I think this will make sense when I explain it. But it's something we don't do enough, I think. This is not my example; I just googled for "current reality tree theory of constraints" and grabbed this, and it's fine. It's a silly example, because your problem down here is that the car is in the swimming pool, so it's obviously contrived, and you don't need a diagram to figure it out. But the way this works is you start up here with these undesirable effects (there are different abbreviations for that, so I had to think about it for a second). These are basically symptoms, right?
The car's engine won't start, the air conditioning is not working, the radio sounds distorted. And then you work backwards from that: the engine needs fuel, fuel is not getting to the engine, there's water in the fuel line, blah, blah, blah. You get all the way down here, the car is in the swimming pool, and your root cause is, oh, you have a faulty handbrake, right? So hopefully I don't have to explain this too much. There are a lot of different versions of this, like root cause analysis and fault trees; they're all versions of the same diagram, and this is just the theory-of-constraints one. Like I said, the reason I chose this one is that we do things like this a lot, but we don't often write them down. Some of you are maybe familiar with Stuart Halloway's debugging talk; it's the same kind of idea. It really helps sometimes to actually do this, especially in large complex systems, because you may have a lot of these symptoms up here, and they'll be less obviously connected, but when you start working backwards from them (you can also think of the five whys, it's kind of similar, you're working backwards), what you'll find is that there's some connection down here. And the idea is not that there's always one connection; you might have a whole bunch of these and end up with different connections. But you'll start to see: oh, this one influences five things, where this one over here influences only one thing, so maybe I should do this one first. And like I said, there's a whole bunch of these diagrams, and a whole methodology for how you go through them, but this is a really nice one to use as an example, and I think a useful one. All right. So you might still be thinking: but why? Why would you do this? This seems like a lot. Well, I hope to convince you it's worthwhile in some situations. First of all, I'm just gonna appeal to authority.
We like rationales in Clojure, right? This is just fancy rationales. But more seriously, communicating "why" is hard. We all have different versions of these trees in our minds, and sometimes we go into some meeting and we're like, "I think this is the problem," and another person is like, "no, I think this is the problem." And probably, if you're lucky, you basically have overlapping trees, and if you put them together, you'd see that either you agree or there's something else causing all these things. Writing it down can be super helpful for that, because bouncing these things around verbally is a lot; it often doesn't work, because you have to somehow synchronize this tree structure between your minds, and that's hard. We often solve the wrong problems, too, like all the time. The classic thing in computer science or engineering is optimizing the wrong thing: just run the profiler, you know, and figure it out. And even a step beyond that, you may be building the wrong thing. That's the one that actually happens the most often; the entire business may be wrong. That's maybe another problem. But another thing: we have, I think, pretty standard and good operational tools. If you do kanban well, it works pretty well. It's not a strategy tool, though maybe it could be if practiced at the right level, but usually people use it as an ops tool. And so I often feel the need to think strategically about the structure of problems, and our ops tools aren't very good for that. Problem structures are too big for our brains; I kind of talked about that with the whole synchronization thing. And lastly, planning churn is really wasteful. I showed you the current reality tree, which is this fault-tree kind of thing; there are other versions that are more like planning trees.
What those are really doing is trying to show you paths through solution space. A lot of the planning tools we use... I mean, hopefully you don't use Gantt charts too much, but sometimes people use Gantt charts or things like that, and what happens is you've got a path through a thing; it's really a one-dimensional view of a tree. If your plans change, that thing is just gonna be all over the place. But if you have a tree or a directed graph, when your plans change, you're like, oh, I'm just moving from this part to this part. Now, it may still be hard to construct the thing initially, but there's gonna be a lot less churn in it. And the last one, this is just a personal thing: I like to focus on a single problem, and when I have multiple problems I get very anxious and confused. So I need ways to externalize these things. Building these trees is a functional thing for me, so I can operate in the world, 'cause the world doesn't let you focus on one problem all the time. All right, I promise there will be Clojure soon. So what I really want is this. These interfaces... first, this is not really a theory of constraints thing, but I just love all the queues. I want views like this, because I can look at this and be like, ooh, astronomy is gonna get me navigation, but I'm not gonna get the free artistry or whatever (I didn't really look at what's on this one before putting it on here). The point is, when you look at this, you can think about the trade-offs you're making. I want views of planning like this, where the trade-offs are clear, so I can decide what the next step is. And I also want this, but maybe that's a topic for a future talk: I wanna see the queues flowing in. All right. OK, so one more slide before we get to Clojure. There are existing tools for this. This is actually a tool I really, really like.
It's a weird tool. I love these tools you find where you're like, ooh, this is actually really cool, but it's weird, and it's got this sort of weird lineage behind it. Flying Logic is one of these tools. I could talk more about it, but I can't, 'cause I'm running out of time. It's really good. You don't have to think about the layout in it; it just does that for you. It's oriented towards making a DAG, basically, and you don't wanna do a lot of drag-and-drop to move things around; it's good at that. However, I'm a programmer: still too much clicky-clicky. It's not composable; if I want this thing in another diagram, I can't do that, or at least I don't know how, and I don't think you can. It's difficult to relate diagrams: ideally, if I had a goal tree, I would like to relate things in other diagrams to the goal tree, and there's really no way to do that. And it would be hard; what would the user interface even look like? It'd probably be a mess. There are no projections. You can group things, but a lot of the time, especially when I'm doing presentations, I just wanna present part of a tree; these trees get really big, and I've tried to walk through them in a presentation, and it's just a mess. What I really want is to project a part of it into my slides. And integration is challenging. You can actually script within it; I've never done it, but you can. I don't even know what it is, JavaScript or Groovy or something. And it has an XML file format, which you can parse, but that's not great, you know? It's just not what I wanna do. OK, I'm gonna do it in time. All right, I gotta go fast. So what I really want: I want all the things. I want composability, views, projections. I want more complex relations. I want a REPL. I want editor tooling. This is a big one, right?
Because if I'm relating all these things and I rename something, I want my editor to do that for me. I don't wanna have to mess around and rename a bunch of things that all have the same name. I want all these tools I'm familiar with, because they would work really well for this. OK. So, this is a joke, this is not exhaustive: library options. Basically, this is talk-driven development. I'm like, I want this; I'll give a talk; that will drive me to do it. Two things popped into mind. DataScript: I've used DataScript a lot, I love it, it's an in-memory Datalog, and I have a bunch of prior experience with it. And then I also thought about Ubergraph, which is from Mark Engelberg, who knows a thing or two about writing good Clojure. And it has Graphviz support out of the box. I did not do any TOC diagram for this decision; I just said, let's try Ubergraph, 'cause I don't have a lot of time and I gotta make slides, so let's see how it goes. All right, so we're gonna get into Ubergraph here. I'll quickly walk through Ubergraph, and then I'll show you a little of what I put together to make this a reality. So, this is an undirected graph; edges are bidirectional. You define a graph by saying: we have an edge from A to B, B to A, A to C. Those are the edges. Then you can pretty-print it, and it shows you: here are the nodes, here are the edges. Make sense? OK. You can put attributes on edges, which would be useful if you wanted to do, say, a shortest path between two nodes. Ubergraph has all these graph algorithms, which are somewhat applicable, but a lot of them I don't use for this. But you can do this: you can put attributes on your edges, and you can put attributes on your nodes. You'll notice that when you have attributes on your node and you pull out the node, you have a vector with the node at the beginning, then the attributes, and then you also have it repeated down here.
That will come back later. OK, finally, there are some algorithms that are kind of useful. For the problem I'm trying to solve, the graph algorithms aren't generally that useful, but here we've defined a directed graph, so now I have the diagram there, and we can do a breadth-first traversal. These nodes are going A to B, A to C, so they're not bidirectional anymore. If I do a breadth-first traversal from A: you can see there are two connections going into B, but from A, A is only connected to B and C, so we don't get D anywhere in our traversal. And this is useful if you have that current reality tree and you want to go down to the bottom; like, you have a really big one and you're asking, what's upstream of this? You do this, it gives you all those nodes, and then you can use them in the diagram you're drawing. So that's the one graph algorithm I think is actually useful, as of now, for the problems I'm trying to solve here. All right, so that's a really high-level and not comprehensive introduction to Ubergraph. OK, doing all right. Oh wait, I have slightly more. Yes, the reason I went down this path as opposed to DataScript: we can visualize it. Again, this is not rocket science; we're just spitting out some Graphviz, but I didn't have to write it, so it's very convenient. If you do this one, what you get is a little JFrame, which shows it to you right away. Fine, but I don't actually do that; I do this one, and then I have Kitty, my terminal, print it, because your terminal should be able to show images, and if it can't, switch terminals. So I can have a little thing that watches a bunch of these and prints them out as I change them.
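The "what's upstream of this node" query is just a breadth-first traversal over directed edges. Here's a generic plain-Python sketch of the same idea (not the Ubergraph API; the edge data is a made-up example matching the A/B/C/D diagram):

```python
from collections import deque

# Generic breadth-first traversal sketch (plain Python, not Ubergraph):
# from a start node, follow directed edges to collect everything
# reachable, e.g. everything upstream of a node in a current reality tree.

edges = {"a": ["b", "c"], "c": ["b"], "d": ["b"]}  # a->b, a->c, c->b, d->b

def bf_traverse(start):
    """Return nodes in breadth-first order from `start`."""
    order, queue, seen = [], deque([start]), {start}
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order

# Starting from "a" we never reach "d", because the edges are directed.
print(bf_traverse("a"))  # ['a', 'b', 'c']
```

In a big current reality tree you'd run this from the node you care about and feed the resulting node set to whatever is drawing the sub-diagram.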
And then finally, this is also a really convenient feature: if you put attributes on these things, and they're Graphviz attributes — 'cause Graphviz has a set of attributes it understands — it will automatically use those. Just a really convenient way to visualize these, built in, I didn't have to write it, it's nice. Okay, so let's do some TOC stuff with this. Let's make a goal tree. I didn't talk about the goal tree; roll with it. All right, so we're gonna define — it is what you think it is. This is for work: we're basically making this decentralized gateway network to access this crypto storage network. It's a whole thing. So our top-level goal here is just: we want to have a sustainable network. That's the whole point of these systems, right? They're sort of self-perpetuating in some way. And a critical success factor here might be decentralized indexing, and then we compose these things into a goal tree. A few things to notice here. First of all, I'm just def'ing my nodes at the top level. I'm using strings as the IDs, and that's just because that's the default thing that Graphviz will print. And I'm adding some attributes, which — I don't know what I'm gonna do with those. Maybe they're useful later, we'll see. And then I just throw them into the graph down here. And the thing you'll notice is, in order to not repeat stuff, I have to do this first to pull out the entity ID, 'cause, as far as I can tell, you can't just slap the entity in there. You have to pull out the ID of it to build the edges, which is fine, it's just the way the library works. Look at that, amazing, right? So impressive. But, you know, it's getting there. So we're done, right? Well, maybe not. I don't wanna look at this. I mean, it's a lot. There's a lot of punctuation and stuff and repetition here. So, let's see if we can make it a little better in the few minutes we have left.
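The built-in Graphviz rendering looks roughly like this (a sketch: the `:save` option map follows Ubergraph's documented shape, the goal names and `:shape` attribute are my own):

```clojure
(require '[ubergraph.core :as uber])

(def goals
  (uber/digraph ["Sustainable network"      {:shape :box}]
                ["Decentralized indexing"   {:shape :box}]
                ;; critical success factor feeds the goal
                ["Decentralized indexing" "Sustainable network"]))

;; Pops up a Swing JFrame with the rendered graph:
;; (uber/viz-graph goals)

;; Or save to a file and let the terminal display the image:
(uber/viz-graph goals {:save {:filename "goals.png" :format :png}})
```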
All right, so macros. This is actually, I think, a good time to use a macro. We're def'ing something; we don't want that symbol to be evaluated. We're gonna write a little macro. And really, the goal is that I'm trying to make a little mini-language within Clojure that I can use for defining this stuff. So, super quickly: we're just gonna do a def macro, defgoal. It takes a var name and a description. I've made some opinionated choices here. First of all, we're just gonna keyword the var name — that's gonna be our ID — and we're gonna use the label attribute to put the text in there. So this can be something I can use programmatically more easily, as opposed to that string. I have made the decision, though, that the label is just derived from the var name: I pull out the dashes and capitalize it. So these have to be readable. But again, this is a little tool I'm making for myself. I think that's fine; if I don't like it, I'll just go back and change it later. And I have this description in here. I never use the description in this talk, but I will in the future, trust me. Okay, you know the cooking shows where they throw it all together, put it in the oven, and then pull out the finished thing? We're gonna do that. So look, I wrote these other macros too. Trust me, this one is just the same — it's just critical-success-factor, it's gonna be the same thing. And then this one, I may or may not have written, but it's basically just removing the redundancy and defining that graph at the end. So this is actually usable to me. This isn't the most earth-shattering piece of Clojure ever constructed, but it's nice. It doesn't repeat things. I only have to type this once and it's used in a bunch of places. If I want to rename something, LSP just works out of the box on that. It's great. I can put these in different namespaces and relate them.
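A `defgoal` along the lines described might look like this (my own reconstruction of the idea, not the speaker's actual code):

```clojure
(require '[clojure.string :as str])

(defn- label-from-sym
  "Turn a var name like sustainable-network into \"Sustainable network\"."
  [sym]
  (-> (name sym)
      (str/replace "-" " ")
      str/capitalize))

(defmacro defgoal
  "Def a goal node: the keywordized var name is the ID, the label is
  derived from the name, and the description is kept for later use."
  [var-name description]
  `(def ~var-name
     {:id          ~(keyword var-name)
      :label       ~(label-from-sym var-name)
      :description ~description}))

(defgoal sustainable-network
  "The network keeps running without central operators.")
;; sustainable-network
;; => {:id :sustainable-network, :label "Sustainable network", ...}
```

A `defcsf` for critical success factors would be the same shape with a different attribute marking the node type.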
I can get — I don't think I can actually get good LSP hover out of this yet, but it's probably just some metadata or something I need to add. I haven't tried that. But that's really my goal: editor tools should basically work. Yeah. So that's mostly what I put together. For future work, I think it would be cool to not spit out an Ubergraph directly. I should probably just spit out some data structures that are more like what I want, and then have something that runs and converts those to Ubergraph. More graph algorithms might be useful; so far the traversal is the main one I think is useful. I'm also wondering if I should have a global registry. You know, spec does this. That may elicit groans from some people. But the experience I want is: I just def these things and then they're automatically useful in a bunch of contexts, so I kind of have a collection of them all. Maybe I can get that out of namespaces alone, maybe I need something else, I'm not sure. Schemas — do I need schemas for anything? I don't know, maybe, maybe not. But these last ones I actually do want to do. I want better ways to visualize it. I think if they just displayed nicely in Portal or something like that, that'd be really cool. That's really the experience I want: developer tools for all this stuff instead of clicky-clicky UIs. And then DataScript. I do want DataScript, because I want to be able to run Datalog on all these things — that's just super useful. But it's not the most immediately useful thing, so I didn't do it yet. And if I get all this working in a way I'm really happy with, I'll open source it, because I know there are other crazy Clojure people out there, probably watching this right now, who would also play with this and have fun with it. All right, that's all I got. I got time for maybe one question, something like that.
If I convinced you all that TOC is awesome, then you should go read weird books from the '80s. It actually is a fun book though, seriously. - (indistinct question) - It's good enough, because really what I want is to do a lot of views of it. I don't want the Flying Logic thing — you can actually navigate this stuff pretty well there when you have a really big one. But even there it's like, do you ever really want to look at the whole thing? Not really. I just want to have little subgraphs, and use those and look at those, 'cause I'm almost always drilling in on something. So I think it's good enough. It's not great though, you know — it's Graphviz. Anyone who uses Graphviz knows. (laughs) Yeah, Eric, yeah. Yeah. Yeah, exactly. Yep. Yeah. Because again, I'm trying to solve the actual problem I have, which is just maintaining these things. And I think I'm best at maintaining code, so I should just turn it all into code. It's for me — for me to edit, mostly, and then present to other people. So I want a way that's really easy for me to maintain it, and then lots of ways to get it out. But other people aren't going to be editing it. It's hard to share these things, honestly, in any format. I could — yeah, I might. That would totally work. You could have a notebook that has little projections of your diagrams. I might do that. All right, I think I'm really out of time now, so thank you. — Hello. I'm Eric Normand, and I am talking about virtual threads in Clojure. This is something that's very new to the JVM, and I wanted to investigate it, write it up, and present it here. So this is going to be just a getting-started. It's like 15 minutes; that's all we got. Okay. So he was just talking about TOC, and now I'm talking about bottlenecks. I feel like there's good synergy here. So if you read the JEP — the proposal for the Java feature — they talk about the size of threads, that OS threads are really big.
They're heavy, like megabytes each. And that's true, and virtual threads are a lot smaller. But I think the real issue, more than the size, is that there's a bottleneck: you can only create so many OS threads, and you want to create more. So it's not about the size so much — I think that's a distraction — it's that you can crash your machine if you create too many threads. Okay, so why do we want to create more threads? This is Little's law. It's a very simple law, just a linear relationship, but it relates to the concurrent requests you're actually running. Say you have an HTTP server and you're handling one request per thread. How many requests are you handling right now? It's equal to the requests per second — how fast they're coming in — times the amount of time it takes to process each request. These are all averages, of course. So if we look at this: your startup is in year one, you've got 200 requests per second — that's the highest burst you'll see — and it takes 50 milliseconds to process each request. You do the math, and that means you need 10 threads to handle those 200 requests per second. Then you grow, and at 2,000 requests per second you're at 100 threads. You start projecting this out — it's just a straight line — and you're going to hit a limit at some point, right? I think I've crashed my machine somewhere around 2,000 threads. Like, if you just create them in a loop, your machine says, whoa, something's wrong — it panics and just shuts down. It depends on your machine, right? But really, the hardware limit is way up here. There's a number of threads you can run where you're still not hitting your IO bandwidth, you're still not hitting the CPU, even the cache bandwidth is way up there. And so we want more threads — somewhere between here and there; anywhere higher than this line would be better, right?
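Little's law from the slide is easy to check with a one-liner (the numbers are the ones from the talk):

```clojure
;; Little's law: concurrency L = arrival rate (req/s) x average latency (s).
(defn concurrent-requests [requests-per-sec latency-sec]
  (* requests-per-sec latency-sec))

(concurrent-requests 200 0.050)  ;; => 10.0   threads in year one
(concurrent-requests 2000 0.050) ;; => 100.0  threads as you grow
```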
But it would be nice if we could actually get above the hardware limit, and then it's our job to make sure it stays below that. Okay, so one possible solution is async programming — basically, just don't use threads. And we all know the options. There are different ways of doing it: you've got callbacks; core.async is a fancy macro that creates callbacks; Promesa; interceptors; Ring's async handlers. People are trying to solve this problem. There are some benefits: they're lightweight — it's like a closure per callback — and they're garbage-collectible. You could have a callback waiting for something, and then that thing gets collected, so your callback can be collected too. It's nice. But big costs: you've got callback hell, which is not fun. You can't get stack traces in callback hell, and exceptions don't work in callback hell. You can't use existing libraries — any library that was built without an async model, on blocking IO or something, you can't use. And you can't use existing tooling: any kind of profiling that's stack-trace-based, debuggers — they don't work so well. It's really hard. So this is the rationale for what was called Project Loom and is now called virtual threads. They basically want everything: the benefits of async — all those things listed on the left — plus all the things listed on the right, on the previous slide. And what it is, basically, is threads implemented on the JVM. They don't each use an OS thread; they use a pool of OS threads to run the small, lightweight threads that live on the JVM — also called green threads in some languages. Okay, so those are all the things you get: your stack traces and your exceptions, and you also get to use debuggers, profilers, et cetera. There are costs though. One of the costs is that now your bottleneck is somewhere else, right? It used to be the number of threads you could run.
And now you really can oversaturate your CPU like you couldn't before. And there are some limitations on them. You don't want to run CPU-bound calculations on them; you should use a dedicated thread for that kind of stuff. So if you're calculating digits of pi or doing any in-memory number crunching, that's not good for virtual threads, 'cause it'll hog the carrier thread — they're not preemptive like an OS thread, so a virtual thread won't ever get interrupted if it's in a hot loop. We'll talk more about that. And there's this thing in Java called synchronized. Every object has a lock, called the monitor, and you can lock on it; it's what implements synchronized blocks and synchronized methods. And you shouldn't use that in virtual threads, because, I guess, it uses an OS-level lock, so it actually blocks the carrier thread it's running on. All right, some details. It's available officially, out of the box, in JDK 21, which is the latest long-term support release. You can get that at adoptium.net. So it's out now; there are no special flags needed anymore. Every virtual thread is an instance of java.lang.Thread, so you get to use the same API you're used to. That means you can do Thread/sleep or Thread/currentThread or whatever it's called. And the recommendation, because they're so lightweight — each one is just a little object — is that you do use them, let them run to completion, and then let them be garbage collected. So if you need to make 10 HTTP requests and you want them to happen in parallel, just make 10 virtual threads. Don't do thread pooling or anything clever like that. Just make them and let them die. All right, so we talked a little bit about this: things you oughtn't do. One thing is hot loops. If you're polling a data structure and it's not ready — okay, just loop back and poll again.
If you need to do a hot loop, sleep a little bit — sleep 200 milliseconds — and that gives the carrier thread to another virtual thread. But here's the thing: atoms and refs both use hot loops. So now they're not so nice anymore, and we're gonna talk a little more about that. An atom will read the value, apply your function to it, and then try to atomically swap the new value into the atom; if the atom has changed since you read it, it just does it again — runs your function again on the new value, over and over. So it's a hot loop. There's no pausing or yielding in there. Now, I mentioned Thread/sleep and Thread/yield. I'm a little skeptical of Thread/yield. I read the docs for the JEP and they don't mention yield at all; they mention sleep. And if you look at the Javadoc for Thread.yield — some people are recommending it, but I don't know. It's very iffy: only use this for debugging, it's more like a hint, it's not guaranteed to do anything. So I would just use Thread/sleep with one millisecond or something; I think that's better. And not using atoms — that's the recommended solution. Another thing you shouldn't do is use the synchronized keyword. I talked about that. That's Java's synchronized, not something in Clojure — but Clojure's locking macro does use synchronized. There's actually a story about a change they had to make to the implementation of delay and lazy sequences, because those were using locking — the monitor — just to make sure that only one thread was forcing the thunk. They were using a lock. So now they use a ReentrantLock on every cons of the lazy seq, which is kind of heavyweight, but that was the solution. And now everything in Clojure works with virtual threads, which is good. Okay, things you can do: blocking IO.
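The retry loop inside `swap!` can be sketched in user code with `compare-and-set!` (this is a simplification of what clojure.core actually does, just to show where the spinning comes from):

```clojure
(defn my-swap!
  "Simplified swap!: read, compute, compare-and-set, retry on conflict.
  Note there is no sleep or yield anywhere -- under contention this
  spins, which is what makes it unfriendly to virtual threads."
  [a f & args]
  (loop []
    (let [old-val (deref a)
          new-val (apply f old-val args)]
      (if (compare-and-set! a old-val new-val)
        new-val
        (recur)))))

(def counter (atom 0))
(my-swap! counter inc) ;; => 1
```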
That was something you couldn't do in core.async go blocks, right? So all the java.io stuff — the input streams, output streams, all of that — works well with virtual threads. So if you wanna saturate your IO bandwidth, you can do it with a lot of virtual threads. All the blocking primitives in java.util.concurrent work: the locks, queues, futures, that kind of stuff. And even the blocking operations in core.async — those work too. So you can use core.async channels if you use the double-bang versions, >!! and <!!, and they'll work with virtual threads. And again, you can sleep and yield and those work fine. So one pattern people have started using is: start this thing going, and also start another virtual thread that will notify me in a second. That way I'll know a second has passed, and then it goes away. It's just a little lightweight thing: tell me in a second. All right, practical things. There are three ways to create virtual threads, according to the docs. There's an Executors factory method called newVirtualThreadPerTaskExecutor, in java.util.concurrent. You just create one, right? That's what I'm doing here — I'm just def'ing it. And then you call submit and pass it a Runnable, just like a regular thread would take a Runnable. And of course, Clojure functions work as Runnables: no arguments, you just do your thing. The nice thing about the executor is that it returns a future. So because I'm def'ing it to f, f is a future, and I can deref it or call the get method, and that'll block the current thread until the thing's done. Another way is with the java.lang.Thread class. There's a static method called startVirtualThread, and that'll create one and start it. And then there's this whole builder thing that I've never used, but it actually looks kind of nice if you do wanna configure your threads a little bit more.
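In Clojure interop, the three creation styles mentioned look roughly like this (a sketch; requires JDK 21, and the type hints avoid reflection on the overloaded `submit`):

```clojure
(import '(java.util.concurrent Executors))

;; 1. An executor that makes a fresh virtual thread per task.
(def exec (Executors/newVirtualThreadPerTaskExecutor))
(def f (.submit exec ^Runnable #(println "hello from a virtual thread")))
(.get f) ; block the current thread until the task is done

;; 2. Fire-and-forget static method: creates and starts in one call.
(Thread/startVirtualThread #(println "also a virtual thread"))

;; 3. The builder, if you want to name or configure the thread.
(-> (Thread/ofVirtual)
    (.name "my-vthread")
    (.start #(println "named virtual thread")))
```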
You can name your thread and start it in, like, one line, or create one unstarted to, I guess, start later if you want. All right, so this is the thing I've really been racking my brain over. Clojure — I've been using atoms forever. Isn't that the way we share state between threads? Like, that's the whole point of atoms, right? And now I'm not so sure. So one thing is, you can still use atoms and volatiles and refs if you have a single writer and multiple readers, 'cause reading from an atom is cheap, right? It's easy. It's the writing that causes the contention that causes the hot loop. So if you have a thing where these guys are all reading from a keep-going flag, and this one thread says, okay, after 25 seconds I'll set it to false — that's fine, that'll work. You can do this loop, 'cause we're sleeping for a second, and these 10 threads can read from it, and that's okay. Another way is with the concurrent collections. They're not atoms — they're mutable collections — but they're thread-safe. So here's a thing where I'm reading a whole bunch of different URLs and I wanna store the results in one collection. I create a virtual thread per URL; it slurps it and then saves it into a ConcurrentHashMap. And because they're all using different keys, it's fine — they're never gonna have a collision. And if there is a collision, they're just reading the same thing and putting the same thing in, and I don't care. It's thread-safe, so they're not gonna accidentally mess up the hash map. And then at the end — I'm also using a latch. You know what a latch is? I need to write a whole thing on the java.util.concurrent stuff, 'cause there are so many cool things in there. A latch is a thing that counts down. So I create one, and this time I'm saying there's (count urls) in it. So however many URLs I have —
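Here's a reconstruction of the pattern being described (my own code, assuming JDK 21; `slurp` stands in for any blocking fetch):

```clojure
(import '(java.util.concurrent ConcurrentHashMap CountDownLatch))

(defn fetch-all
  "Fetch every URL on its own virtual thread, collect results in a
  thread-safe map, and use a latch to wait for all of them."
  [urls]
  (let [results (ConcurrentHashMap.)
        latch   (CountDownLatch. (count urls))]
    (doseq [url urls]
      (Thread/startVirtualThread
       (fn []
         (try
           (.put results url (slurp url)) ; each thread writes its own key
           (finally
             (.countDown latch))))))      ; always count down, even on error
    (.await latch)                         ; block until the count hits zero
    (into {} results)))                    ; copy into an immutable map
```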
Let's say it's 10. Everyone will block until the latch gets to zero. Every thread, when it finishes, just counts down by one, and when it gets to zero, this await is going to unblock. So it's a way of coordinating threads: I'm fanning out, and then I'm gonna fan back in. And then I'm dumping the results from the ConcurrentHashMap into an immutable one, which gets returned from this function — inside of a promise, right, or a future. Okay, then the other thing is: why are we sharing state at all? It used to be that you would share state because you'd have, like, worker threads, and they'd all just kind of bang on the state. But we don't need worker threads banging on shared state, because each thread can just return, in a future, the answer that it got. So you can fan out to, like, a million things, wait for them all to be done, and then fan back in. That's what this is doing. This is saying: okay, we're starting one guy with this executor that's creating all of these other virtual threads. Each one is going to slurp its URL, in a vector with the URL. And then it's doing doall — so it's starting all the threads — and then it's deref'ing all of them into a hash map. And that gets returned in the future. So there's no need to have something they're all banging on; you can just fan out and fan back in using the futures that they return. So I'm looking at it a lot like that now. I do think there's some value in doing stuff more like an actor model, where there's a kind of persistence to each of these things and they're communicating — but that also seems complicated. I still have to explore that. Okay, so there are all these ways to communicate and coordinate, 'cause that's the hardest part of multi-threaded programming. I've been through a lot of these.
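The futures-based version — no shared mutable state at all — might look like this (again my own sketch; `with-open` works here because ExecutorService is AutoCloseable on recent JDKs, and closing it waits for the tasks):

```clojure
(import '(java.util.concurrent Executors Callable))

(defn fetch-all-futures
  "Fan out: submit one task per URL, each returning [url body].
  Fan in: get every future and build a map of the results."
  [urls]
  (with-open [exec (Executors/newVirtualThreadPerTaskExecutor)]
    (->> urls
         (map (fn [url]
                (.submit exec ^Callable (fn [] [url (slurp url)]))))
         doall           ; realize the seq so every task is submitted now
         (map #(.get %)) ; block on each future in turn
         (into {}))))
```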
core.async and Promesa have channels. Manifold also does all this stuff. And java.util.concurrent — seriously, open up the Javadoc and do the package view. There's nice documentation laying out the different categories, but just look at the classes that are in there and read the one-line description of each one, 'cause they kind of point at when you might use them. I'm always surprised, 'cause I don't use them that much and I kind of fake them. Like, oh, I'll just make a Clojure promise and deliver on it when I want everything to start, and everything else will deref it and wait there. But no — there's a thing already in the library to do that, and it's probably better, probably faster and cheaper. So anyway, there are all these things you can use, and they're very interesting. Okay, there are two things coming next. Now that virtual threads are done, there's a thing called structured concurrency, which is a way of representing hierarchical tasks. So if you have some task like: I have this webpage and I need to fetch all the resources that it refers to — all the CSS, all the JavaScript, all the images — well, that's kind of a hierarchy, right? And each one of those might have other things it needs you to fetch. So you can represent that in the object graph, I suppose. The thread will know that these virtual threads belong to it, and instead of doing these fan-out, fan-in patterns by hand, it's built in — there's a thing called join, and you're just waiting for all your subtasks to finish. And that's kind of like what Erlang does, right, where you can have process trees, supervisor trees. Okay, and the other one is scoped values. So, each thread has thread-local variables, and they're kind of heavyweight for the size of a virtual thread.
And this is a way of passing in a context. Like, you're handling an HTTP request, and you don't want to pass the request to every thread and its sub-threads and their sub-threads. You can define it at the top level, and anything below can read it — all the threads below it. And it's immutable, yeah. I mean, right, it depends on what you put in it; you could put a mutable thing in it. But yes, it's set once. Okay, well, that was my presentation. Do I have time for questions? One question. Okay. (indistinct question from the audience) Ah, right. So if you have eight cores, there's no more than eight. That's awesome. (indistinct) I see — one of them's going to work, yeah. Huh, that makes sense. That makes a lot of sense, yeah. Right. Because, I mean, in my basic tests — I don't know if I tested it well enough, but I had tests where I was creating like a million threads to see how much contention there was, and I couldn't detect that there was a problem, right? So you might be right. I do worry about deadlock, though. Like, if you're setting the atom over and over and over, you're never releasing the virtual thread, right? So other virtual threads won't get in. (indistinct) Right, but then they never get to run. (indistinct) Right, right. It's supposed to be a pure, fast function. Right. (indistinct) Right. As long as you're following the other rules. That's the only thing — you have to sleep sometime, right? Or yield or whatever.
Yeah, yeah. Yeah, that works. Mm-hmm. That makes sense. Thank you. Any other questions? Great job. Yeah. Thank you.