Older HN users may recall when busy discussions had comments split across several pages. This was because the Arc [1] language that HN runs on was originally hosted on top of Racket [2] and the implementation was too slow to handle giant discussions at HN scale. Around September 2024 Dang et al. finished porting Arc to SBCL, and performance increased so much that even the largest discussions no longer need splitting. The server has also been unresponsive or restarting much less often since these changes, despite continued growth in traffic and comments:
"Btw, we rolled this out over 3 weeks ago and I think you're the first person to ask about it on HN. There was one earlier question by email. I think that qualifies as a splash-free dive."
It was running BC. I had high hopes for switching to CS because I'd heard the same thing you had, but when I tried it, HN slowed to a crawl. This stuff is so unpredictable.
Arc uses mutable cons, but Racket has immutable cons. So it's a problem.
In Racket BC mutable and immutable cons use the same struct at the C level, so both are quite fast and almost interchangeable, if you cross your fingers that the optimization passes don't notice the mess and get annoyed (somewhat like UB in C).
In Racket CS immutable cons are implemented as cons in Chez Scheme, but mutable cons are implemented as records in Chez Scheme, so they are not interchangeable at all.
Arc used a magic unsafe trick to mutate cons that are immutable at the Racket level but actually mutable at the Chez Scheme level. The trick is slow because the Racket to Chez Scheme "transpiler" doesn't understand it and does not generate nice fast code.
One solution is to rewrite Arc to use mutable cons in Racket, but they are slow too because they are records in Chez Scheme that have less magic than cons in Chez Scheme. So my guess is that it will be a lot of work and little speed gain.
[Also, ¿kogir? asked a long time ago in the email list about how to use more memory in Racket BC, or how to use it better or something like that. I think he made a small patch for HN because it has some unusual requirements. Anyway, I'm not sure if it was still in use.]
---
The takeaway is that mutable cons are slow in Racket.
I was the person who emailed him about it earlier.
2024-09-05, me:
On another topic, I just noticed that the 700+ comments on https://news.ycombinator.com/item?id=41445413 all render on a single page. Hurray! Is the pagination approach obsolete now? I know that you've commented several times about wanting to optimize the code so pagination wasn't necessary. I don't know if that's finished or if pagination will have to go on the next time there's a big breaking story.
Dan G:
Yes: the performance improvements I've been working on for years finally got deployed, so pagination is turned off for now.
(In case you're curious, the change is that Arc now runs over SBCL instead of Racket.)
...
Btw you're the only person I know of who's noticed this and pointed it out so far!
I have very mixed feelings about how much I know about this site.
It's different, yes. The HN implementation is called clarc. PG suggested we spell it "clerk" as a joke on the British pronunciation of the latter, but I chickened out.
I talked to one of the Anarki devs (or at least someone who uses it) about possibly open-sourcing a version of clarc which would run the original open-sourced version of HN, but it's a bit hard because the implementation would require some careful surgery to factor out the relevant parts.
Yes because we just add the things we need, at whatever layer it makes the most sense to add them.
This type of application stack that includes the language you're writing in and even, when convenient, its implementation, is really satisfying to work with. There is much less need for workarounds, arbitrary choices, and various indirections (e.g. what used to be called dependency injection). All the plumbing is an order of magnitude simpler and it allows us to keep the codebase much smaller than it would otherwise be. I also spend basically zero time bitching about having to deal with software dependencies, making me realize how much of my former life as a programmer was taken up with just that.
I think of this as sort of the unikernel form of application dev and of course it's a fine fit for a Lisp, since "write the language you want to write your program in as you write your program in it" is the natural style there. The tradeoff is that there's a lot of vertical coupling between the layers. If you want to factor out one layer for general consumption, e.g. to open source the language implementation so other people can build cool things with it, there's a fair bit of work to do.
Also, since the language implementation exists to run a specific application, we just don't bother supporting what we don't need for HN. That too comes back to bite you when you want to open-source it!
HN has had 15+ years of work since the original news application was open-sourced; that's a lot of things-we-added at-some-point. Most of those are at the application level but some ended up in Arc and some in the Arc implementation, wherever it was convenient to put them. This is especially handy when you have limited time to work on the code.
Replying while @dang is editing - so might be talking past current parent:
I suppose I shouldn't be surprised that Arc/clarc would be modified as news is modified (Arc sort of being built around news in the first place).
I just wouldn't expect there to be hn specific sauce in clarc that would make sense to excise if opening up clarc; AFAIK it's been stated that there's some secret sauce wrt fighting spam, detecting voting rings and so on...
Then again, thinking more about it, it sounds reasonable that some of that might land on the Arc/clarc side, not in news.
(sorry for editing on the fly - I can explain why I do that but I know it can be annoying when someone is trying to reply! and yes, all that sounds right.)
I think Dang is saying that you don't need DI. DI is a way of having some generic code be able to call some specific code when needed. If your whole stack is specific you don't have that problem - instead of the DI call site, you just call the function! Much simpler.
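To make the contrast concrete, here is a minimal sketch in Python (not HN's actual code; `load_comments`, `render_page` and `render_page_generic` are hypothetical names invented for illustration). The generic version needs the specific behaviour handed to it, while the whole-stack-is-specific version simply calls the one function that exists.

```python
from typing import Callable


def load_comments(item_id: int) -> list[str]:
    # Hypothetical stand-in for the application's one real data source.
    return ["first", "second"] if item_id == 1 else []


# Generic style: the caller must supply ("inject") the specific loader.
def render_page_generic(item_id: int,
                        loader: Callable[[int], list[str]]) -> str:
    return f"{len(loader(item_id))} comments"


# Specific style: there is only one loader, so just call it directly.
def render_page(item_id: int) -> str:
    return f"{len(load_comments(item_id))} comments"
```

Both `render_page_generic(1, load_comments)` and `render_page(1)` produce the same string; the second just has one less moving part at every call site.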
In my own game scripting scheme, I use implicit argument passing, like a cancellation token to async calls, and a rendering context used for immediate mode esque rendering.
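A rough sketch of the implicit-argument idea, using Python's `contextvars` as a stand-in for whatever mechanism the scripting scheme above really uses (that mechanism isn't described, so `render_ctx`, `draw_label` and `draw_frame` are all hypothetical). The callee picks up the rendering context without it appearing in its parameter list.

```python
import contextvars

# Hypothetical "rendering context" that callees read implicitly.
render_ctx = contextvars.ContextVar("render_ctx")


def draw_label(text: str) -> None:
    # No context parameter: the current one is looked up implicitly.
    render_ctx.get().append(f"label:{text}")


def draw_frame() -> list:
    commands = []
    token = render_ctx.set(commands)  # bind the context for this frame
    try:
        draw_label("score")
        draw_label("lives")
    finally:
        render_ctx.reset(token)  # restore whatever was bound before
    return commands
```

Calling `draw_frame()` collects the draw commands issued by both `draw_label` calls, even though neither was handed the list explicitly.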
It is probably the best Common Lisp compiler when it comes to type checking. However, it leaves a lot to be desired. For example, it cannot specialize an element type for lists. With lists being the go-to structure, if you attempt to (declaim) every function, you will immediately see how vague and insufficient the types come out compared to even Python.
The ability to specialize list parameter types would greatly improve type checking. It would also help the compiler to optimize lists into unboxed arrays.
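For comparison, this is roughly what the Python side of that comparison looks like: the annotation names the element type of the list, which is the part a plain CL `list` type can't say. A small sketch (the function `total` is made up for illustration; CPython itself does not enforce the annotation at runtime, it is there for external type checkers):

```python
from typing import get_args, get_type_hints


def total(xs: list[int]) -> int:
    """Sum a list annotated as containing only integers."""
    return sum(xs)


hints = get_type_hints(total)
# The element type is recorded in the annotation and can be inspected.
element_types = get_args(hints["xs"])
```

Here `element_types` is `(int,)`, so a checker has something precise to work with for every element of `xs`.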
Please don't tell me that static type checking doesn't lend itself to CL. The ship has sailed. It does work with SBCL rather well, but it can be better.
Some may blame the Common Lisp standard. It indeed doesn't specify a way for list specialization, but the syntax is extensible, so why not make it as a vendor extension, with a CDR? AFAIK CDR was supposed to be to Common Lisp what PEP is to Python.
I would use vectors and arrays, but in CL ergonomics is strongly on the side of lists. For short structures vectors and arrays don't make sense.
I think it is also a time to outgrow the standard and be more brave with extensions. A lot has changed since the standard. It is still very capable, but as powerful as CL is, some things just cannot be practically implemented in the language and have to be a part of the runtime. Yes, I'm talking about async stuff.
So I got the idea to see how difficult it would be to bolt on async runtime to SBCL. To my surprise the project is hosted on awfully slow SourceForge and project discussions are still happening on mailing lists. Sorry, but I am too corrupted by GitHub's convenience.
>if you attempt to (declaim) every function, you will immediately see how vague and insufficient the types come out compared to even Python.
Indeed, but 1) it is used by the compiler itself while cpython currently ignores annotations and 2) runtime and buildtime typing use the same semantics and syntax, so you don't need band-aids like https://github.com/agronholm/typeguard
But yeah, CL's type system is lacking in many places. In order of practical advantages and difficulty to add (maybe): recursive DEFTYPE, typed HASH-TABLEs (I mean the keys and values), static typing of CLOS slots (invasive, like https://github.com/marcoheisig/fast-generic-functions), ..., parametric typing beyond ARRAYs.
Let's be brave and deviate from the standard, preferably in a backward-compatible way, to provide the best achievable DX.
The CL committee, however smart it was, could not think through all the nooks and crannies of the system. Let's continue where they left off and progress.
Coalton [1] adds Haskell-style types (so typed lists, type classes, parametric polymorphism, ...) to Common Lisp, and compiles to especially efficient code in SBCL.
Describing Coalton as a CL add-on or even as a DSL has always seemed wrong to me. It's of course very tightly integrated with the Common Lisp runtime but it's also very different as an actual language. And I mean that in a positive way, as being different from CL is both an achievement and a requirement for doing what it does.
I just found it funny how Clojure's lack of cons pairs is enough to cause religious debates about its Lisp nature while (ISTR) adding symbols to Coalton basically requires foreign calls to the host system, but it still counts as a CL-with-types.
Wouldn't that be something that the tooling could deal with easily? I don't know if there is anything like that yet, but the last time I took a quick look at Coalton it seemed like some basic SLIME and ASDF etc support with its own filetype and Emacs mode to go with it could be potentially useful and fun little project.
Nice, pretty much what I had in mind. I think there could be some interesting potential there tooling wise. Combining a highly dynamic interactive environment with a good statically typed language sounds fascinating to me and it's something that at least to my knowledge has never been seriously tried. Only Strongtalk comes to mind but I have no idea what it was like in practice, and I assume the type system was something closer to Java.
It is not easy to tell because lparallel's documentation website has rotted away just the same as cl-user.net. Does anyone remember this beautiful wiki of CL libraries?
Anyway, it looks like lparallel is nice and has some very useful concurrency primitives, but it doesn't have lightweight scheduling, unlike Go. So no cheap async work with many open sockets, cheap context switching, better cache utilisation, simple language constructs and mental model for async tasks. Besides, Go has an M:N scheduler. It has all these async benefits but in addition all the threading benefits. Such things can only be properly done by the implementation.
I didn't use Common Lisp for very long, but unbeknownst to me at the time, getting interested in (SB)CL was a bit of a turning point for me from being primarily interested in physics to being interested in programming and software development.
During my physics undergrad, I was pretty uninterested in programming and I was only interested in "pen-and-paper" physics. Numerical solutions weren't very intellectually interesting to me. I knew a bit of Matlab, Python, and Mathematica, and of those languages only Mathematica was remotely intriguing to me, but I had already run into some contexts where all of the above languages were just too slow to solve some important problems.
I spent the summer before my Masters degree started trying to decide on what language I should learn and master. I didn't want to have this annoying situation where I had to mix and match between slow expressive languages and fast, clunky languages.
I almost went for Fortran, but then I happened upon some old threads about Common Lisp, and people discussing some concepts I wasn't really familiar with like metaprogramming and I got quite intrigued. Metaprogramming was the first software topic that I found intellectually stimulating in its own right. Before that, programming was just a means to an end for me.
I spent a couple months reading old Common Lisp books and learning to use it, before I then stumbled upon Julia, and found that it had just about everything I was looking for -- an active scientific community, expressiveness, performance, interesting metaprogramming facilities.
At that point, I pretty much stopped all my common lisp usage in favour of Julia, and still heavily use Julia to this day in my job as a software developer, but I credit Common Lisp (and SBCL in particular) with being the thing that actually convinced me that there was something interesting about programming in its own right.
> too many focus on SBCL + Emacs and then complain about Lisp tooling.
well yea, lispworks and allegro are expensive commercial projects. I wish sbcl, the defacto best open source option had better tooling. emacs is great and all for the true believers but I'm an unwashed vscode user. For plenty of reasons I can't justify it in my startup but I'd love to spend more time working with common lisp for personal projects but my time is limited so I prefer clojure or rust.
LispWorks and Allegro are both interesting, but I've found their IDE offerings to be very limited. I haven't used either since I was playing around with CL during Covid, but from what I recall, even the basic IDE experience of writing code was severely lacking: poor autocomplete, poor syntax highlighting, clunky interfaces. In most discussions I see about them, they're only recommended for their compilers, not for their IDE offerings.
I think LispWorks is fine (also look at these plugins https://github.com/apr3vau/lw-plugins - terminal integration, code folding, side tree, markdown highlighting, Nerd Fonts, fuzzy-matching, enhanced directory mode, expand region, pair editing, SVG rendering…) but I had this feeling with the newer web-based Allegro IDE (the poor syntax highlighting surprised me, did I do sthg wrong?).
It was and remained an esoteric mystery to me ever since I saw Nichimen's work (with it); Pricing was just out of this world to even consider at the time.
I use emacs regularly (in fact I have it running right now) and I think the complaints against it are perfectly valid. Emacs is awesome in lots of ways, but it also really, really sucks in lots of other ways.
But putting emacs aside, the SBCL tooling seems reasonable to me. The real reason I rarely reach for lisp these days is not the tooling, but because the Common Lisp library ecosystem is a wasteland of partial implementations and abandoned code bases.
It's also been my experience that LLMs are better at writing more mainstream languages, especially "newbie-proof" languages like Go.
In any case, I don't see why one would reach for Allegro or Lispworks over SBCL unless one really enjoys writing lisp by hand and needs specific features they offer that SBCL doesn't. I would imagine those features are vanishingly few.
I'm a lispworks user for a few projects. The killers, generally, for an enterprise project are the smaller binaries and java interface. I know of a few places that write gui apps in lispworks, but many (most?) projects with a user interface use some web framework stuff and only do the backend in lisp.
The java is a killer feature for lisp adoption. A lot of companies use java heavily and being able to easily interface with that stuff is often a technical requirement and if not a technical one, a management requirement.
my impression is that most CL these days is existing large closed-source codebases, hence the price tag for those compilers (you're not trying it out for a bit, you're funding the compiler devs to work full-time on the issues you're actually having) and relatively little open-source activity for "finished" things -- if you're developing against internal libraries, it's hard to open-source just the part you intend to
(work at a CL shop; mostly SBCL users, but maybe 1/3 of people are die-hard ACL fans)
In addition to the official reference to CMU, there is a second origin for the name.
SBCL - Sanely Bootstrappable Common Lisp
You see, when SBCL was forked from CMU CL, a major effort was made so that it could be compiled using any reasonably complete Common Lisp implementation, unlike CMU CL. CMU CL essentially could only be compiled by itself, preferably in the same version, which meant compiling and especially cross-compiling was a complex process that involved bringing the internal state of the CMU CL process to the "new version".
SBCL heavily reworked the logic so that the core SBCL compiler parts can be hosted in any mostly-complete (does not have to be complete!) ANSI CL implementation, and then uses that to compile the complete form.
Meaning you can grab the SBCL source tarball, plus one of GNU clisp, ECL, Clozure CL, even GNU Common Lisp at one point, or any of the commercial implementations, including of course CMUCL, and a C compiler (for the thin runtime support) and build a complete and unproblematic SBCL release with a few commands.
>SBCL derives most of its code from CMU CL, created at Carnegie Mellon University. Radical changes have been made to some parts of the system (particularly bootstrapping) but many fundamentals (like the mapping of Lisp abstractions onto the underlying hardware, the basic architecture of the compiler, and much of the runtime support code) are only slightly changed. Enough changes have been made to the interface and architecture that calling the new system CMU Common Lisp would cause confusion - the world does not need multiple incompatible systems named CMU CL. But it's appropriate to acknowledge the descent from the CMU hackers (and post-CMU CMU CL hackers) who did most of the heavy lifting to make the system work. So the system is named Steel Bank after the industries where Andrew Carnegie and Andrew Mellon, respectively, made the big bucks.
Can we get a "(1999)" date on this, please? Only half joking because I see Common Lisp and, sure, I upvote ... but honestly, what's the purpose of this HN submission without context?
SBCL is obviously fantastic but let's contrast with another popular implementation: Embeddable Common Lisp. https://ecl.common-lisp.dev/
Top marks for SBCL performance but ECL can be a better fit for embedding into mobile applications, running on lighter weight hardware, and in the browser.
We upgraded to 2.6.1 about a week ago and switched to using the new(ish) parallel(ish) garbage collector. I still can't tell what the impact has been.
Claude Code (which is a wizard at analyzing log files but also, I fear, an incorrigible curve-fitter) insisted that it was a real breakthrough and an excellent choice! On the other hand there was a major slowdown last night, ending in SBCL dying from heap exhaustion. I haven't had a chance to dig into that yet.
I'm going to caveat this by stating up front that obviously HN's source code is not public so I don't know what your hot path looks like, and that I'm not a domain expert on garbage collection, but I do write a fair amount of lisp for SBCL.
Immix-style collectors, like the new GC in SBCL, only compact on an opportunistic basis and so you get fragmentation pressure under load. In that situation, you might be well under the dynamic space size cap but if it can't find a large enough contiguous chunk of free heap it will still die.
So, fragmentation would be my prime suspect given what you described.
SBCL doesn't know when it's running low on available heap space? clisp uses libsigsegv, so it knows when to garbage collect really, and when it's not so needed.
No problem. You might be better off moving back, yes.
My understanding of immix-style collection is that it divides the heap into blocks and lines. A block is only compacted/reused if every object in it is dead, and so if you mix lifetimes (i.e. lots of short-lived requests, medium-life sessions, long-life db connections/caches/interned symbols) then you tend to fill up blocks with a mix of short and long-lived objects as users log in and make requests.
When the requests get de-allocated the session remains (because the user closed the tab but didn't log out, for example, so the session is still valid) and so you end up with a bunch of blocks that are partially occupied by long-lived objects, and this is what drives fragmentation because live objects don't get moved/compacted/de-fragged very often. Eventually you fill up your entire heap with partially-allocated blocks and there is no single contiguous span of memory large enough to fit a new allocation and the allocator shits its pants.
So if that's what the HN backend looks like architecturally (mixed lifetimes), then you'd probably benefit from the old GC because when it collects, it copies all live objects into new memory and you get defragmentation "for free" as a byproduct. Obviously it's doing more writing so pauses can be more pronounced, but I feel like for a webapp that might be a good trade-off.
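Here's a toy model of that argument in Python. To be clear, this is not how either SBCL collector is implemented: the heap is just a flat list of slots (standing in for lines), and the object names and `HEAP_SLOTS` value are made up. It only illustrates why plenty of free space can coexist with no large contiguous span, and why copying the live objects fixes that.

```python
def largest_free_run(heap: list) -> int:
    """Length of the longest run of consecutive free (None) slots."""
    best = run = 0
    for slot in heap:
        run = run + 1 if slot is None else 0
        best = max(best, run)
    return best


def copy_compact(heap: list) -> list:
    """Copy live objects to the front of a fresh heap (copying-GC style)."""
    live = [slot for slot in heap if slot is not None]
    return live + [None] * (len(heap) - len(live))


HEAP_SLOTS = 16  # arbitrary size for the illustration

# Short-lived request data interleaved with long-lived session data.
heap = ["request" if i % 2 == 0 else "session" for i in range(HEAP_SLOTS)]

# The requests die, the sessions stay alive, and nothing is moved.
heap = [None if slot == "request" else slot for slot in heap]

free_slots = heap.count(None)                     # half the heap is free
run_before = largest_free_run(heap)               # but only in 1-slot holes
run_after = largest_free_run(copy_compact(heap))  # contiguous after copying
```

In this model half the heap is free after the requests die, yet the largest contiguous hole is a single slot until the live sessions are copied together.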
Alternatively you can allocate into dedicated arenas based on lifetime. That might be the best solution, at the expense of more engineering. Profiling and testing would tell you for sure.
A well known quantum computing company's entire stack runs on SBCL, with Emacs in production... works really well, don't knock it until you've tried it. Phenomenal REPL.
Eh, let tosh have his/her fun. There's not so many submissions that it would qualify as spam, and SBCL is cool! A fun reminder of less majoritarian approaches to SWE.
(Plus HN runs on it, so these threads often end up sparking some discussion of HN internals, which I think many of us enjoy.)
https://news.ycombinator.com/item?id=41679215
[1] https://paulgraham.com/arc.html
[2] https://racket-lang.org/