I believe Rust will benefit from the reality check that kernel development represents.
Kernel development is hard, and bullshit doesn't go very far in that context. Success for Rust in that environment (with some changes along the way) will be a proof of value.
You have a valid point, although I wouldn't frame it as adversarial so much as mutually beneficial. There will certainly be some bullshit eliminated from Rust, but I would not be surprised if a similar quantity is eliminated from the kernel. Even in the most scrutinized C codebase in the world, there are likely memory-safety bugs. If Rust can find and eliminate them, while also improving its own capabilities, we all benefit.
I think Rust is in a good position in the sense that the language and community have a track record and culture of going and solving problems instead of sitting on them forever. I agree that many challenges of kernel development are going to end up strengthening and evolving Rust as a language.
Solving async traits, having a proper concurrency runtime story, and reducing the reliance on third-party crates to ease error handling do seem to be taking forever.
While I agree that these are problems (and they are being addressed), none of them have to do with Rust in the kernel. The developers actually keep a wishlist of potential or unstable Rust/libstd/libcore/tooling features they'd consider helpful: https://github.com/Rust-for-Linux/linux/labels/prio%3A%20met...
Pretty sure async traits are coming soon (next couple of versions?) which is pretty speedy considering what I understand to be a semi-thorny problem.
> having a proper concurrency runtime story
Do you mean the language/stdlib shipping an async runtime?
> reducing the reliance on third party crates to ease error handling
I for one don't often rely on 3rd-party crates for error handling. Anyhow is the most common one I use, but mostly out of habit and laziness, not an actual hard requirement...
> Pretty sure async traits are coming soon (next couple of versions?)
No. There are ideas for how to do it, but as far as I know they haven't been tested out yet.
It also relies on pretty large features that are not proposed for stabilization yet (GATs and existential types). Once that is done and the implementation strategy is chosen, there will also need to be an RFC cycle, a phase of ironing out bugs, and finally stabilization.
It'll be ... quite ... a while... I can't see it happening this year.
GATs are apparently very close to stabilization; I think I've seen Rust 1.62 suggested as plausible. So that's a big step towards the likely design of async traits. But sure, async traits are not likely to land this year. Fortunately, although for some reason async traits are on pjmlp's must-have list, they're nowhere close to the critical path for Linux, which again is written today in C.
"I think Rust is in a good position in the sense that the language and community have a track record and culture of going and solving problems instead of sitting on them forever. I agree that many challenges of kernel development are going to end up strengthening and evolving Rust as a language."
So I picked some ongoing language examples where this isn't quite true.
Isn't it, though? I think Rust's processes have done pretty well here. Mara has written somewhere about how effective it is - unlike in a setup like WG21 - to just do stuff in Rust instead of waiting around for some imaginary higher power to grant your wishes: write the code and raise a PR to land your change.
For example, suppose you really want Mutex::unlock(). Right now that's behind a feature gate, but it's not controversial; if you feel this more explicit function call is helpful, you could put in the work to stabilize it and get the gate removed in, say, 1.61.
[[ Mutex::unlock(guard) is just equivalent to drop(guard), since dropping the guard unlocks the mutex. That's why you usually don't call either: your guard will go out of scope and get dropped automatically. The reason to want Mutex::unlock() is that it reads more naturally. ]]
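The equivalence is easy to see on stable Rust today (a small sketch; `increment` is an invented example, and `Mutex::unlock` itself is still feature-gated):

```rust
use std::sync::Mutex;

// Increment under the lock, releasing it explicitly by dropping the guard.
// drop(guard) is what the gated Mutex::unlock(guard) does under the hood.
fn increment(m: &Mutex<i32>) -> i32 {
    let mut guard = m.lock().unwrap();
    *guard += 1;
    let value = *guard;
    drop(guard); // explicit "unlock"; the mutex is free from this point on
    value
}

fn main() {
    let m = Mutex::new(0);
    assert_eq!(increment(&m), 1);
    // relocking succeeds because the guard was dropped inside increment()
    assert_eq!(*m.lock().unwrap(), 1);
}
```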
In C++, the pointer provenance problem has sat around for almost twenty years as an unresolved defect in, I believe, C++98.
In Rust, Aria was like "We should provide a new API to do provenance in a sound way" and she shipped it (admittedly as a nightly feature, not stable) before I'd finished all the prior reading.
Yes, if all error types from various libraries are `Send` + `Sync` it's easy, if not, you are running into issues where you need to wrap them in reference counted containers because some of them aren't copyable either. Crates like failure, thiserror and anyhow didn't simplify this (although I really like their general approach to wrapping low level errors in high level errors with a breadcrumb trail to details).
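Both situations can be sketched in a few std-only lines (a hedged illustration; `parse` and `share` are made-up helper names):

```rust
use std::error::Error;
use std::sync::Arc;

// When every underlying error is Send + Sync, boxing "just works":
fn parse(s: &str) -> Result<i32, Box<dyn Error + Send + Sync>> {
    Ok(s.trim().parse::<i32>()?) // ParseIntError is Send + Sync, so ? converts it
}

// A non-Clone error can still be shared across threads by wrapping it in Arc:
fn share(e: std::io::Error) -> Arc<std::io::Error> {
    Arc::new(e) // io::Error is not Clone; the Arc itself can be cloned instead
}

fn main() {
    assert_eq!(parse(" 42 ").unwrap(), 42);
    assert!(parse("nope").is_err());

    let e = share(std::io::Error::new(std::io::ErrorKind::NotFound, "gone"));
    let e2 = e.clone(); // cheap refcount bump, no Clone on the error needed
    assert_eq!(e2.kind(), std::io::ErrorKind::NotFound);
}
```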
I like your input, as critical as it is. Rust is developed by a community/foundation, in contrast to C# (Java+), where Anders Hejlsberg, a brilliant language designer/BDFL, can ponder and decide top-down after careful consideration. Consensus takes time.
> Solving async traits, having a proper concurrency runtime story
Something something let the Java guy cast the first stone.
> reducing the reliance on third-party crates to ease error handling does seem to be taking forever.
I've written a ton of Rust and I've never used any of these third party error convenience libraries. As far as I'm concerned there is no issue in need of solving around error handling in Rust.
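For what it's worth, the std-only version of a typed error is mostly mechanical boilerplate, which is what crates like thiserror generate for you. A minimal sketch (`ConfigError` and `lookup` are invented for illustration):

```rust
use std::fmt;

// A hand-rolled error type using only std: Debug + Display + Error.
#[derive(Debug)]
enum ConfigError {
    Missing(String),
    Invalid(String),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Missing(key) => write!(f, "missing key: {key}"),
            ConfigError::Invalid(key) => write!(f, "invalid value for key: {key}"),
        }
    }
}

impl std::error::Error for ConfigError {}

// A fallible lookup returning the typed error, usable with `?` as usual.
fn lookup(key: &str) -> Result<i32, ConfigError> {
    match key {
        "port" => Ok(8080),
        "name" => Err(ConfigError::Invalid("name".into())),
        _ => Err(ConfigError::Missing(key.into())),
    }
}

fn main() {
    assert_eq!(lookup("port").unwrap(), 8080);
    assert_eq!(lookup("absent").unwrap_err().to_string(), "missing key: absent");
}
```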
The "please don't offend this language I have intertwined with my own identity" attitude in modern times is so tragic.
I have always been, and always will be, plenty of things, and if it makes you feel good to call me a Java guy when someone criticizes your beloved Rust, please do.
Meh. You often reply prolifically in threads that mention Java, defend/praise it profusely, and also seem to be very knowledgeable about the language as well as the JVM. So it's not entirely an insult when I call you a "Java guy".
But you're over-reading into my defense of Rust. My point is that the specific things you criticized as taking a long time for Rust are not actually taking a long time. My point with poking Java, specifically, is that Java is going on 30 years old and doesn't yet have an analogous feature. Moreover, since we're talking about Kernel dev, do C or C++ have a standard async runtime? Considering that there's little-to-no prior art for doing the kind of async API Rust wants without active garbage collection, I'd assert that it's hard to criticize how long it's taking. Do you know how long it "should" take? If so, how?
The lack of context is a pain. For instance, trying to open a file that doesn't exist just gives the error "file not found" without telling you the path you tried to open.
Fair enough. But that doesn't have to do with the error handling mechanism of Rust, nor would any of these error helper libraries fix that. It's just a criticism of the error type that's actually returned. A Java-style exception can be written that's just as lacking. No language that I'm aware of will automatically include function arguments in its errors/exceptions- it's up to the author of the error/exception type to include that info.
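For example, restoring the missing path is a one-line `map_err` at the call site (a sketch; `open_with_context` is a made-up helper):

```rust
use std::fs::File;
use std::path::Path;

// std's io::Error carries no path; attaching it is up to the caller.
// A thin wrapper restores the missing context in the error message.
fn open_with_context(path: &Path) -> Result<File, String> {
    File::open(path).map_err(|e| format!("{}: {e}", path.display()))
}

fn main() {
    let err = open_with_context(Path::new("/no/such/file")).unwrap_err();
    // the message now names the offending path
    assert!(err.contains("/no/such/file"));
    println!("{err}");
}
```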
It's ironic that the framing of some parts of the article is that the kernel mindset is arrogance and self-assuredness, and that somehow isn't applied to the Rust developers approaching an ecosystem they aren't familiar with. It reminds me of tourists who visit another country and try to get the locals to do things the way they are familiar with, as if something is wrong with the locals and their way of life. I generally agree with gkh's response here. It avoids the arrogance in the other direction ("why can't these kids roll their own leftpad") and presents a more valid concern, that of the kernel's actual constraints.
The person who is leading this project and making it happen, Miguel Ojeda, is a long-term kernel developer. Whatever your experience might be of other RIIR evangelism, this project initiative is coming from an insider rather than an outsider.
I didn't critique the idea of Rust in Linux; I am critiquing the attitude of the Rust fellow asking for support for importing generic crates into kernel code and thus adding dependencies (that is, the point of the article).
The article is by Jonathan Corbet, who has been a kernel insider for a very very long time, and is generally a strong proponent of the way Linux does things as a project.
Your example is quite funny and also represents how I would approach the matter of becoming a Linux kernel developer.
Even if C is not my primary language of choice, I would definitely try to adapt myself to the ecosystem and not the other way around.
You have all the knowledge of other peers, manuals, books, all the libraries, the whole ecosystem... this can't be replaced.
Also, there's something else about C nowadays: it's the lingua franca, the Latin (or English) of programming languages. We use it to expose APIs that any other language can consume as a library.
There's something about culture that people often forget in tech. It's the real backbone of any project that stands on its own feet, and when you want to enter a community you will be better off if you learn and adapt yourself to that community's culture instead of creating cultural clashes and trying to take it over (hostile or not).
People should be aware that this effort will make it possible to create Rust-based kernel drivers, and that's it. The RIIR folks are delusional and hype-fueled, and it's better if the sane part of the Rust community distances itself from them or brings them back to reality, as I bet they are not willing to spend 10 or 15 years of their lives rewriting a big, complex piece of software for likely no return, since people will tend to keep using the software that has the larger and stronger community.
It's a much better approach for Rust, or any other programming language, to become a research darling and eventually the primary ecosystem of a research OS that pans out and ends up being the thing that replaces Linux. The language alone won't do it; the system must be a contender to UNIX and POSIX, and whatever language such a system is written in will probably become the dominant one in that ecosystem.
Another good approach is to virtualize the Linux API, like gvisor does in userspace or Fuchsia (and even FreeBSD) does in kernel space. That way you can create your OS and kernel in the best way you can, looking ahead, and have a Linux compat layer where applications don't even need to be aware they are not actually running on Linux.
Yes, I'm sure there's an imaginary opponent downvoting my comment.
Also, there's a lot in my comment, and people are not even noticing it in the whole context.
If there's no reason, why do people get so upset? Just move on if you are not being mentioned, as will clearly be the case if I'm talking about "imaginary opponents"...
> The one proposing it is a long-time kernel contributor.
I'm not saying anything about the people working on it specifically. If you read my comment, there's a clear separation between serious people and the hype crowd (which is not just RIIR, but now also cryptocurrency fellows, etc.). I can't say where the people working on this fit; I don't know them. I don't know from which part of my comment you drew that conclusion.
There are quiet, clever, serious people doing the work, like Hoare, Matsakis, and the people who are real engineers; I have all the respect for them (and I'm pretty sure Rust has tons of such people). To be fair, all languages have all kinds of people, but I don't know why some languages tend to attract a certain type of person more than others, like the feeling I got from the Haskell community more often than others (though that community was much smaller).
For instance, talking about culture: C succeeded exactly because it was a pragmatic language, very simple and efficient, made by founders who wanted to get things done. With this culture, things got done around the language, and we have the ecosystem we have today. It's a great hacker spirit of more humble, hard-working, behind-the-cameras people, which I sincerely miss in the days of Instagram, TikTok, and tech celebrity culture.
Anyone who believes that Rust will quickly replace C in the kernel clearly knows very little about Rust or the kernel, and definitely should not be taken as a spokesperson for either.
I suspect that this "RIIR" that you seem to believe is some kind of "movement" is just a random assortment of clueless people posting in random places.
A ruling clique spreads quasi-religious beliefs that boil down to "we're the future!" to prospective followers. The ones who join up are inspired and push ever more intense (yet more honest) versions of those beliefs onto others: "we must get rid of the old thing". In one or two cycles of evolution, their evangelism provokes hostility from the unconverted. Unwilling to confront their own motivations, the rulers ignore how the spirit of their beliefs implicitly sanctioned their adherents' misbehavior: "technically, no one said they want to destroy the old thing".
A simple pattern that lets rulers turn a blind eye to the connection between belief and action.
Ugh yeah, now that you point it out, there was one. To be sure, my internal reaction to that comment was "oh, a troll or a loonie", but maybe they were serious, which is kinda worrying.
>Kernel development is hard, and bullshit doesn't go very far in that context.
I don't know what it is about Linux that makes people say this. Kernel development has most of the same constraints as any other embedded context, which Rust has plenty of focus on. No it's not as mature as C, but few languages are.
Plus, if you go looking in the kernel you can still find plenty of bullshit hacky code. It's not special; it's just another open source project. The quality very much depends on the individual maintainer of that subsystem and how much has been invested in that area.
I work on the kernel for a living, and I find this claim exceedingly dubious. We're currently talking about experimentally supporting modules written in Rust, which is an entirely different beast than replacing pieces of the kernel core. The barrier to entry for drivers is significantly lower, and driver quality can be much, much poorer than the quality of the core kernel.
Many parts of the kernel have been fine tuned for decades, and many of the kernel developers that maintain Linux are also C experts (myself included) who aren't going to slow down development to migrate working code to Rust. It's great that we can experiment and see how Rust goes for driver authors, but they are still bus API consumers, not core kernel.
As I understand it, the crucial rationale for drivers is that drivers were always necessarily platform-dependent anyway, which undoes one argument against Rust.
Today Rust does not overlap Linux in terms of platform support. There are (small but very much alive) communities doing Linux on architectures that Rust has no support for and in some cases has no plans ever to support. So this makes drivers the only case where choosing Rust doesn't mean some people lose out, as a platform e.g. with no PCI bus doesn't get to run PCI drivers even if they were written in C.
I expect that over the next say, five to ten years, two things will happen to greatly improve this, maybe to the point where you absolutely could rewrite core Linux code in Rust if you wanted to. Firstly, Rust will get more platform support. Linux doesn't really need Rust's "Tier 1" (Linus doesn't check every kernel release passes tests on all real Linux target hardware as I understand it) but clearly you want Rust to at least build and take patches for every Linux platform some day. Secondly, some older platforms will "rust out". If your community is nursing 30+ year old hardware and increasingly more maintenance work is shared between fewer shoulders at some point "Linux-next" is not a priority and your platform will stop being supported while effort moves to exciting new hardware.
There's active work being done on the rust gcc backend and it's progressing nicely. That should help with some of the platform concerns you (rightly) raised.
> Today Rust does not overlap Linux in terms of platform support.
Nit, I believe actually the Rust and Linux platform sets would be considered "overlapping sets" in the mathematical sense :), since neither is a subset of the other.
e.g. Rust platforms include things like the NetBSD Rump Kernel and Redox and I think one would be hard-pressed to claim that Linux supports those as platforms.
I'm interested in kernel development and I like the idea of working on it for a living.
Can you give more details about your job? What does it consist of? Is it mostly code review, or are you responsible for maintaining a part of the kernel?
Who is the entity that pays you, and what are the criteria they'd use to pay a new contributor to work on kernel full-time?
Finally, can you point me to beginner-friendly things to work on to get started? How do I know which part of the kernel I should study and contribute to?
Firefox is getting new "oxidized" components all the time; Rust is the recommended language for both refactoring and new development. Of course the lowest-hanging fruit is addressed first, but that's normal and advisable.
There's a big difference between "We don't employ the language's architects" (so far as I'm aware Mozilla also doesn't employ many WG21 members) and "We don't have any engineers who know this language". In 2022 you'd probably have to go out of your way to hire that many engineers and not get some people who know Rust.
Not to mention how much less experience you need in Rust to not blow everybody's feet off by mistake. I reckon if you have 10 years C++ and six months Rust, any Rust you write is already more likely to deliver reasonable performance without setting everything on fire than your C++. Because of the constant exposure to outright malevolent stinking garbage (in the form of other people's HTML, CSS and Javascript) the browser needs to be exceptionally robust, and C++ just isn't very good for that. So Rust is often a better fit for what Mozilla do.
Yet Chrome and Safari, the browsers that really matter in 2022, won't be moving away from C++.
Chrome folks have been playing with Rust, but seem more keen on improving their C++ static analysis tooling instead.
As for Mozilla, Firefox is about 10% Rust code, and let's see how long Firefox still matters, given its existing 3% market share; even EdgeChrome has surpassed it.
> Chrome folks have been playing with Rust, but seem more keen on improving their C++ static analysis tooling instead.
I would say that at this point that's throwing good money after bad. Linus of course also put a bunch of effort into static analysis; that's what "sparse" is.
The thing you run into immediately is that your programming language doesn't express the thing you wanted to analyse very well. So you have to annotate your software (Linux is sprinkled with sparse annotations), and now you've added an extra opportunity for mistakes, because the annotations are transparent to the compiler, so you can write code which analyses as correct but compiles to something incorrect. "Hooray".
Please don't spout nonsense. They did not let go of "everyone related to Rust", large parts of Firefox are written in Rust and those parts obviously need to be maintained. New Rust code continues to be written, as others have pointed out.
What Mozilla did do was lay off many of the people working on Rust itself as their full-time job, as opposed to people who use Rust to do their work at Mozilla. And the Servo team, unfortunately.
As you can see from the breakdown someone posted [0], almost two-thirds of the Firefox codebase is HTML / JavaScript / Python (for tests) / assembly / Java (for Android), none of which is a candidate for being rewritten in Rust to begin with.
If you just look at the portion written in native system languages, Rust is slightly more than 1/3 of that code already and still climbing.
I have a very strong doubt about it because Rust debugging still sucks. No debugger allows you to evaluate function calls AFAIK, which is a very strong restriction.
I see reports of it working for trivial top-level functions with basic parameter types, but what about everything else? Like member functions, trait implementations, etc.
He means in the context of a kernel developer. I'm sure there is some nerd who would rewrite stuff to prove themselves and shut up their inner impostor syndrome, but mostly it's quite accurate.
ffs don't do it. No, it's not enough. You will always feel that feeling because you are born into this world and it absolutely makes no fucking sense, so you want to at least feel competent in one thing. But you won't ever feel competent in life, because this whole experience of existing is fucking ridiculous. Bla bla computer, bla bla ping pong. You are an ape and we are spinning around. Impostor in the world, not in knowing a "job" skill. Haha, your comment is so deep I wanna give you a hug and buy you a beer.
I agree that the parent claim ("Rust will quickly replace C in the Linux kernel") was utterly risible, but your comments just seem like the mirror image of theirs. What on earth is an 'impostor language'? Imposture of what or whom?
I'm tired of having to deal with this culture war crap in our profession. These languages are tools. C, C++, and Rust all compile down to the same LLVM IR (or GCC's, if you stray from rustc). There are certainly semantic and grammatical peculiarities that affect how each of them does so[0], but by and large, running a simple Rust program and a simple C program through Godbolt will do a lot to disabuse you of the idea that the two are irreconcilably different.
To anyone else who wants to write performant Rust, my advice: (1) no_std, if only to focus the mind, (2) .try_foo(), not .foo(), b/c allocation is fallible, (3) always set `opt-level` to at least 1 (1 is far further from 0 than 3 is from 1, ime), (4) use stack-allocated alternatives to heap-allocated types ('smallvec' or equiv vs Vec, 'smolstr' vs String, &c) even at the expense of overallocating buffer space, (5) exploit vectorisation where possible (e.g. SIMD), in general practising mechanical sympathy, and (6) parallelism is not a panacea, whereas cache locality usually is. Measure everything, but also: memorise every instruction and how many cycles it takes, and think in those terms - in terms of your assembled, perhaps hand-modified code - rather than unscientific laptop benchmarks. (Jeff Dean's famous 'numbers every programmer should know' are a good start but are just the very basics, and obv his exact values are long obsolete, in some areas [disk] more than others [CPU].)
[0] These are discussed extremely soberly and intelligently here, for you or anyone else who may be interested: https://kornel.ski/rust-c-speed
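As a concrete illustration of point (2) above, `Vec::try_reserve` (stable since Rust 1.57) surfaces allocation failure as a `Result` instead of aborting the process (a sketch; `grow` is an invented helper):

```rust
// Grow a buffer fallibly: try_reserve reports allocation failure
// (including capacity overflow) as an Err instead of aborting on OOM.
fn grow(buf: &mut Vec<u8>, extra: usize) -> Result<(), std::collections::TryReserveError> {
    buf.try_reserve(extra)?; // fails gracefully rather than crashing
    buf.extend(std::iter::repeat(0u8).take(extra));
    Ok(())
}

fn main() {
    let mut buf = Vec::new();
    assert!(grow(&mut buf, 1024).is_ok());
    assert_eq!(buf.len(), 1024);
    // an absurd request fails cleanly with CapacityOverflow
    assert!(grow(&mut buf, usize::MAX).is_err());
}
```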
It's an impostor because it doesn't belong in the kernel. Or anywhere else for that matter - look at Firefox sources. It just causes fragmentation and its security is yet to be proven. From my POV, it's just written by people too lazy for C and too ignorant for C++.
Bah, to me people who write C and C++ whine too much and should be force-sterilised so that us real humans can have peace and quiet and write our machine code and bring our albatrosses to work. Like, you people who use qwerty? They should just stop smoking pot and learn to use Morse code, and why aren't phone numbers in hex? It's like the world is full of ignorant lazy people. I mean, it's just my POV. I don't mean I want to promote forced sterilisation of C++ and C programmers for being too underdeveloped to function and have a home. I'm just talking from my POV, you know? Anyway, good comment.
The issue with kernel development is that it's a dying profession. Nobody is interested in it. That's why there's an attempt with a sexy new language, to bring more developers to work on it.
Probably fewer than 30 people in the world understand the thing as a holistic entity. In evolutionary terms, there's a giant risk of losing the technical knowledge needed to keep it up in the future.
I do not support adding it to the Kernel, I think we should just throw away the kernel entirely, but I understand why they're looking at Rust.
> The issue with Kernel development is that it's a dying profession. Nobody is interested in it.
That's a pretty wild assertion and flies in the face of the highly active kernel development process.
e.g.
> Linus has released 5.18-rc1 and closed the merge window for the 5.18 release... 13,207 non-merge changesets were merged during this merge window.
> I think we should just throw away the kernel entirely
That's the dumbest thing I've seen on HN in some months.
The kernel is deployed in hundreds of millions of devices worldwide and continues to be the dominant OS in many many sectors.
The issue is the number of new developers joining, if you look at Linux kernel development as an organization, not the number of non-merge changesets merged.
> That's a pretty wild assertion and flies in the face of the highly active kernel development process
No, OP is right. 99.9% of things added to the kernel in any given release are drivers (mostly for obscure server hardware that the average LKML lurker would never have access to). So while there's certainly active development, it's all centered around drivers, and the "kernel" part of the kernel is mostly stable (modulo the occasional greenfield project like eBPF). (Note that I said 99.9% of things added; I'm sure security patches make up a lot more than 0.01% of kernel patches.)
> The kernel is deployed in hundreds of millions of devices...
There are billions of Android phones alone. Not to mention the huge numbers of servers, IoT devices, embedded computers, and all the SBCs in my closet.
To be charitable, they may have meant something more like 'personal devices which people use directly'. (Though maybe not - it would be odd to exclude servers, which are one of Linux's biggest 'clients'. Perhaps they meant to exclude the more mundane types: routers, lightbulbs, etc.)
That Linux kernel is full of Google specifics, has used a microkernel-like architecture since Project Treble, has been compilable with clang for years, and now uses Rust for Bluetooth drivers...
"The timeline for shifting to an "upstream first" cycle for new features starts in 2023, with 2020-2022 dedicated to making it work for pre-existing functionality. The Pixel 6 is expected to be the first Android device to ship with the GKI and Linux kernel 5.10, marking a major step in this process."
That means this is still a year away, if it actually happens, and then only Pixel devices will support it anyway, as so far most OEMs couldn't care less.
Not to mention that upstream will never accept all the things that make Android Linux not really Linux.
Most embedded/embedded-like devices running Linux are running a modified version of the kernel with a heavily customized userspace. These are still Linux by any reasonable definition. I'll leave it at this.
The kernel isn't dying, it's niche. It always has been and it always will be.
Fortunately for the kernel, despite being niche, it has a rock-solid onramp forcing people to get into it. There will always be companies interested in the n'th degree of performance, both generally, and for some specific hardware. Someone has to go do the relevant kernel work for those things. So while you or I may never touch it, it is effectively impossible for a kernel like Linux to just rot away because nobody cares. It would first require a multi-year, if not multi-decade, process of fading.
It's on most of the smartphones on the planet. Plus it is the dominant server OS. Plus in the top 3 in many embedded categories (a highly diverse set of technologies)
Even if Fuchsia is crapware, Google has the money to keep polishing that ball of mud, and eventually force it as the new version of Android, quality be damned. The rest of the world will have no recourse but to switch. It will be painful for quite a while, but people will survive. I mean, people survive using Windows, and not enough people switch away from that, either.
I’m not saying that Fuchsia is necessarily bad, I’m saying that Google will do anything to get away from GPL code, including, if necessary, forcing Android to Fuchsia. It doesn’t actually matter if Fuchsia is any good.
From what I can see, the Fuchsia kernel is actually quite interesting. I like the foci on (1) capabilities and (2) message passing. It's not the most innovative thing in the known universe - in fact both of those concepts are of pretty late-80s-to-early-90s vintage, from the OOP boom when programmers were misspending their ill-gotten performance gains[0] – but they make a degree of sense. The userspace bits I'm less sure about. Like you say, it seems to be a non-GPL-ed clone of Linux. It's the kind of thing I'd expect of some cheap Chinese company. This kind of fragmentation is emphatically not a good thing for our industry and Google knows it, and I very much hope they don't get away with it, but I suspect its being a clone is exactly why it'll be a very easy transition to force on end-users. Programmers will never in a million years use it on the server side, though.
The biggest obstacle might be drivers. A “server” is defined more loosely than “an Android phone”. An Android phone manufacturer has the incentive to make sure that drivers for that hardware exist in Android, regardless if Android is Linux-based or Fuchsia-based. And Google can make Android switch to Fuchsia, and therefore can control where that incentive leads. Google, however, does not control what runs on servers, and server hardware manufacturers know that if they don’t have a driver in Linux, they won’t sell very much of their hardware, since existing hardware run Linux-based systems.
I meant to say "unikernel." Don't know where that brain fart came from!
> Google, however, does not control what runs on servers, and server hardware manufacturers know that if they don’t have a driver in Linux, they won’t sell very much of their hardware, since existing hardware run Linux-based systems.
I'm not sure if "it's the way things are" is quite the argument it's made out to be. Some server hardware is beginning to look more and more like a phone/Chromebook. And things do change: "not in a million years" was the argument made against Linux compared to the traditional enterprise UNIX vendors, and where are they now?
If Linux developers are actually afraid of that, then they should simply switch the license away from the GPL, or dual license it, or do anything else than what they're currently doing.
You might not know, but it's commonly held to be practically impossible to switch the license of Linux, since its copyright is not owned by a single entity; it is owned in myriads of small portions depending on who wrote each piece. Some of those people have since passed away, their copyrights now being held by their descendants.
Yes, I know; I've heard that a lot, and it's a weak and harmful comment that kernel people should absolutely not be making. It actually pains me to see it typed again. You don't actually want a software project to be like that: it makes it very difficult to enforce the license when it actually needs to be enforced, because you can't get consensus from all the copyright holders. It also increases the risk that some copyright holder (like one of those descendants) goes rogue and you have another Patrick McHardy situation.
If there really was enough reason and will to do it, they would just track down those people, or they would remove that code and replace it with something else. Just like they've done every time in the past when there were copyright problems. Just like any other big open source project has done when licensing became a problem.
It might or might not be a good idea in general to not have a centralized copyright holder entity, but at least it cuts down on the humongous license flame wars, which, incidentally, is what you would get if you actually wanted to go through with an endeavor such as you describe.
Anyway, what Linux developers might be afraid of cannot be mitigated by simply switching to an MIT license. What would happen is basically some modern variant of this:
> Probably less than 30 people in the world understand the thing as a holistic entity
As we only have two other "big" (popular) kernels to compare this one to, do you think they (Apple and Microsoft) have more people "holistically" understanding the entire kernel, or fewer? Since those companies are closed-source by nature, I'm fairly certain even fewer people understand those kernels "holistically".
> I do not support adding it to the Kernel, I think we should just throw away the kernel entirely, but I understand why they're looking at Rust.
How would that work in reality? Re-use the existing tests to build a new kernel from scratch? Sounds like a very far-out idea that wouldn't help with any of the current problems, but I'm happy to entertain the idea and hear your reasoning here.
> How would that work in reality? Re-use the existing tests to build a new kernel from scratch? Sounds like a very far-out idea that wouldn't help with any of the current problems, but I'm happy to entertain the idea and hear your reasoning here.
While I would tend to agree that a full production replacement would be such a massive undertaking as to be impractical, https://github.com/nuta/kerla does something very like that - Linux userspace ABI on an all-new Rust kernel. (And even at this small scale, I find it mind-blowing that this worked)
AIUI, you can still build a minimal kernel that's easily understood as a whole. And patches to shrink the minimal build even further are highly sought after because they expand the usability of Linux in deeply embedded environments.
Oh no. We have some early career devs who put up patches to the kernel recently. They were super excited about getting to do that work, and it was a big day for them - as it should be; it's awesome.
I guess I need to go let them know they are Nobodies. oxff said so.
There may not be many “big and professional” operating system projects but, at the hobby level, it seems that there is a lot of interest in kernel dev actually.
HaikuOS, SerenityOS, Redox, and ReactOS are all going strong. The BSDs continue to advance as well. Redox is written in Rust even.
I believe Google sees Fuchsia as a true Linux competitor.
When you say “we should just throw away the kernel entirely”, what are you suggesting?
> The issue with Kernel development is that it's a dying profession. Nobody is interested in it.
That's simply false. I'd love to work on the kernel. I have immense respect for the people who work on it. Only reason I haven't tried contributing code is I don't think I'm skilled enough.
> 30 people in the world understand the thing as a holistic entity
Probably a lot more than 30. There are probably more than 30 PhD students studying kernels at this very minute. However, I think you're right that no one is interested. People greatly underestimate the effect the kernel has on the full OS. They think it's "just" the kernel. This is why so many people keep saying that Windows will get a Linux kernel in Windows 10, no, 11, no, 12, etc. It's just a "kernel", like a grain of corn, something insignificant. So there isn't much interest in working on kernels. It's as unsexy as it gets.
I can sympathize with the need to have all required source code in the repository and not having to fetch a bunch of dependencies at build time. Thankfully, cargo already offers a solution here: `cargo vendor` will download all the specified dependencies once into a local directory, which can then be checked into the source tree.
This maintains cargo's dependency resolution/update checking/etc, but also allows for all dependency code to be kept alongside the kernel code and audited accordingly.
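For anyone who hasn't used it, a rough sketch of the vendoring workflow (directory names are the defaults; the config snippet below is what `cargo vendor` itself prints for you to copy):

```shell
# Download every dependency from Cargo.toml/Cargo.lock into ./vendor
cargo vendor

# cargo vendor prints the source-replacement config to add to
# .cargo/config.toml; it looks roughly like:
#
#   [source.crates-io]
#   replace-with = "vendored-sources"
#
#   [source.vendored-sources]
#   directory = "vendor"

# From then on, builds resolve dependencies from ./vendor only.
# --offline makes cargo fail loudly if it would ever touch the network:
cargo build --offline
```

The `vendor/` directory can then be checked into the tree and reviewed like any other code.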
I don't think it's even remotely likely the kernel will end up supporting Cargo in any form.
As said in the article, "the world is changing". However, the way the world is changing is that more and more people are aware of supply-chain attacks. While I acknowledge that Cargo isn't npm, it is still the case that of all the software that can not afford to just sorta grab things from the internet, the Linux kernel is arguably #1, straight up. There is no chance that the kernel developers are ever going to accept "well, I wanted some async stuff so I grabbed a v0.5.23 of a package that I found appealing". Even if they pull some stuff in, it will be through a review process and it won't be through cargo in general.
The argument against cargo or any equivalent being used in the kernel is stronger today than it was 10 years ago, and the first derivative is also positive. Probably the second one too, honestly. This isn't about kernel developers being old fogey sticks in the mud, this is about the kernel being such a high-assurance environment, the biggest, fattest target in the world, that things that make sense for most software packages don't make sense for it. And it has nothing to do with cargo specifically or Rust. It's just that the entire workflow afforded by cargo, or npm, or go modules, or the half-dozen Python package managers, or anything else resembling those things is simply not appropriate for the Linux kernel. The only way to make such a thing work would be to pin the versions so hard that you're effectively only using them as a downloader, not a package manager, and you might as well just have vendored code in the kernel repo anyhow.
> There is no chance that the kernel developers are ever going to accept "well, I wanted some async stuff so I grabbed a v0.5.23 of a package that I found appealing". Even if they pull some stuff in, it will be through a review process and it won't be through cargo in general.
> The only way to make such a thing work would be to pin the versions so hard that you're effectively only using them as a downloader, not a package manager, and you might as well just have vendored code in the kernel repo anyhow.
This is the exact thing the person you replied to said should be done. `cargo vendor` fetches the sources for the package @ the version stated and embeds them in the repo. After that all deps would be sourced from the repository itself.
Nothing in the review process would have to change beyond adding a dependencies directory to the repo with a dedicated set of CODEOWNERS to handle reviewing patches with new dependencies.
Just to be clear, because I see this misconception frequently, using Cargo does not require using crates.io. You can set up your own private registry and configure Cargo to use it, if you like. You can even use Cargo entirely offline.
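To illustrate, the two main knobs here are source replacement in `.cargo/config.toml` and the offline flag (the mirror URL below is hypothetical):

```shell
# 1. Point the crates-io source at your own mirror/registry
#    via .cargo/config.toml:
#
#   [source.crates-io]
#   replace-with = "my-mirror"
#
#   [source.my-mirror]
#   registry = "https://git.example.internal/crates-index"  # hypothetical

# 2. Or forbid network access for a build outright:
cargo build --offline
```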
Will there be a massive review process every time someone updates the vendored dependencies? Or will each dependency change be reviewed one release increment at a time?
What happens if a dependency adds a system call or something? Libraries intended for kernel-friendly use cases really need that scope to be an intentional goal.
Of course, they would certainly review any updates to vendored code. Just because most companies are too lazy to audit their dependencies doesn't mean the kernel needs to be.
As for syscalls, Rust has a whole ecosystem of no_std crates for kernel and microcontroller development that already assume the lack of an OS. We use (and contribute to) such crates extensively in our product (which is already developed jointly with the Linux Foundation, though we're not working on the Linux kernel (well we are a little bit, but all of that work is still in C :P )) and can vouch for their quality.
Ah, this makes total sense. The other day I read how some factories are dumping pollutant chemicals in rivers. But it turned out to be all fine, since there was no wrong intention; it just was not economically viable for them to dispose of the waste properly.
Congratulations, you found the solution to the problem. Hold companies liable for the damages their sloppy security practices create. Bruce Schneier wrote an excellent article on the topic in 2003 [1].
This is more a coordination/incentive/game theory problem than a cost one. If all the companies that use open source libraries contributed resources to pooled audits then individually they'd have to pay far less than each reviewing dependencies on their own.
Maybe that could be incentivized by penalizing data breaches caused by negligence and treating use of non-audited code as negligence. But I suspect this would just result in people running some random static analysis tool and calling it a day rather than doing proper code reviews.
Penalizing does not work, period. Religion tried with scare tactics for thousands of years. Sharing costs for auditing open source libraries depended on would need a platform to share the load, something like a Patreon for businesses.
> What happens if a dependency adds a system call or something?
AFAIK, in Haskell, any function that performs IO must indicate it in its type signature (its result type lives in the IO monad).
If you could find a way to take that idea, expand upon it with much more fine-grained types, then you could build an ecosystem where any system call, external network call, use of env variables, etc is baked into the type signature of every function and package. If you could do that, you could build pretty trivial checks to ensure that a given package doesn't perform any kind of system call.
So sort of like the types of permissions we assign to apps on Android, but represented (and enforced) directly in the type system.
It would be difficult to make something like this ergonomic but could be pretty cool.
This works in Haskell because it is a purely functional language, and the IO monad is an intrusion into that to allow for useful computation. Rust is not purely functional, so there is no way to enforce something like this in the compiler. You could add an IO monad, but it would be easy for someone further down the chain of packages to ignore it and make a syscall.
Anything similar in rust would have to be enforced through code auditing tools, either forking the compiler, using some of its code as a basis or starting from scratch.
Yes, it is a shame that no version of that purely functional Haskell ideal has been created that could reasonably be used for kernel development. Using the type system to constrain side effects of code in the way you suggest would eliminate massive classes of security vulnerabilities and crashes.
IO is necessary because of purity, but it's in no way enabled by it. You don't need purity to have the IO monad, and you don't need purity to enforce that IO is defined in a type signature.
I feel like effect systems like the ones in Unison and Koka (or Haskell with extensible-effects/freer libraries like polysemy or eff) get pretty close to that. There's a lot of effort needed to make the idea "mainstream" and to get a good implementation going, though. (I think a lot of implementations use delimited continuations, but I'm not sure, so don't quote me on that. At least that's the design for Haskell's new primops to make that kind of library better.)
If you're going to pick a "counterexample" just say unsafePerformIO, or the aptly named accursedUnutterablePerformIO[0]. (There are quite a few variants.)
Everybody knows that there are escape hatches because they are actually sometimes absolutely required for asymptotics or just plainly because it may be too hard to "prove" that your code is safe to the compiler.
That isn't a gotcha -- you know exactly what code you need to audit extra carefully.
[0] That one's actually in ByteString, so technically not standard Haskell, but I just like the name.
I'm suspicious of someone who says they need to download and run /complex math/ libraries inside of a kernel module. I seriously wonder what they're planning on developing, and if a hybrid approach with most of that code being in user space wouldn't be a better idea here.
I wonder if this is a real and considered need, or a knee-jerk response to try to put everything in rust and directly in ring 0 just for the sake of it. I see no reason to try to break down the "divide between kernel and user space" in this way, and I wonder what's actually driving it here.
I wish this article went a little deeper into _what specifically_ these Rust users are trying to make modules _for_.
So now a kernel cannot depend on anything? What about all the tooling that’s required like python, perl and many others or the fact that you have to have a working machine with the kernel already on it to build the kernel?
Obviously you have to bootstrap from something. A more interesting case is NetBSD, which can bootstrap from just about anything; take a NetBSD src checkout, run ./build.sh, and it will first use the local tooling to build its own dependencies, then use those to build the system. So in some OSs, yes, you can more or less vendor in the universe. But that only works because *BSDs are developed as full systems, not just a kernel; Linux probably can't (and arguably shouldn't) do that.
Rust is alright as a language, but cargo is extremely scary. I hope a culture of coding in rust outside of the cargo "ecosystem" develops. The current situation is alienating to reproducibility-oriented developers.
If you particularly want to replicate the common C situation where your dependencies are in-tree or you just don’t have external dependencies and write everything yourself then you are totally welcome to do that, even with Cargo. Nobody is forcing you to use external dependencies. If your problem is other people writing software with external dependencies, then I guess you have different priorities to them.
My problem is dependencies taking on dependencies taking on hundreds of dependencies, making it more difficult to find a self-contained dependency which solves a problem, compared to C++ (outside of Boost or ffmpeg and such).
If you preserved an immutable tag of the source code and all its dependencies, a copy of the compiler version used and all build flags, then you’ve still got some big holes in your ability to reproduce a binary:
1. OS version & patches installed
2. OS configuration
3. Hardware used (processors can have weird subtle bugs, microcode can affect execution behaviour, etc etc)
4. Transient issues - the golden copy to be reproduced for some post event investigation might have contained a bit flip leading to impossible to reproduce verification signatures
Etc etc
Or is reproducibility just a spectrum, and you try to get further along it with some careful attention to detail, rather than an absolute to be achieved?
If the latter, aren't you better off just archiving binaries and tagging them with the source + deps + compiler + arch used to build them? That's a 5-minute job to set up in your CI process and costs comparatively little to maintain vs. wasting expensive human brains chasing down a futile goal.
It's normally taken as binary + transitive dependencies are bitwise identical on multiple builds. Needs a copy of a deterministic compiler along with your source. Whether the build runs the same on different machines is orthogonal to consistently rebuilding it.
This would definitely be done by "vendor"-ing all the dependency sources into the tree, no? Unless that's not the proposal (which I can't imagine), I don't see how "web-based" is relevant.
Reproducibility is absolutely achievable. It is difficult with languages that have C system dependencies like C and Rust though. Usually you have to effectively check the compiler and system headers into your repo, which isn't ideal.
For something like Go it's trivial though. If you're using pure Go code then reproducible builds are pretty much as simple as "compile using Go 1.xx".
I don't understand why you think 3. and 4. are issues. Bit flips are very unlikely at compilation scale, and the hardware you use to compile something shouldn't affect the output. In either case you can just compile it twice and on different hardware and compare the result to confirm.
OP seems to be conflating the concepts of cargo (the rust build system with dependency management functionality) and crates.io (the main and default package repository today).
It is totally possible to use cargo without ever touching crates.io, but if you need any dependencies you'll of course have to provide them through some other means (local file system, git repositories, or a custom package repository).
I could totally see the Linux Kernel setting up their own alternative Rust package mirror that cargo uses: you get the ease of use of cargo while also getting the desired level of curation. It could either be treated as owned packages, only projects written by kernel contributors, or it could be the linux distro packager model, where blessed crates at a point in time are curated, but not necessarily audited beyond some smoke tests.
Cargo defaults to downloading stuff from crates.io. I certainly wish cargo was just a build system which doesn't care where you get your packages from, but that's not what it is.
That... is caring where I get packages from. Something like make or cmake doesn't care at all how packages end up on my drive. Almost everyone uses cargo in a way which makes cargo automatically download them from the web.
If you're making that argument, you could easily turn it around: make or cmake do care where your packages are; they by default only build things on your local drive.
The point is: all options are available for all systems, so suggesting any workflow doesn't work with any of these is just incorrect.
My point is that building code coming from my local drive is much less problematic than automatically downloading the code from some random website.
And I never said any workflow doesn't work with Cargo. I said I wished it didn't download code (by default) and that it didn't care how code ended up on my drive, that it just worked like a normal build system. That you can use Cargo in that way doesn't help.
> My point is that building code coming from my local drive is much less problematic than automatically downloading the code from some random website.
How does the code get on your hard drive? I'd imagine from downloading it from some random website (arguably _more_ random since it's less likely to be a centralized place like crates.io). Or you could vendor the dependencies so that they're included when you get the source code for the thing you're working on, but as mentioned throughout this discussion, cargo lets you do that too.
Yes. It turns out this works much better than the previous situation. The bespoke-per-language model turns out to have fewer drawbacks than the bespoke-per-library-and-OS model. Note: fewer drawbacks, not none. I would not have predicted this a priori, for what it's worth, and I was very critical of the first one of these I became familiar with (rvm), but it turns out to work well in practice.
I still think cross-language efforts (like bazel and I'm sure there are others, maybe nix?) seem generally better, but I suspect there is some fairly good reason they are less widely used.
The problem with bazel and nix is that they are totalizing build systems. They want everything to bend to their worldview, and ask for an all-in commitment. Cargo etc require somewhat less of a commitment.
Right, but it seems like it would probably make more sense to pick a totalizing system, if you know you'll have a polyglot environment, which seems pretty inevitable for any business (as opposed to just like a side project). But I think there are probably good reasons this isn't the most common way to do things, and I just don't know them.
The reason is that defaults are powerful. Languages tend to have an owner in a position to declare a default for a language. No one is in a position to declare a default for polygot environments.
This also explains why the only places where these polyglot systems seem to be really common is at giant companies, where they have the power to enforce that stuff is interoperable. If your employer tells you to do extra work to make sure that your package can be used by others in the company, you're going to do it, but if you're working on some personal open source project and have the luxury of deciding what to prioritize, most people are generally not going to be spending time on figuring out polyglot build systems and will just use whatever's easiest, which will generally be whatever the language's default is.
Many businesses do pick totalizing systems -- many medium-size or above corporations use bazel, and a few use nix. There's also some tooling to convert Cargo.toml to bazel and nix build files, though I don't know how well that works.
I think the problem is that bazel and nix are just not that easy to get started with. If you're using them you're likely to have a team (or at least one person) working full-time on them, since bending everything to a singular worldview involves a lot of work.
The default behavior of cargo is to download stuff from the internet. This may be the least reproducible thing ever.
I'm honestly astonished that programmers of a language that is deemed "safe by default" thought this behavior was acceptable in any form, let alone as the default. If downloading things at build time is somehow necessary, it should be an obscure option behind a flag with a scary name, like --extremely-unsafe-i-know-what-i-am-doing, that prompts the user with a small Turing test every time it is run. Cargo is just bonkers; it doesn't matter at all whether it is "convenient". Convenience before basic safety and reproducibility is contrary to the spirit of the language itself.
It's as if bounds checking in the language was deferred to a third party that you need to "trust" in order to believe that you won't have segmentation faults.
It doesn't just download random things. Cargo generates a Cargo.lock file with checksums and will make sure that those checksums match when building later on. It's about as safe as vendoring all dependencies while being far easier to work with (though tools like cargo-vendor do exist, of course).
Edit: for things like the kernel, vendoring dependencies is still probably not a bad idea, of course
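For reference, each pinned dependency in a Cargo.lock file records the exact version, the source it came from, and a SHA-256 checksum of the package archive, which cargo verifies before building. A sketch of one entry (the crate name, version, and checksum here are illustrative, not real values):

```toml
[[package]]
name = "some-dep"
version = "0.2.126"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0c0fdab53d7b2d0c03dcd29968e9860ba9791cfbf8869b1bc4e0c3e5a6c84b0e"
```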
What prevents a given URL from disappearing? Does that just break a particular source version of the Linux kernel?
What happens when a given dependency adds new kernel-inappropriate features? Are kernel devs going to act like distro maintainers and decide between forking, maintaining patch sets, etc.?
All crate sources are stored in the crates.io package archive, which never deletes packages.
A dependency veering off in a direction you don't like is one of the risks of using someone else's code instead of writing it yourself. Cargo makes it easy to use forked dependencies, and forking a dependency is almost always less work than if you'd never used it and written the code yourself from the beginning. (And to be clear this is only a problem for future evolution; a crate author cannot remove or modify an already-published version of their crate.)
This is still fairly short sighted. Websites shut down, large websites with big storage demands are especially vulnerable to attrition. Who wants to pay the mounting bill for keeping decades of revisions of historical rust packages online?
I can grab the kernel sources from 1997 and build them today. Will I be able to build Rust code from 2022 in 2047? Because the 1997 kernel will still build at that date.
"I can grab the kernel sources from 1997 and build them today."
Where would you be grabbing it from? ...From a website? "Websites shut down, large websites with big storage demands are especially vulnerable to attrition. Who wants to pay the mounting bill for keeping decades of revisions of historical Linux kernels online?"
You make a copy, store it on your medium of choice, and put it in a filing cabinet. I gather that certain organizations use magnetic tape backups for especially important data. For some organizations and individuals, kernel source code could be that important.
There is a fairly large difference between archiving your own project's history for as long as you feel like, and archiving the complete history of every significant piece of code ever written in a particular programming language forever.
Who claims that archiving the complete history of every significant piece of code ever written in Rust is necessary? It is easy to archive only the code that your project depends upon. Rust code is no different from C code in this regard.
- crates.io is financed by the Rust Foundation and is at no risk of disappearing; it is a very well funded effort.
- Using cargo with an alternative repo is not difficult, requires some one-time configuration.
- Vendoring your dependencies is supported.
- cargo hits the network to look for semver-compatible updated versions of your dependencies at specific moments if you don't have a Cargo.lock file.
- Not updating your dependencies stops you from getting the rug pulled from under you if an unwanted change happens, but it also stops you from getting any desired changes including security vulnerability fixes.
- Even if you vendor all of your dependencies, you still have to audit them the first time and every time you update them. Are you? Most aren't. Code you haven't written yourself can't be assured not to be malicious, and code you've written yourself can still have exploitable mistakes.
It's easy enough to keep your own website up as long as you want to, the liability is other projects and services, especially when the scope of those services is "archive everything for everyone forever".
So your argument is you think the people who run the crates site don't want to do a good job but the people running kernel.org do? What info are you basing this random-seeming decision on? Do you have any actual data suggesting that the crates site will just disappear like you say?
I'd like to see that data if so -- I have pretty big doubts that your statement has merit without some sort of evidence.
As I said in a parallel comment, there is a fairly large difference between archiving your own project's history for as long as you feel like, and archiving the complete history of every significant piece of code ever written in a particular programming language forever.
Kernel.org's repository is also of major versions, not every minor release and patch. That really wouldn't do for cargo. If it has ever been released, it needs to be kept in storage for as long as the rust ecosystem exists. That's decades, maybe even centuries of passing on the torch and hoping the next guy accepts the responsibility. Hoping you can find a next guy.
> I can grab the kernel sources from 1997 and build them today.
Can you? Do they still compile with a current compiler? You'll probably need to find a compiler of that time... and also interpreters for all the build scripts. Was that using bash or some old Perl? Maybe something more esoteric like m4 or Tcl?
The point is that it always had many external dependencies to bootstrap. And adding one is not such a big deal; it just adds another thing to archive among the many others. The crates.io archive is probably not even that big.
I'm not sure why that would be a problem given most of these languages and standards are older than the Linux kernel. The thing about mature technology is specifically that it doesn't have breaking changes every couple of months. This is the way it used to be for a fairly long time.
But even if it has broken, I can just download an old linux distro. They effectively form a cohesive snapshot of the state of the toolchain whenever they were assembled. Slackware 3.1 from 1996 might be appropriate.
> But even if it has broken, I can just download an old linux distro. They effectively form a cohesive snapshot of the state of the toolchain whenever they were assembled. Slackware 3.1 from 1996 might be appropriate.
You will also need era-appropriate hardware to get that software to install.
I'd rather comment than downvote. Who cares about a kernel build from 1997 (25 years ago)? What was the hardware back then, Pentium 2? Sorry for the snark in advance, but: why make mountains out of molehills? Life is hard enough as it is.
You may not own a Pentium 2, but someone might. This is only hard if you make it hard. My point is that an old Linux kernel, by design, can be built today. This is a feature it has for free, a consequence of not relying on flimsy network-based dependency managers.
At any rate, we are indebted to the future to preserve the present, as our past has been preserved for our benefit.
"Never" is a long time, just saying. It'll be impossible to beat the "availability" guarantees of a local mirror (like a thumb drive) of a kernel source tarball.
What happens when a crate version has to be removed due to a critical CVE or court order (IP Law violation, perhaps)? There may come a day where crates.io becomes torn between not breaking Linux source and not hosting actively bad source code.
Note that some of those concerns do apply to vendoring source as well, but the additional download step also removes options that the kernel maintainers have as long as they ship all the source for the kernel in one tarball. Like more control over the timing of inevitable decisions.
> What happens when a crate version has to be removed due to a critical CVE or court order (IP Law violation, perhaps)?
CVE = The Yank flag. Cargo will refuse to add new yanked packages to a lock file, but if a yanked package is already in the lock file, it will still build. The package is not actually deleted. https://doc.rust-lang.org/cargo/commands/cargo-yank.html
Legal = Hard delete. Nobody will go to jail just to avoid breaking your build. Of course, since crates.io and kernel.org are in the same legal jurisdiction, is there any actual difference here?
What happens today when a kernel module has to be removed due to a critical CVE or court order?
That's not just a rhetorical flourish, I'm actually curious what the answer is. As far as I know, (1) it almost never happens and (2) when it does, the change is made in upstream repos and as a practical matter, everyone downloads those changes and their up-to-date local copies lose that code.
Fixing it in the future isn't the point. Breaking previous releases is.
The previous tarballs still work and contain the relevant code. Your build wouldn't rely on hosts complying with court orders in countries you might not live in.
If the code isn't vendored, just referenced with URLs, the old tarballs stop working.
This hypothetical court-order situation is quite far-fetched. If crates.io was ordered to take down some or all versions of a package, an alternative mirror could easily be created elsewhere and you could configure cargo to use it.
But I think the kernel would vendor crate dependencies, partly so that people can build without accessing the network, simply because that's policy in many places.
To the first question, obviously the sources of dependencies would be brought into the tree. This is easy and there's no reason I'm aware of not to do it for something like the Linux kernel.
To the second set of questions, how is this any different than any other dependency the kernel has? If the answer is "the kernel has no dependencies" then yeah, I'm very sympathetic to the argument that bringing in rust libraries is not a good reason to start having dependencies when none previously existed at all, but is that the case?
You're forgetting about custom build scripts. Thankfully most of the core ones have moved off cloning dependencies for ffi purposes (think cloning an alsa-lib version for ffi), but it used to be super common.
No, it is. Even without `--locked`, the Cargo.lock file is only updated when it no longer fulfills the Cargo.toml because the latter was edited (and then only making the minimal changes necessary), or explicitly using `cargo update`.
Yes, it's always read. If the file didn't require updating, a build with and without `--locked` will be identical. If it did require updating, `--locked` will make cargo exit with an error.
That's true when running `cargo install` to install an application directly from crates.io, but not when running `cargo build` in an already checked-out repository.
A plain `cargo build` ends up calling the resolver's resolve_ws_with_opts(), which may refresh the lockfile, not resolve_with_previous(), which would use the lock file as-is.
The only reason this sticks in my mind is that I ran into an issue building bat after I made some changes. I obviously assumed it was my changes, so I went through the process of debugging and backing out my changes until finally I was back on a virgin branch and still failing; passing `--frozen --locked` fixed it.
If your project has a Cargo.lock file checked into its repo, then everyone checking that out will download the same code for all dependencies (unless someone manages to compromise the crates.io package archive). That is very far from "the least reproducible thing ever".
> The default behavior of cargo is to download stuff from the internet.
This is borderline inevitable for most modern development stacks, though .lock files can definitely help, even adding hashes to check against if you care about your dependencies being the same as when you first download/add them to the project and/or inspect the code.
As for worries about the things in those URLs disappearing, in most cases you should be using a proxy repository of some sort, which i've seen leveraged often in enterprise environments - something like JFrog Artifactory or Sonatype Nexus, with repositories either globally, or on a per-project basis.
The problem here is that all of these repositories kind of suck and that the ecosystem around them also does:
- for example, Nexus routinely fails to remove all of the proxied container images and their blobs that are older than a certain date, bloating disk space usage
- when proxying npm, Nexus needs additional reverse proxy configuration, since URL encoded slashes aren't typically allowed
- many popular formats, like Composer (or plenty more niche ones) are only community supported https://help.sonatype.com/repomanager3/nexus-repository-administration/formats (nobody will ever cover *all* of the formats you need, unless you limit yourself to very popular stacks)
- many of the tech stacks that have .lock files may also include URLs to the registry/repository from which they're acquired, so some patching might be necessary
- in technologies like Ruby, actually setting up the proxy isn't as easy as running something like "bundle install --registry=..." as it is in npm
- in other technologies, like Java, you get into the whole SNAPSHOT vs RELEASE issue, and even setting up publishing your own packages to something like Nexus can be a bit of work; the lack of proper code libraries for reuse and the abundance of copy-pasted code that i've seen are proof of this in my mind
Of course, i'm mentioning various tech stacks here and i don't doubt that in the long term Rust and other technologies might also address their own individual shortcomings, but my point is that dependency management is just a hard problem in general.
So, for most people the approach that they'll take is to just install stuff from the Internet that other people trust and just hope that the toolchain works as expected, a black box of sorts. I've seen plenty of people just adding packages without auditing 100% of the source code which seems like the inevitable reality when you're just trying to build some software with time/resource constraints.
Downloading C++ dependencies during the build process is equally unacceptable for many situations. Existing C++ build systems and package managers can be configured to do that and those build systems and package managers would be inappropriate for supporting a kernel that values stability and long term support.
So it's a good thing that cargo can be used without downloading dependencies during the build! Just clone the repos of the dependencies (and transitive dependencies), just like you would for a C++ project. Then set up your cargo file to point at the location for your local copy instead of using the default download behavior.
There's even a tool called cargo-vendor that does this for you!
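Concretely (paths here are hypothetical), after running `cargo vendor vendor/` in the project root, cargo prints a source-replacement snippet to put in `.cargo/config.toml` that redirects all crates.io lookups to the local directory:

```toml
# Emitted by `cargo vendor vendor/`: use the local copies instead of
# downloading from crates.io.
[source.crates-io]
replace-with = "vendored-sources"

[source.vendored-sources]
directory = "vendor"
```

From then on, `cargo build --offline` works with no network access at all.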
I wholeheartedly agree. I love Rust, it's the most fun I've had with any programming language (barring perhaps Haskell). But I still run cargo in offline mode with crates.io disabled, pointing cargo to /usr/share/cargo/registry for dependencies (that's where Debian's librust-*-dev packages get installed, the only dependencies I accept in my Rust projects).
To say it explicitly: "I don't have any internet" was a hard constraint on everything from the beginning. Firefox's build requires it. Most distros require it.
Some of it was a bit awkward to actually use this way in the early days, but those harsh edges have since been sanded off.
Absolutely (although see note below)! I don't have any gripes with the tooling from a technical perspective. I do have gripes from a cultural perspective, though.
Addendum: Until quite recently, this was quite cumbersome. It also meant that all cargo invocations (by the same user) would use that override, always. It meant that compiling someone else's project became quite the hassle. Or compiling a project that mostly uses system dependencies, but some crates.io deps. But the situation is improving.
Cargo and crates.io have been specifically designed to be reliably reproducible, and had a chance to learn from npm's mistakes to not repeat them.
Cargo binary projects have Cargo.lock by default with checksums of all dependencies used. crates.io doesn't allow changing past releases, and has a policy of not deleting any crates unless legally required (user-accesible "yank" hides, but doesn't delete). Crates.io index is a git repository with full history of all changes to the registry, so you can recreate its state at any point in time (in case you lost your Cargo.lock, you can reliably remake one from the past).
And on top of that there's `cargo vendor` command that makes a local offline copy of everything you use, so you can fully archive a Cargo project and rebuild it any time later.
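To illustrate (the version and checksum shown here are made up), each Cargo.lock entry pins a dependency to an exact version, source, and tarball hash:

```toml
# One entry from a Cargo.lock file (illustrative values): the checksum
# is a SHA-256 of the .crate tarball, so a tampered download is rejected.
[[package]]
name = "libc"
version = "0.2.150"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a9f8aa3f1e3ec9057a1e4122f2ca5c4d9b9a5b07a960e3b9a7bd9a2d9c8f4e21"
```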
Yeah, especially language specific ones. It seems like the world wants to be polyglot. Do we really want to have all the encryption algorithms reimplemented in all the languages?
The answer there is often "use FFI". But if we're all going to use C APIs anyway, then shouldn't our package managers support C?
But C has probably the most awkward build system culture of any language, complicating the job of packaging quite a bit. A lot of language specific package management systems get lift from offloading the ugly C support problems to other layers. See python and "manylinux" for instance.
You are right about C. I mean, I do get why we have so many, and I use them so often that it's ingrained in me, along with how to fix them when I mess up. But sometimes I notice how many libs we are talking about, and honestly it is kind of ridiculous. I know how hard it is to manage even one repo and keep it safe. Maybe I'm too old; the young ones now just download Go binaries and run them.
In my eyes, the problem isn't precisely whether some components are downloaded separately. What I find problematic is the idea of giving up responsibility by depending on separately developed components. Components written with more than just one particular project in mind are tough to change. Integrating them into a particular project often requires workarounds and some imperfect abstractions to make things fit, and there tends to be a proliferation of blind spots about "global optimization potential".
Not to drag on Rust, but I can see a "global optimization" problem when I try to cargo build a moderate Rust project and have to download hundreds of transitive dependencies, and have a good chance of ending up not building a thing (for reasons that probably include my own incompetence, but still).
There are of course tons of C projects that have similar problems in a way - the number of dependencies might be lower but I have to hunt them manually. But well-engineered projects are at most a handful of clearly identifiable dependencies, and everything will generally build without fuss. This is how it is with the Linux kernel today. The main dependency of the Linux kernel is gcc, and as far as I perceive, communication between Linux and gcc projects is alive.
I think there is a lesson that can be learned in particular from C development: There is value in growing a system vs mainly integrating existing stuff (junk or not). The larger a project gets, the more sense it makes for it to bring its own tools and implementations so it can continue to build and be maintained as a whole without a lot of complications.
This might be the difference between a huge project and an ecosystem, and the question is where are people aiming with Rust support? If it's just about drivers I can see some limited value in the ecosystem approach but it will be a tough sell for any Rust driver that wants its 'M' in the kernel config.
> Components written with more than just one particular project in mind are tough to change.
Even in the very worst-case scenario, you can fork the code. That would still leave you far ahead of having to write everything you need from scratch. And my experience is that plenty of libraries are happy to support general use cases when at all possible.
> and have to download hundreds of transitive dependencies, and have a good chance of ending up not building a thing
Can you give an example of what you mean by this? I'm unclear what the concern is regarding the idea of "not building a thing".
> as far as I perceive, communication between Linux and gcc projects is alive
The Rust developers also have open and active lines of communication to the Linux developers.
> The larger a project gets, the more sense it makes for it to bring its own tools and implementations so it can continue to build and be maintained as a whole without a lot of complications.
Certainly, but even very large projects benefit from sharing code in foundational areas.
>> Even in the very worst-case scenario, you can fork the code.
You can fork the code that you never knew well enough to write yourself.
>> and have to download hundreds of transitive dependencies, and have a good chance of ending up not building a thing
> Can you give an example of what you mean by this? I'm unclear what the concern is.
My concern is of course that as someone who wants to, or is supposed to, mess with a project, I can easily get depressed if I have to invest significant to infinite amounts of energy just to bring that project to build. And if I do get it to build, I may have a hard time figuring out how it works, because what it does is all over the place, hidden in hundreds of libraries.
> Even very large projects benefit from sharing code in foundational areas.
"Other crates that I'd like: anyhow, bincode, byteorder, log, once_cell, pin-project, rand, serde, slab, static_assertions, uuid plus some more esoteric ones."
> You can fork the code that you never knew well enough to write yourself.
Certainly. It is hard to envision any circumstance where one is worse off for having the opportunity to stand on the shoulders of giants rather than having to invent the universe from scratch.
> I can easily get depressed if I have to invest significant to infinite amounts of energy to bring that project to build
Can you give an example of a Rust project where you have had this experience? Nearly every Rust project is as simple to build as `git clone && cargo build`. The exceptions are those with C dependencies, where you will need to turn to your C package manager (apt, etc.) to first install the C dependencies, and it's hard to claim this as a weakness of Rust relative to C when it's no worse than what you'd get in a pure C project.
> Other crates that I'd like:
Can you be more specific about which of these crates provides functionality which you do not think qualifies as generally foundational?
> Can you give an example of a Rust project where you have had this experience?
To give a concrete answer, I randomly picked "Weylus" from GitHub's rust/trending list and wasted more than an hour trying to build it on Windows 10 and Debian Bullseye, ultimately giving up on both. Yes, a lot of it was C-library-related issues, but there were also other missing executables, like tsc, or in the case of Windows, make. On both systems the number of transitive dependencies was about 300, so I guess it was inevitable that there would be some problems.
I've had similar problems before that, trying to build other, less ambitious programs than Weylus using cargo.
> it's hard to claim this as a weakness of Rust relative to C when it's no worse than what you'd get in a pure C project.
It seems to me that making it more reliable to depend on other modules (like cargo does) does not lead to easier-to-use projects, since it doesn't change the amount of bullshit that most developers and users are willing to endure. This is an instance of Parkinson's law, and I think it is a lesson the node.js ecosystem already learned.
As a consequence, with a system like cargo programs end up having far more dependencies (which is empirically true), but the programs aren't easier to build by a random user. For developers, the situation is changed compared to e.g. C, in that it is easier to add more dependencies before the software becomes unmaintainable. This might shift the development situation to a place where the average developer in this ecosystem is more competent in plumbing things than understanding the problems and coming up with reliable solutions that are easy to maintain without a system like cargo.
I'm not sure if this is good or bad - probably it's an evolution that can't be stopped, and which creates new types of developers. What I'm saying is I'm not positive that this attitude is compatible with a central piece of infrastructure, like the Linux kernel is.
> Can you be more specific about which of these crates provides functionality which you do not think qualifies as generally foundational?
I'm not a Rust developer and I don't know any of these, but looking at those names I'm wondering: which is not some arbitrary helper thing, where it would be easy to maintain an alternative in-tree? At least that's better than each developer bringing their own slightly different preference, resulting in more bloat and maintainability problems.
Tangential question: seeing all the recent supply chain attacks, is there a way to defend FOSS projects with something like project-defined capabilities for every third-party dependency? So e.g. you could include a math lib and mask everything regarding filesystem or network access, etc.? Especially for Rust I would like to see a holistic solution to this growing threat, given the accelerating trend toward central repositories.
On the impl side, WebAssembly was built from the beginning to support that use case, and many languages are adding backend support for it. On the language side, I really like type systems with Algebraic Effects (koka-lang.org has some of the cool research into it).
This is more plausible for a language with a runtime like Python. But Rust is fundamentally designed to be a systems language where you have full access to everything (and, if you use unsafe, raw access to memory and arbitrary code execution). It’s hard to imagine how you’d add a sandboxing layer to the language, it seems more like something the OS would have to do for you.
On a broad scope, you could solve this at compile time. The source simply does not compile when filesystem or networking crates/builtins are not defined.
If you want to have more fine grained white listing, like only grant access to a certain directory, this could get really messy quick, trying to solve this at compile time.
You'd probably have to start by banning unsafe code in general, but then whitelisting/allowlisting specific versions of specific crates that are allowed to use it, so that at least the most popular dependencies don't break.
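The blunt end of that spectrum already exists as a crate-level lint; a minimal sketch:

```rust
// With this attribute at the crate root, any `unsafe { ... }` block
// anywhere in this crate becomes a hard compile error.
#![forbid(unsafe_code)]

fn double(x: i32) -> i32 {
    x * 2 // plain safe code compiles as usual
}

fn main() {
    println!("{}", double(21)); // prints 42
}
```

Per-dependency allowlisting, as suggested above, would need tooling on top of this; tools like cargo-geiger, which reports unsafe usage per crate, point in that direction.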
My rough understanding is that Caja for JavaScript kind of did that. Libraries didn't have access to the whole scope, could only access what they were given via capabilities. https://en.wikipedia.org/wiki/Caja_project
I feel like that is very hard to make work securely when the boundries are so ill defined. At the end of the day, untrusted code is untrusted. What happens when you use your math library to do some math on a security critical value?
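As a toy illustration of the capability idea from the question above (all names here are hypothetical): a library function only gets filesystem access if the caller hands it a token. Note this is a convention only, which is exactly the enforcement gap being discussed; nothing in today's Rust stops a crate from calling std::fs directly.

```rust
use std::path::PathBuf;

// Hypothetical capability token: holding one is the only *sanctioned*
// way to reach the filesystem under this convention.
#[allow(dead_code)]
struct FsCap {
    root: PathBuf,
}

#[allow(dead_code)]
impl FsCap {
    fn read(&self, rel: &str) -> std::io::Result<String> {
        std::fs::read_to_string(self.root.join(rel))
    }
}

// A "math lib" function takes no FsCap, so by convention it has no way
// to touch files, and a reviewer can see that from its signature alone.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

fn main() {
    println!("{}", mean(&[1.0, 2.0, 3.0])); // prints 2
}
```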
> When was the last time a major distribution found a backdoor in a popular package?
Packagers not finding a backdoor doesn't mean that there isn't one. How many packagers actively audit the code they support for a given distro? It is not uncommon for distros that support esoteric platforms to claim a given package works for that platform because it compiles, even though it reliably segfaults on execution. Who's responsible for that? Packagers have even introduced[1] vulnerabilities by "fixing" code they didn't fully understand at the time.
Packagers have a difficult, thankless task, and we're doing them no favors by being confused at what their job is. They ensure that the package builds, integrates with the rest of the distribution as much as possible and updates/patches swiftly when issues are found upstream.
The kernel today has its own complete standard library of data structures and threading primitives for C. In general I find these much more well-designed and pleasant to use than the user-space C standard library. Give me work queues and kfifos over pthreads any time!
Maybe the rust devs can also learn something worthwhile from working with the kernel community.
Some constructs from the kernel are maybe not so relevant to use in userspace.
Spinlocks are great in the kernel where we can just mask interrupts. Not so fantastic in userspace.
Intrusive linked structures are necessary in the kernel since they let us manipulate collections without allocating. But they are also less convenient. Etc.
However, I'm sure there are lots of things which could be useful in both places. I have from time to time toyed with the idea of porting the kernel "standard library" API into userspace. Maybe something like that already exists?
I'm not aware of a version of it available in user space, but at least there is a useful subset which could probably be extracted. At the same time though, having two consumers of the API means more thought needs to go into changes.
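As a taste of what such a port might look like, here is a minimal kfifo-flavored ring buffer in userspace Rust. This is a sketch only: the real kfifo is lock-free for single-producer/single-consumer use, which this does not attempt.

```rust
// Minimal kfifo-style ring buffer: power-of-two capacity with
// free-running head/tail counters masked on access, as in the kernel.
struct Fifo<T> {
    buf: Vec<Option<T>>,
    mask: usize,
    head: usize, // total number of pushes
    tail: usize, // total number of pops
}

impl<T> Fifo<T> {
    fn with_capacity(cap: usize) -> Self {
        assert!(cap.is_power_of_two());
        Fifo {
            buf: (0..cap).map(|_| None).collect(),
            mask: cap - 1,
            head: 0,
            tail: 0,
        }
    }

    fn push(&mut self, v: T) -> Result<(), T> {
        if self.head - self.tail == self.buf.len() {
            return Err(v); // full: hand the value back
        }
        let slot = self.head & self.mask;
        self.buf[slot] = Some(v);
        self.head += 1;
        Ok(())
    }

    fn pop(&mut self) -> Option<T> {
        if self.head == self.tail {
            return None; // empty
        }
        let slot = self.tail & self.mask;
        self.tail += 1;
        self.buf[slot].take()
    }
}

fn main() {
    let mut f = Fifo::with_capacity(4);
    f.push(1).unwrap();
    f.push(2).unwrap();
    println!("{:?} {:?}", f.pop(), f.pop()); // prints Some(1) Some(2)
}
```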
To reuse the migrant melting pot metaphor used at the end of the article, the "kernel has to stand alone" seems to mirror the states wanting a "strong border" (while the idea of borders and states has a beginning and will surely end).
I can imagine there are plenty of cases where people build a Linux kernel from sources moved to an air-gapped server over sneaker-net. Or someone who wants to build Linux on some SoC that doesn't even really have an internet connection.
I think in your view of the melting pot metaphor, "the kernel has to stand alone" might be a lot like Christian countries holding the view that "Sunday is a forced day off because it is the day of the lord". Whereas I think it might be more like, e.g., the Netherlands having the cultural idea of "we should actually work together to prevent the entire country flooding", or the Spanish "taking a break during midday is smart, not lazy". It's a cultural preference that arises as an adaptation to the specific geographic reality of the country.
It makes it easier to get a kernel working on a bare bones system for one. If things get complex enough, step one of getting Linux working on a new architecture will always involve getting a cross compiler working.
Why is this such a trope in this thread? It's such a superficial strawman. Obviously cargo can copy all the dependency sources into a project's own tree if that's the right fit for a given project, and it almost definitely would be for the kernel.
There are lots of good arguments against pulling in lots of rust dependencies - it's too much code to audit, maintained by too many different upstream people and teams, lots more tooling would be needed to make sure all the dependencies are safe to use in the kernel, and on and on - but this one about an internet connection is both the most common one I've seen here and the most superficial and frankly silly.
The article itself explicitly mentioning 'cargo vendor' (which, indeed, seems like the only logical choice for the kernel, just like they already do with zstd as also mentioned in the article) leaves me extremely confused as to why people keep jumping to assuming non-vendored crate usage.
It seems like people don't know that it's possible? But obviously it's possible to pull dependency source code into the tree... The only question is whether or not there is tooling support for it. It could only possibly be a bonus to have something like cargo that can automate that workflow, rather than being forced to do it manually.
So I agree, this particular criticism is very confusing. Though as I said, that doesn't mean no criticism is warranted!
It doesn't have to, you can "cargo vendor" all your deps (I do so for some of the things I work on, regularly building and developing without an internet connection).
The argument isn’t that they don’t exist or are not published, it’s that they are not published widely.
Given that the circulation of Linux Format was just 19,000 in 2014, that number backs up the position that you wouldn't find this at a street-level vendor.
And what do you suppose is the ratio between train stations + shopping malls vs actual street level tobacco and magazine stands/shops is in most European countries? 1 to 700? 1 to 1000?
Anyway, the whole argument is redundant. Go ahead to your nearest newsagent and see if they stock it. If you live on top of a train station or a supermarket then you're due both congratulations and commiserations.
For everyone else… you can’t buy them at your nearest kiosk.
Street level magazine kiosks sell magazines with up to date Linux kernel CD roms? That’s a lie. I’ve literally never seen that anywhere in Europe in the last decade. You might be able to find them in specialist shops or larger supermarkets, as you can in the UK, but street level kiosks? Get real.
I'm in france, I regularly buy one of those when I take the train lol.
We have quite a bit of choice and they all have recent-ish issues: https://www.journaux.fr/linux_informatique_1_0_130.html and you can find at least a couple of them in most kiosks ; at least Linux Identity always comes with a physical disk
And you don't have any friends and family who regularly come to you for help?
When I built my current PC, I didn't bother with a 3.5" floppy drive, but I still had to get a USB one a couple of years ago when an acquaintance showed up needing to read some files from floppies...
In certain environments I have worked in extensively, the machines on which one builds are only allowed internet access on an ip:port basis, after a months-long process involving dozens of people across multiple teams.
Many people download once, use constantly and on many machines.
My dev PC was never online since it was put together, all patching and updating was performed offline. All builds were bit-to-bit reproducible.
It still makes no sense to me to spend energy pushing Rust into Linux instead of creating some new, better kernels; I mean having some of these big companies pay a few experts to build such a kernel and keep compatibility with user applications. If Rust is much better than C, and on top of that you start fresh with no baggage, you should get a better Linux, and everyone would use it since it is safer and faster, especially on servers.
I don't know of any such Rust kernel being worked on that is not some hobby side project, rather than something more like Servo was, where you have paid, experienced in-domain developers.
Sorry for the snark in advance. Do you have an idea how many man years of work are in Linux? The estimated cost of developing it was 1.4 billion dollars in 2008.
Not that I do not have respect for Linux, but that doesn't sound like much in an age when we're reading about modern aristocracy buying up useless microblogging platforms for forty-plus billions. :)
I think the point is that if the Linux kernel "costs" say 4 billion to make, that it's perfectly within the capabilities of a company if they would see the value in doing so - which has been greatly reduced because Linux works so well for so many people and companies.
But if Walmart needed a kernel for some reason, and it would provide value to them, they could afford it, it's about a quarter's worth of profit. But, again, Linux would likely do anything they needed, and if it was missing something they could add it.
The monetary cost is misleading. You can't take 4 billion dollars, use it to hire developers for a year, and then expect something equivalent to the Linux kernel at the end of it. Money doesn't translate on its own into valuable things; people have to actually do the work to make that possible, and not every task is infinitely divisible such that it can be worked on in parallel by many hands.
The actual cost is in person-years of work, not dollars, and even then you can only speed up the time it takes so much by throwing money at the problem.
>money doesn't actually translate on its own into valuable things - people have to actually do work to make that possible.
This is highly unintuitive. I don't understand how Windows's Explorer (the file manager) has been for many years, and is still to this day, vastly inferior to Dolphin, a Linux file manager made for free by five Dutch guys or whatever. I can think of many other examples.
If anything, it seems like more money results in less quality, by way of some tragic sociological paradox. :p
When people make things because they want to, they make things that other people actually want to use. When capital directs people to make things, they do so for the purpose of creating more capital. We pretend that "making money" and "making something useful" are synonymous, but they're not. Look at advertising: an entire industry where nothing useful is produced but billions of dollars are spent simply manipulating people's wants and needs. At some scale, people stop being able to self-fund these projects, and it's when that threshold is crossed that people start making what seem like truly bizarre decisions to hamstring the use-value of their own products, because they're trying to maximize their exchange value instead.
Sure, but all those years of work did not go into creating new stuff; there is a lot of work in just modifying existing code without breaking things.
You do not need to have all of Linux's features at the start. I would focus on the server stuff: support the popular server hardware and popular server workloads, so filesystems and networking. I would hope some competent developers could write a new kernel from scratch rather than getting 10% of Linux rustified. When you edit shit you need to make sure not to break shit, so you are limited, you move slowly, and you have to implement backwards compatibility. Amazon, Google, or Facebook could work on this if they cared about security, but probably using VMs is cheaper for them.
Of course the Linux Foundation would say that the cost of developing Linux is a cool billion. They make money from Linux training after all. They want you to know: You're getting your money's worth.
"starting fresh with no baggage" also means starting fresh with no drivers. Having drivers for a wide range of hardware is a major factor in operating system adoption.
I was not suggesting this kernel be used on laptops; it would be a server kernel. If it actually were faster and safer, then Google and Amazon would use it on their servers and make sure there are drivers, and Google could use it in some mobile devices too if they care about safety. AMD has an open source driver, so if desired that could be ported to Rust to also make sure the GPU is safe.
Things work the opposite way. In order to get companies to adopt things you need to get the people working there interested in it. Which means they need to be able to run it on their PCs. There's a reason why x86 ate the world.
The exception to this is when you can develop something and have a good case to use it yourself, like Google and Fuchsia. But I doubt Fuchsia will gain wide organic adoption because of the same factors.
>Things work the opposite way. In order to get companies to adopt things you need to get the people working there interested in it.
This would mean that the only OSes are ones first created by hobbyists, then used by fans, then adopted by a company. But there are commercial OSes and commercial software that were created by a company to solve a problem, not by random dudes and then adopted by companies.
IMO it is sad that we in the tech industry are stuck with old OSes and old architectures while these giants that use open source software to make billions could fund the research and creation of a few new revolutionary OSes, like it happened in the past. And I don't mean a desktop OS that will defeat Windows.
Name a single successful (non-niche, stuff like INTEGRITY doesn't count) OS kernel created post-Windows NT 3.1 that's commercial. Even Mac OS X still runs Mach as a kernel. The model is dead, because of the internet. OSes have MASSIVE network effects.
I don't think you understand the scope of what you're talking about. Linux is tens of millions of lines of code, most of which is drivers that have to be written by vendors.
Vendors aren't going to write drivers for your hobby kernel. No one is using your hobby kernel. Bootstrapping a new kernel without billions of dollars to invest in development time is almost impossible, and anyone who is investing billions of dollars is likely going to have dubious proprietary reasons for doing so.
A successful kernel in Rust is probably the worst thing that could happen to the open source community.
>I don't think you understand the scope of what you're talking about. Linux is tens of millions of lines of code, most of which is drivers that have to be written by vendors.
But if you are Google and you believe that this Rust kernel is super safe and fast and has clean code, and parallelism,async and candy ... how many drivers do google Data centers use ?
After you prove the kernel is real good by using it in data centers and devices that need security you can slowly expand, for most devices you could create soem compatibility layer. Who knows if there are soem competent developeres hired to work on it they might use some better architecture, like keep the drivers outside the kernel.
Not very easily: Linux quite deliberately does not have any stable internal driver interface, so any such compatibility layer would have a very fast moving target to keep up with.
We simply don't have better kernels. POSIX is the epitome of infrastructure. That's it. We did it. It's like asking for something more than Turing completeness - we have everything, just optimise the existing code. To that end, introducing Rust is a pessimization.
It would be good if APIs like open and creat existed that took something safer than null-terminated strings.
Even if kernel devs are careful to avoid the usual pitfalls, it's just a bad API that encourages use of null-terminated strings elsewhere. If you're using other, better string types elsewhere, you often need to make copies to then make system calls (the article hints at that with the addition of a CString type).
What's wrong with null termination? No, seriously. It's much cleaner than wasting an argument, hence CPU register, for a second pointer. Or heavens forbid, two size_t length and capacity counters.
Other than the usual arguments [1], I just said that it causes contamination of other APIs.
That's a big problem because string is a "vocabulary type". It is passed around across various libraries within a given process constantly. There should be a lot of convergence on vocabulary types to avoid copying string data around incessantly.
Null terminated strings are a poor choice to converge on for high level application code in particular. It's just an inefficient (expensive substrings, redundant length calculation, unclear ownership) and unsafe default for too many use cases.
I don't mind certain performance sensitive applications using C strings to avoid extra registers, though even that is a premature optimization in certain situations.
That question is exactly the problem: how do you do zero-copy sharing of data contained within a string? You can't, unless you couple the pointer with a length. In Rust parlance, a slice is a well-defined type for exactly that.
There are also other reasons why having the length embedded in a string (or a string slice) is a good thing. You might want your “str_contains” function to do something different with different sized inputs: doing some vectorised lookup might only be worth it at a certain point, or if the length of the needle is greater than the haystack itself then there isn’t much point doing anything.
NULL terminated strings are a huge mistake that brings in security issues, needless copying and inefficient code that might contain several redundant strlen calls at different levels.
The basic POSIX interface works, but it's hardly optimal. It has a lot of weird built-in assumptions that we're just so used to working around that it's hard to imagine it being any other way.
As an example: why can a process only have one current working directory? Wouldn't it be nice to be able to have a process maintain pointers to two or more locations in the filesystem at once? Wouldn't it make software more modular if a library could "chdir" into some directory without worrying about breaking the application that depends on it? The filesystem APIs could be extended with a "CWD" handle argument that can be passed around sort of like a file descriptor instead of having one implicit CWD for the whole process.
The same could be done with UIDs. Why not have processes that can use multiple UIDs? Again, you could have UID parameters to API functions that require authorization.
POSIX is pretty good by the standards of 80's computers, but in a lot of ways it's showing its age. We can do better. But it's kind of depressing that OS interface design is treated as a solved problem, and so those interfaces stagnate.
It's a consequence of living in a society where everything is about money. You absolutely could build a better operating system, but doing so wouldn't make you any money, so nobody can afford to do it.
What's the alternative? To use absolute paths for everything? That seems kind of tedious, and may be slightly less performant if the kernel has to do more dentry lookups.
I think the idea of a current working directory is a reasonable one, it's just that the limitation that a process can only have one CWD at a time is kind of arbitrary when you think about it.
I could even see having commands that take multiple CWDs. Like a move operation could take a source and a destination CWD as an alternative to specifying source and destination paths.
Directory file descriptors already exist. I wouldn't call them "current working" directories though. Are you suggesting an expansion of those, or something different from those?
This should be a hint to at least some of them that we need a new operating system rather than dragging the old one kicking and screaming into the 21st century.
Rust is in an interesting position, where like C or C++, it's capable of solving very low-level problems. At the same time, it's got a lot of "modern" language features.
The kernel being literally the "bottom" of a system presents a challenge as this space rejects complexity and tooling variety that is intrinsic to "higher-level" languages.
Could you say in concrete terms what you think is not going to work? This just feels like a jumble of spatial metaphors. I don't see why it matters whether Rust or C are 'high' or 'low' level languages, or how that relates to the kernel being at the 'bottom' of the operating system.
I don't think people should be looking at adding another language to the kernel, mostly because every OS build on it has been a disaster. We just need approach and definition for operating systems.
> I don't think people should be looking at adding another language to the kernel, mostly because every OS build on it has been a disaster. We just need approach and definition for operating systems.
What does "we need approach and definition for operating systems" actually mean? (If you know, that is.)
Also, it's the Linux kernel, not just "the kernel". There is more than one, and every operating system has one[0]. You might be interested in looking at the Mach kernel (Apple[1]) or Google's new Zircon kernel (in Fuchsia).
Both of those are microkernels, and as such minimise the amount of work that the kernel does, as well as removing drivers to userspace. Inasmuch as I can extract any meaning from your comment, it seems like your problem might be with the size and scope of Linux's kernel, in which case those might appeal to you.
[0] I'm sure this absolute claim will summon someone to point out some recondite 1980s operating system that doesn't have a kernel.
> Some of them have been directly involved in cancelling non-native English speakers over pronouns.
This is the kind of claim that could do with a source. In general, it seems Rust is rewarding enough and complex enough to master that people just don't pay much attention to the silliest kinds of 'activism' compared to other dev communities.
Rust’s cavalier attitude to language and compiler stability, their absurd bootstrapping situation and limited platform support, not to mention their belief that “curl something|bash” is acceptable procedure are all reasons why I’ve avoided it despite the many good qualities of the language.
Rust has been a very stable language since Rust 1.0. They have a stellar record of keeping things working - with most code breaking being due to said code invoking UB. The edition system is a brilliant invention that allows evolving the language _without_ causing an ecosystem split. Thanks to this, Rust ends up having a much better stability story than even C++ (where you can't really mix and match different C++ versions).
The bootstrapping situation is really not that bad? We have mrustc (a Rust-to-C transpiler written in C++) which allows compiling modern versions of rustc (latest supported rustc version being 1.54), which we can then iteratively bootstrap from up to the latest version. And things are getting better, with gccrs[0] in particular promising a Rust frontend for GCC, written in C++.
As for the "curl something|bash", I suppose you're talking about rustup. You're free to download the script, and review it before installing it. And rust is also distributed many different ways. At least `curl something|bash` does not require root account, unlike `sudo apt install`, which can be very convenient. Like all things: Multiple options are generally better.
Right, so you basically have to replay nearly the entire history of Rust versions since 1.54 (that’s what, 6 or 7 stages?) to bootstrap. Compare this to Go, where there is a stable version of Go 1.4 for bootstrapping and the current Go 1.18 compiles with it (so 2 stages), whereas each rustc only builds with the immediately preceding version; or Zig, which can be bootstrapped in a single phase, I believe. That is what I call lack of stability.
...and why should I care that it's somewhat more inconvenient to bootstrap the compiler from scratch? No, seriously, why? What I care about is that the code I wrote on Rust 1.0 still compiles on Rust 1.60. And I do still have code from back then (I started writing Rust just before 1.0 hit) and I can confirm that it still compiles.
Yes, I know it sucks for all of the distribution maintainers who want to bootstrap every package from scratch, and I do feel for them, but that's a very niche thing to do which the vast majority of people will not do.
Try it. Here's a completely reasonable line of C++ 17 code:
int concept = 4;
Now here's a completely reasonable line of C++ 20 code:
template <class T> concept delicious = true;
Huh. You can't have those in the same project because C++ 20 believes you can't name a variable "concept" and C++ 17 believes "concept" is an identifier and so you can't write your delicious concept template.
These are both valid C++, it's just that they aren't simultaneously valid in any of the half dozen distinct versions of standard C++.
C avoids this kind of clash by using _Complex as the keyword and sticking a #define complex _Complex in a separate <complex.h> header file (the same pattern as _Bool and <stdbool.h>). Same deal.
In what sense is Rust 1.0 not already an LTS release ?
If you wrote some Rust 1.0 code back in 2015, put it away in a drawer (maybe on a USB stick) but now get it out today it will still build with current Rust tooling -- except for some very narrow cases where you might have done something inherently unsound and subsequently the compiler was corrected to fix that so it's an error.
Rust's language shifted slightly in those years. However the Editions system - even though it hadn't yet been invented in 2015 - allows for that. Your 2015 code lacks any Edition metadata, so, the modern tools understand that as Rust 2015 edition, and will compile accordingly, while still interoperating correctly with modern Rust.
Suppose your 2015 code named a variable await. That's a keyword in modern Rust. But it isn't a keyword in Rust 2015 edition, so your code compiles just fine. This would not work in C of course, which is why its newer keywords are ugly stuff like _Bool but in Rust it's fine.
Modern releases of GCC still support C89. If rustc, at least, will keep supporting 1.0 features for all future releases, I guess that's fine.
But that also means that crates used by the kernel would need to likewise be conservative about updating to new Editions so as to not break expected support surfaces.
So, firstly the editions system means it doesn't even matter about crates using a different edition. Remember that await variable in the Rust 1.0 code above? My Rust 2021 code can still talk about that variable, even though it's from a different edition and the word "await" is now a keyword, it just calls that variable r#await meaning "the identifier named await" - which is a bit ugly but gets the job done for interoperability purposes.
But also, all previous versions of published crates are kept indefinitely. If Linux wants serde v1.0.240 then that's fine, even if subsequently serde shipped v1.0.241, v1.1.14 and v2.0.1 the repository holds on to everything.
It matters if certain drivers require certain toolchains. C89 is portable across all sorts of C toolchains. If the Linux kernel added C17 features, only toolchains that support C17 could compile the kernel.
You're correct that the kernel codebase could pin older versions of crates when it is appropriate, but it's never quite that simple at scale, especially if the kernel pulls in more than a handful of crates.
Even before that, the minimum version of GCC required to compile the kernel has risen several times. The current minimum is GCC 5.1, so the GCC 4.9 which you could use to compile the kernel one year ago is no longer good enough.
Good clarification about the move to C11 and GCC 5.
But the move is still deliberately done for the kernel codebase as a central decision. It would be different if they dropped support for older compilers because the 'foobar' crate started using a shiny new Rust feature and forced the issue.
I already loathe the fact that I need Perl to compile the kernel just because people are too lazy to rewrite a few parsing scripts in C. The moment Rust is introduced in Linux without being a complete replacement is the day I leave for greener pastures
I agree on the Perl requirement bit but see Rust as a complete replacement for C or at least great alternative in systems engineering domain in the long term.
Please read my comment again. If tomorrow the whole Linux C codebase switched to Rust I would have almost zero problems with it. My problem stems from having multiple toolchains to build the foundations of an operating system.
It might, and if so you will switch. But a very small minority of Linux users ever need to compile it themselves, and of that small minority only a small fraction will migrate to alternatives over this.
I see Rust as a replacement for C long term. From this perspective it's absolutely ok to have a transition period (years in fact) where there are multiple toolchains used to build the kernel.
Because there are many people who oppose mandatory Rust integration, for various reasons, not just plain dislike of Rust itself. And, for example, there's the blob-free Linux-libre kernel, and it has its users. There might be an audience for a Rustless kernel too.
I think they meant "why?" as in "what reason do they have?", rather than "do there exist people who think that?". I don't really care what tool was used to generate my assembly, and the kind of Rust that will be written in a kernel context[0] will - I guarantee you - produce virtually identical assembly to C.
[0] i.e. no standard library ('no_std'), no unwinding panics, no dynamically sized types, &c.
Just as web developers, of all people, are finally, after over a decade, coming to the conclusion that arbitrarily-packaged dependencies are a bad idea for so many obvious reasons, Rust developers are trying to replicate NPM in the kernel.
I was told that Rust in the kernel would be a good thing, and that not much would change. If these are the types of people who are writing Rust for the kernel... can we go back? I really want to go back.
While you could argue that there are some crates in the Rust ecosystem that suffer from a deep dependency tree, I don't think you can argue that Rust developers are trying to do this for the kernel.
The article states:
> The Rust-for-Linux developers understand this situation and are not envisioning adding the ability to pull in modules with a tool like Cargo
> You could argue that there are some crates in the Rust ecosystem that suffer from a deep dependency tree
Some? Dependency graphs of well over a hundred packages are ubiquitous. I've long since stopped being surprised when I compile a rust package that makes no network connections and see it (indirectly) pulling in multiple HTTPS libraries.
Right, great example. I think 'understand the situation' refers to this. I imagine the Rust for Linux developers would be going through any dependencies they pull in with a fine comb, in this case a dependency that pulls in HTTPS libraries for presumably no reason should be rejected. If that particular dependency makes it into the kernel, well, then you can start complaining about it. But right now feels a bit premature.
> Just as web developers, of all people, are finally, after over a decade, coming to the conclusion that arbitrarily-packaged dependencies are a bad idea for so many obvious reasons, Rust developers are trying to replicate NPM in the kernel.
Web developers and Rust developers are the same group
I'm sure there are web developers that are also Rust developers and vice versa, but the vast majority of developers I know that belong to either of those groups don't belong to the other.
Given it was born at Mozilla, is getting traction in the WebAssembly and WebGPU efforts, WGSL is loosely based on its syntax, and some Cloud Native projects are migrating from Go to Rust, I am quite sure there is some overlap.
Not that I agree with Rust's adoption at that level, as I think languages with automatic memory management make more sense, but that's me.
What people are roundaboutly observing is that the vast majority of developers these days either begin their career or spend at least a part of their career as web developers. That Rust is such a successful onramp to systems development even for people who haven't been classically trained in it is one of the strengths of the language.
While it is good that such people are embracing Rust for such purposes, they would have had a similar high level experience with something like Modula-2 or Object Pascal.
If anything, it is the pseudo-macro Assembler approach to C's design that made it a scary experience for some developers.
"related to" is doing a lot of work there. Browsers are "related to" the web, but surely you don't think Mozilla, if hiring for someone to work on the CSS layout engine or something for Firefox in c++ or rust, would advertise for a "web developer". It's just not the same job in any way.
Not any sort of developer myself, but for what it's worth, my opinion is that if you're writing an OS kernel you should know the libraries your code relies upon are safe. I have to agree with the part of the article talking about bloat and security problems.
I can't think of anything that could more effectively discredit the effort to use rust in the Linux kernel than suggesting using crates.io/cargo -- I had to check the dates just to make sure this wasn't a delayed April first post.
But I guess some people just really want the unicode ukrainian flags being appended to every email they send or something and are too lazy to implement it themselves. :P
Why does a single bad idea discredit an entire language being included?
If someone suggested including a C dependency manager in the kernel, does that mean that the use of the entire C language in kernel should be discredited?
It is fair to say that the developer suggesting it (especially considering some of the proposed crates they want) has some massive misconceptions about the constraints of current Linux kernel development though.
Yeah, my stance on this is that it probably makes sense to pull some libraries into the tree that would be useful in kernel development and which have comprehensible dependency trees such that everything could be reviewed and audited. But then I saw the quoted list and thought "oh this doesn't seem like a very serious proposal".
Same thought. Seriously, what a great way to trigger some senior kernel dev to come down and straight veto rust with some rant about how it’s “worse C++”
Rust is very slowly winning (but it’s a really hard space and this is impressive). But some people have to come and try snatching defeat from the jaws of victory.
Indeed. The comment cited in the article is pretty stunning:
> Other crates that I'd like: anyhow, bincode, byteorder, log, once_cell, pin-project, rand, serde, slab, static_assertions, uuid plus some more esoteric ones.
This makes Rust devs look infantile, with no real understanding of what it means to develop for the kernel. Or what it takes to develop security and performance critical software of any kind, frankly.
Can you please elaborate? These crates are no_std and provide functionality which seems useful regardless of whether you're in the kernel or not.
I don't understand why it looks bad from a performance or security point of view. These crates are well established and developed by competent devs who also care about security and performance.
anyhow is for careless error propagation, where you know you won't do elaborate handling and just bubble it up to the user. It's suitable for (some) applications, but not for libraries, and a kernel falls on the "library" side of that divide. Errors should be handled by the kernel itself or reported carefully to userspace. anyhow is not built for that. (It also puts its errors on the heap—what do you do if memory allocation fails when reporting an error?)
Linux has its own UUID infrastructure, so it seems weird to pull in a whole parallel implementation. At the very least you'd want to disable all the generation code, and it's unclear to me if the uuid crate supports that.
rand is large. Does Linux need all the probability distributions it supports? (ripgrepping for "Bernoulli" only turns up the name of a floppy disk system, so probably not.) Doesn't it already have an implementation for the ones it does need?
> It's suitable for (some) applications, but not for libraries, and a kernel falls on the "library" side of that divide.
I would disagree with that assertion: the kernel is a service that end user applications interact with through an API. Using anyhow in a Rust library that Rust applications use is a very bad idea. Using anyhow in the kernel might be a bad idea, but not for that stated reason.
> Linux has its own UUID infrastructure, so it seems weird to pull in a whole parallel implementation. At the very least you'd want to disable all the generation code, and it's unclear to me if the uuid crate supports that.
The uuid crate can be modified to support the kernel infrastructure as its backend. This would allow the same API to be available within the kernel and in the rest of the ecosystem. That's beneficial to everyone involved.
> the kernel is a service that end user applications interact with through an API.
My thinking is that that API has to distinguish between many strictly defined error codes, while anyhow homogenizes errors and focuses on human-readable information. How would you tell whether an anyhow::Error means EINVAL or EFAULT?
It's the same fundamental design tradeoff that makes it unsuitable for Rust libraries, even if anyhow wouldn't be visible in the public API.
> The uuid crate can be modified to support the kernel infrastructure as its backend
Fair enough, using a modified version seems reasonable.
Actually all of those crates could potentially be implemented in the kernel and be useful. The current implementations wouldn't work as-is, in many cases, but the APIs make sense and the crate could use conditional compilation to work in the kernel.
The most far-fetched crate there would be pin-project I think, because a lot of other work would have to be done to make Rust async code work in the kernel.
>The most far-fetched crate there would be pin-project I think, because a lot of other work would have to be done to make Rust async code work in the kernel.
Pin's usefulness is not limited to async. Any type that contains a pointer relative to itself benefits from being wrapped in it to avoid accidental misuse.
Most Rust "crates" are compile-time convenience features that add comparatively little binary code to the final build. The requests are plenty reasonable from that POV, although "vendoring" the deps might still introduce some complexity.
I mean, serde in the kernel is overkill, but not every crate is like that. Rust was designed in the internet era and leverages dependencies for things older languages would ship with built in (rand!).
> Weird justification for missing an extremely central and extraordinarily important feature
Can you think of any reason why decoupling it from the language itself would be a good idea? Why does it need to be baked in, and what problems would arise if it was? How do these problems trade off against having people just run “cargo add rand”?
OMG, here comes a bunch of bloat into the kernel. I like Rust (well, the idea and principle), but when you are doing low-level stuff, something as trivial as a structure bound to an address whose bits you manipulate by writing to them (enabling and disabling features) becomes a no-go. If you even consider how exhausting it is to manage OBMR at this level of code, it becomes counterproductive, not to mention unreadable. Now for the bloat part: take a simple example (one of their showcases of "zero cost" abstraction for something like embedded). Their blinky app is like 100KB, versus an initialize-the-clock / set-GPIO / write-to-it version that comes out shy of 4K in C++ with the whole HAL from STM, for example. Object lifetime analysis is really cool at compile time, but any undefined behavior is still present when you compile the code... Show me someone doing a bare-metal implementation of blinky (that is, no HAL) in Rust, or a struct for bit-banging, and I'll be a believer that you can do "some" of the stuff needed for kernel development, but not all.
I just compiled the blinky example for a Cortex-M4 board (the example in the Rust HAL crate for that board, not using any unsafe), and the stripped binary is 420 bytes.
So you probably want to revise your stance, because you're about 3 orders of magnitude off.
I am currently working with the ATTiny85 and the stripped blinky binary is 275 bytes.
I was worried that the ATTiny85 wouldn't have enough flash space (6K) for Rust and now I don't know what to do with all that free space :)
Yes, it's 420 bytes; I'm commenting because my fingers got ahead of my thoughts. Do the same for one written in C++ and report back on the orders of magnitude… Cheers!