> and the world would certainly be a better place if everybody coding C or Go switched to Rust
Perhaps. Let's engage in a thought experiment. Sorry for moving slightly off-topic, but the line I quoted made me think about this.
Someone fashions a magic wand, which you can wave over C, C++, and Go programs / libraries to instantly re-materialize them as idiomatic Rust, while preserving all of the "good" output they produce, and simultaneously removing the "bad": all memory safety and data race related bugs they exhibit.
You get to use this magic wand on any program you like, instantaneously. You do so, creating linux-rs, glibc-rs, chromium-rs, etc. in the process. You cargo build all of this new software and replace the old C / C++ versions with it, in-place.
In the brave new Rust-powered software world, does your day-to-day computing experience change? Is it materially better?
Speaking for myself, the answer is "no", unfortunately. Perhaps this message is coming from a place of frustration with my own day-to-day computing experience. Most software I use is much more fundamentally broken, in a way that doesn't seem to be dictated by the programming language of choice. The brokenness has to do with poor design, way too many layers of absolutely incomprehensible complexity, incompatibility, and so forth. I don't remember the last time I saw a Linux kernel oops or data corruption on my machine, but I am sometimes waiting _seconds_ to type a character into Slack.
I like most of the ideas behind Rust (I don't like the language itself and some of the choices the authors made, but that is another discussion). However, I think there is only so much you can fix with the shiny and sharp new tools, because it seems to me that most issues have little to do with low level matters of programming language or technology, but with higher level matters of design, taste, tolerance for slowness / brokenness / incompatibility, etc.
Part of the reason your Slack is so slow is that a lot of stuff is built to protect from problems that Rust might eventually solve.
Slack builds its UI on web technology that got widespread in part because it solves awkward problems with deployment (self-contained and consistent graphics libraries, so you don’t have to worry about how your DE compiled this or that other toolkit) and safety (web tech is heavily sandboxed so that crashes and arbitrary execution won’t open doors to bad actors). In the long run, Rust will definitely make the latter less cumbersome (less worrying about crashes -> simpler, lighter, faster sandboxes) and possibly help with the former a bit (desktop environments and their libraries could shed some complexity when moving to Rust and make it easier for programs to access them safely).
I think it’s a noticeable step forward. Will it solve everything? No, some of the problems with Slack-like situations are due to economic factors (browsers sticking to JS will forever continue to make JS programmers cheaper and more plentiful than basically any other type of programmer) that Rust is unlikely to affect. But perfect is the enemy of good in this sort of thing: incremental progress is better than no progress.
But I think Rust is also quite vulnerable to the layering problem the previous commenter is speaking about. One of the best things about Rust is how easy Cargo makes it to include 3rd party code in a project, but this is also one of Rust's biggest risks. It's already common for Rust projects to have massive lists of dependencies, and that's something which generally gets worse as time goes on rather than better.
Rust as a language may have favorable properties with respect to speed and safety, but programs which run on top of a massive tree of third party code which has been written by god-knows-who tend not to be very fast or very secure.
NPM has already shown that dependencies can be used as an attack vector, and unless Rust can solve this problem, I don't think it's going to bring us some brave new world where we don't have to sandbox anymore.
> programs which run on top of a massive tree of third party code which has been written by god-knows-who tend not to be very fast or very secure.
You have a point about security, but not about speed. I can probably link five "we rewrote it in Rust and it was much faster" articles. All of them used third-party libraries. ripgrep, for example, is faster than grep despite having more dependencies. In reality, Rust just promotes better code reuse without impacting run-time speed. If anything, separating your code into crates improves incremental compilation times.
It's possible that you might pull in a large dependency with many features. Compiling all of this and removing the unused code will cause a compile-time penalty but no run-time penalty. In practice, Rust crates that expose multiple features let you opt in to exactly what you need, so there's no penalty at all. In any case, most Rust crates err towards being small and doing one thing well.
I agree that Rust has very favorable characteristics when it comes to performance. My argument would be that language choice is not a panacea. It's certainly possible to write performant code which leans on dependencies, but the style of development which relies heavily on piecing together 3rd party libraries and frameworks without knowledge of their implementation details is not a recipe for optimal performance.
I see this as sort of saying "Imagine you could cure Ebola, is the world a better place? Well, for me, no, I'm much more likely to get hit by a car".
While I am unlikely to be attacked through a memory safety exploit, I also:
* Have been attacked through one in the past, when the internet was a different place
* Wonder how much time and money could be better spent if we just eliminated that entire class of problems - perhaps solving some of those poor design issues?
I think the reason such a magic wand can't exist is actually why it would be a material improvement if it did - it would fix swathes of bugs that rustc would refuse to compile, and that require understanding the application semantics to fix.
I don't know when I last saw a kernel oops or data corruption either, but I certainly routinely experience bugs that could be manifestations of memory mismanagement.
And if everything written in Java were transpiled too, with no `panic`s, bells, or tracebacks :vom:'ed into the GUI, oh how I'd celebrate.
Would Rust actually have solved Heartbleed? Most memory-safe languages wouldn't have, because OpenSSL wasn't using regular memory management: it was using a custom memory pool with custom array types that referenced that pool.
Maybe in many other languages they would have had better alternatives than that implementation, but I'm pretty sure that their implementation could have compiled to valid Rust that would have had the same Heartbleed bug.
The most important property of Go for me is that the language is not red/blue. [1]
This enables I/O interfaces to be truly universal, covering just about everything from files on disk, to pipes, to in-memory buffers, to sockets. This feature facilitates a style of, for lack of a better word, generic programming that is hard to come by in other systems.
I find basically any other language or platform except Go lacking and unpleasant in this regard, due to the viral nature of asynchronous functions, which never disappears entirely, no matter how much syntactic sugar is sprinkled on top of it.
Mismatches between event reactors or asynchronous frameworks do not exist in Go. Interfaces Just Work, and the entire ecosystem uses them.
> The most important property of Go for me is that the language is not red/blue.
This initially attracted me to Go as well. Unfortunately, in production apps your functions get colored by their `context.Context` argument to support cancellation. `Context` is viral because it needs to be passed from `main` down to virtually every blocking function.
I've never been a fan of the "what color is your function" essay, because it implies that Go is in some sort of unique space. In fact, Go just uses threads. There's no semantic difference between Go and pthreads. The only difference is that Go has a particularly idiosyncratic userland implementation of them.
While the "what colour is your function" essay highlights the author's pet peeve, asynchronous functions, I always understood it to be less about the underlying implementation and more about syntax and semantics; the whole point is that control flow ends up infecting function-level semantics. That problem extends to anything else a language can treat as "coloured".
For example, in Haskell, side-effectful operations end up being "infected" with the IO monad. This means you're not free to mix and match functions: the moment you need to call some IO function, all callers up the stack need to be monadified, too. This can force a late change: suddenly you need a logger or a random number generator, and it has to be threaded all the way down from the outermost monadic point. In practice, monads are so deeply ingrained in Haskell now that most devs probably don't see this as a colour problem.
Multi-value-returning functions in Go are another example of colour. The only way to use the return values of a Go function that returns a tuple is to assign them:
    value, err := saveFile()
    if err != nil { ... }
This means functions like these aren't composable. I can't do saveFile().then(success).fail(exit) or whatever, like you can in Rust. The moment you have a function returning more than one value, your only option is to create a variable. It's weird.
Interestingly, you can do this, but I've never done it and never seen it in the wild:
I figure the main feature of goroutines is that it is considered acceptable, for whatever reason, that a library function spawns helper goroutines without telling you (as long as it reins them in somehow and they don't leak or anything weird like that). If, e.g., a random C or Rust library spawned threads for random tasks without explicitly being a concurrency thing, it would probably raise a lot more eyebrows, no?
In the end this is probably down to the possibly superstitious belief that goroutines are free whereas threads need to be carefully budgeted for, somewhere between premature optimization and designing for very niche scalability requirements, but subjectively the result is still that goroutines are "available" in a lot more situations than boring, pedestrian OS-level threads.
I feel like that counts as a semantic difference, even if it might be social more than technical.
An implementation with buy-in across the entire ecosystem and language so that you don’t have some systems using threads and other systems using futures and other systems using different reactors, etc.
Also known as the point of the article.
Additionally, that implication is entirely of your own creation. The article explicitly lists many languages besides Go:
> Three more languages that don’t have this problem: Go, Lua, and Ruby.
Last I tried it, if I had a million goroutines calling stat(), Go would attempt to spawn a million kernel threads. So I rolled my own rate limiter, bleh.
Is that still the case? (Happy if not)
If it is still the case, is there a standard solution or is it up to the app author?
It's up to the application author in that case, unfortunately. stat() enters the kernel and resolves in one shot, so it requires a whole thread. I haven't read into it very carefully, but on Linux, perhaps the new io_uring business is going to change this state of affairs. For now, however, you need a semaphore of your own.
I wish I understood what problems people solve day in and day out such that they need to call IO in the middle of Dijkstra’s algorithm. Most "business logic" ought to be pure functions over persistent data structures, with no IO other than the occasional logging. At the last possible moment, 5% of the code wires in IO in some boilerplatey way. Is that sufficiently pervasive to worry about?
> what problems people solve that they need to call IO in the middle of Dijkstra’s algorithm.
I worked on several problems of this nature at Twitter in 2012. Hopefully there’s a better way to solve them in 2019...prolly not, but maybe.
say you want to find the median of the number of followers a person on twitter has. so that should be easy - make 1 dataframe with follower count of each bloke and call median() - well, there’s some 300,000,000 blokes, so not that easy :) You have to make a dataframe via ETL - reading & writing to disk 100s of times, loading a few thousand users each time, distributed median computation. so a silly sub-second median query took 2 months to code up & debug & ran for a few hours due to so much IO.
another much harder problem - you want to find the median number of hops between one user & another. so now you have 300m x 300m tuples as your result - where & how to store them is in itself a monstrous challenge. but how the heck do you even compute the result ? you read in one tweet from john to steve, so that’s 1 hop from john to steve & viceversa. you then read a second tweet from steve to mary, so that’s 1 hop from steve to mary & viceversa, 2 hops from john to mary & viceversa. in this manner you read 100s of billions of tweets & keep updating hopcount. somewhere in there john sends mary a tweet - oh fuck now the hopcount is 1, not 2. this will then change lots of other hopcounts. in theory there are nice graph algos for this sort of thing. but in reality, your data is billions of tweets constantly increasing, stored in distributed compute clusters across the planet & just getting a handle on all this can be a 6 month project for some lucky scientist who got to work on this.
> I worked on several problems of this nature at Twitter in 2012. Hopefully there’s a better way to solve them in 2019...prolly not, but maybe.
Okay, so Twitter has scale. To a first, second and third order engineering approximation--nobody else does.
If you are a mere mortal writing practically anything, pull it all into memory, operate on it to create another copy, destroy the original copy (or let GC kill it).
Embedded programmers might get a pass on this given limited memory (32K RAM)--but that same kind of attitude is getting more and more essential as you start getting Big/Little core mixes on the same chip.
Computers are mind-bogglingly powerful.
I have been completely stunned at how many transactions Nginx+Django+PostgreSQL can actually handle before you need to start thinking about "scaling".
> I wish I understood what problems people solve day in and day out such that they need to call IO in the middle of Dijkstra’s algorithm.
For Dijkstra, imagine a very large graph that cannot fit into memory, where you'll need to go out to disk or network to compute the distance of two nodes (or fetch part of the graph, etc).
The business logic many (perhaps most) programmers work on is primarily occupied with gluing together multiple state stores (databases, caches, message queues), running some very simple computations, and writing the output to some IO sink (often a web response).
Sure, you could extract the computations part out. But that barely moves the needle on testability/cleanliness, because most of the business (or business-value-driving) logic is the data flow management--highly stateful IO coordination.
I/O, both disk and network, can happen almost anywhere because data doesn't always fit in memory, may be streamed from a remote source, and I/O handling cannot always be deferred indefinitely. A significant part of systems software design is accounting for this reality.
One of the primary issues I run into with this is that some parts of the domain logic determine what data you need from the database, and that logic can't easily be moved into SQL (or however you're specifying what data you want back from the datasource).
That link has a great punchline. "All of this is easy with threads which never cause any problems!" I haven't tried Go, but this seems certain to be an exaggeration.
> I find basically any other language or platform except Go lacking and unpleasant in this regard, due to the viral nature of asynchronous functions
That's...odd, because while Go lacks that issue, it was by no means unique among languages or frameworks in that regard when it was introduced, and isn't now, either.
This is the same reason why I love C, and why I think that, for projects larger than 1000 lines, it allows me to be MORE productive than more sophisticated languages. Possibly even resulting in a smaller line count, but at the very least less accidental complexity.
One small red/blue pain point in C is varargs - for most varargs functions I have to make a "..." and a "va_list" version.
When using a C library, whose code is responsible for allocating objects? Whose is responsible for freeing them? That alone leads to a LOT of issues.
Seriously, all these language mechanisms like RAII make it a lot harder to write modular code. Look at the mess that C++ got itself in, with its constructors, default constructors, move constructors, rvalue references, exceptions (required as a consequence of constructors), and what not. It's hard to believe Rust could make it significantly less painful.
Programming is mostly not about initialization and deinitialization. If it is, you're doing it wrong, you have too many small objects.
Yes, stack allocated STL containers can be nice for quick "scripting". But I will happily write a few function local deinitialisations to enjoy much less convoluted and interdependent, slowly compiling code.
I like my HTTP package to actually speak as good an approximation of HTTP as it can, so fasthttp wouldn't be my first option.
Furthermore, the HTTP stack doesn't show up on most web servers' profiles. net/http is better understood by the community, interoperates with libraries better (e.g. golang.org/x/oauth2), and is better supported.
The main issue with net/http is that it generates a lot of garbage which is definitely going to show up in your profile if you're handling a lot of requests per second.
Allocation pressure imposed by net/http has never been a top K source of profile CPU burn in any service I’ve ever written, and I’ve written plenty of high-performance, high-QPS services. fasthttp is rarely if ever a good idea.
It opens a ~190 MB JSON file[0] in about 2 seconds on my machine. Scrolling and jumping around the file feels very smooth. Editing is also crisp. Writing the file back to disk takes about as long as opening it.
I don't see any problems with .git. Most Go packages use git anyway; it would be pretty surprising if Go tooling broke whenever you have a .git directory anywhere.
Re the other question: If you don't run a go command with a path name, it doesn't matter what the directory structure is. I have tons of non-go projects in my GOPATH and build them with the usual tools. No issues.
I also set GOPATH=$HOME, and all my source code (not only Go) lives in $HOME/src, and I do use git for both Go and non-Go projects (so they have a .git directory in $HOME/src/what/ever/.git), and there's no problem. Why would there be one?
> [...] when /etc/resolv.conf or /etc/nsswitch.conf specify the use of features that the Go resolver does not implement, and when the name being looked up ends in .local or is an mDNS name.
Those things you can't check at compile time.
In any case, you can tell whether the binary is statically linked or not.
Statically linked:
    $ file /usr/bin/consul
    /usr/bin/consul: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped

    $ ldd /usr/bin/consul
            not a dynamic executable