A bunch of comments seem to be comparing and contrasting CMake+Ninja vs. Bazel. Just my two cents, but I think that's missing the real point of this work.
It isn't about whether Bazel is a better build system for LLVM. At least, that isn't my motivation.
There are users of LLVM's libraries that use Bazel. Whether for good or bad reasons, it is extremely useful to enable them to use LLVM's libraries with a fully native Bazel build. That is my motivation: enabling the users of LLVM libraries that need to use Bazel for whatever reason to have the best possible experience.
If anyone can recommend a file to modify, I'd also be happy to test out incremental build times. Anyway, super glad to see Google externalizing some of this stuff! I also have the metrics from this Prometheus instance saved, so if anyone wants to see any other metrics from node_exporter, let me know!
fwiw bazel "felt" slower because the number of targets is an estimate and kept going up, while cmake made constant progress towards "finished". I'd love to see how the bazel time looks with remote build execution enabled, to see how much faster that goes.
I would be extremely surprised if Bazel were 4m faster than cmake+ninja with the same build config.
If this really is a correct result, I bet the ninja people would want to hear about it. (And, kudos to the Bazel folks!) But I think it's more likely measurement error. Maybe the build flags or host compiler aren't the same in the two configurations?
Yea, you can also see from the linked dashboards that it's writing way more stuff at the end of the build. I was watching the log just now and it looks like it's linking some things. I hope someone with more Bazel and LLVM knowledge can give this a go. I just wanted to toss something up as a "what would it be like if I just compiled this project" perspective.
This is absolutely not a definitive "X is better than Y" since X and Y are completely different things.
Another fun thing you can do is make another clone of the llvm-bazel repo and build that using Bazel again. Things should've cached correctly, and the rebuild from a totally different checkout should be pretty much instant, as long as your ~/.cache/bazel is still there.
I would double-check that it builds the exact same thing (there are lot of options when building LLVM, and CMake may auto-detect and enable support for things like Go bindings for example). Choice of compiler and compiler flags can make a difference as well.
The world needs a Bazel package management system so the Bazel community can collaborate on projects like this. Buckaroo [0] is the closest I have seen, but it's Buck-focused and they have not rolled out Bazel support.
I think ultimately it needs a company like npm that can upstream the needed changes to Bazel itself so that it supports these workflows better. With the right focus you could hit cargo levels of ease of use, but for large multi-language projects.
It's in an open-to-comments state and the author of the design doc has been SUPER responsive and extremely amazing to talk to. He's been fully willing to put up with all of my stupid questions.
Github is already a bazel package management system though? If the package is a bazel workspace all you need to do is add a http_archive rule pointing to that github repo
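To make that concrete, here's a sketch of what that looks like (the repo name, version, and sha256 below are made up for illustration): pulling a Bazel workspace from GitHub is just an http_archive rule in your WORKSPACE file.

```python
# WORKSPACE -- sketch only; repo name, tag, and sha256 are placeholders.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "some_lib",
    # Pin a specific release archive so the build is reproducible.
    urls = ["https://github.com/example/some_lib/archive/v1.2.3.tar.gz"],
    strip_prefix = "some_lib-1.2.3",
    sha256 = "0000000000000000000000000000000000000000000000000000000000000000",
)
```

Targets inside it are then addressable as `@some_lib//path/to:target` from your own BUILD files.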
That would work if, like with golang, bazel were the "default" package manager for everyone. Right now it's not easy to get, for example, Vulkan or musl or Qt as a bazel package.
It's also not easy to publish a version of your package (A) that depends on another package (B). This creates a diamond-problem-like situation when your package (C) depends on both packages (A->B, C->A, C->B). So some code needs to resolve these conflicts and reproducibly identify the exact hashes of everything to pull in, to make it a non-manual process.
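As a toy illustration of what that resolver has to do (package names and hashes here are invented): walk the dependency graph from the root, pin each package to one exact revision, and fail loudly if two edges demand different pins of the same package.

```python
# Toy sketch of diamond-dependency resolution: C -> A, C -> B, A -> B.
# All names and hashes are hypothetical.
deps = {
    "C": {"A": "abc123", "B": "def456"},
    "A": {"B": "def456"},
}

def resolve(root, deps):
    """Pin every transitive dep of `root` to a single revision hash."""
    pinned = {}
    stack = [root]
    while stack:
        pkg = stack.pop()
        for dep, rev in deps.get(pkg, {}).items():
            if dep in pinned and pinned[dep] != rev:
                # The "diamond problem": two consumers want different pins.
                raise ValueError(f"conflicting pins for {dep}")
            pinned[dep] = rev
            stack.append(dep)
    return pinned

print(resolve("C", deps))  # {'A': 'abc123', 'B': 'def456'}
```

If A had pinned a different hash of B than C did, this would raise instead of silently picking one, which is roughly the situation that currently has to be untangled by hand.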
Also, something great about the design docs linked in my other post: there's a presubmit.yaml standard, so pulling in a library will include tests that bazel will run for whatever arch you're compiling for. For instance, say you pull in sqlite and need to build it for RISC-V. Before, you just had to hope that sqlite worked correctly on your arch; now you'll be able to test those situations in CI with RBE runners for all architectures.
> That would work if, like golang, bazel was the "default" package manager for everyone. Right now it's not easy to get, for example, vulkan or muslc or qt as a bazel package.
I agree, but I don't think a "Bazel management system" would solve this issue, because the problem is people buying into bazel in the first place
> It's also not easy to publish a version of your package (A) that depends on another package (B). This would create a diamond-problem like situation where your package (C) depends on both packages (A->B, C->A, C->B). So, some code needs to resolve these issues and reproducible identify the exact hashs of everything to pull in to make it a not-manual process.
This is a good point. Realistically, though, I think effort would currently be better spent on making it easy to bazelize existing code. I have (unfortunately) never been able to pull in an external library without manually bazelizing it, and this only actually becomes a problem once bazel picks up enough momentum in OSS that you're likely to find an external library that is already bazelized.
How do the existing repos that do have this dependency structure solve the problem? For example there are loads of packages that depend individually on Abseil. If my package uses Abseil and it uses tcmalloc, it also uses Abseil by way of tcmalloc, but in practice this does not seem to cause trouble.
Each dependency appears as a "repository", so as a bazel target it will look like "@<thing>//some/target:file". Everything refers to a dep by its workspace/repository name and exports a `repository.bzl` or `workspace.bzl` file that your WORKSPACE file `load()`s and calls a function from.
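A sketch of that pattern (names hypothetical): the dependency ships a `workspace.bzl` that declares its own transitive deps as a function, and the consuming WORKSPACE loads and calls it.

```python
# WORKSPACE -- hypothetical names throughout.
load("@some_lib//:workspace.bzl", "some_lib_deps")

# Declares some_lib's own dependencies as external repositories,
# typically skipping any the enclosing workspace has already defined.
some_lib_deps()
```

It works, but every consumer has to remember to make this call (and in the right order relative to its other deps), which is part of the manual friction being discussed here.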
It does seem a bit high touch, but don't I also have the alternative of just cloning third party code into my repo and bazelizing it myself? I've certainly seen that done, and it's what Google does internally as well.
It's possible, and it's what I've done quite a bit when using bazel, but it makes code sharing very difficult. I think the internal desire, from Google, likely comes from TensorFlow and Cloud wanting to ship code easily to the OSS world. One of the reasons PyTorch is taking off is that people can build it easily!
Not every package (especially core system packages, like zlib/openssl/glibc/...) is on GitHub or wants to pull Bazel build files into its source tree. As such, there's no guaranteed canonical-upstream-repo:buildfile-repo mapping, so you need some way to organize things, keep track of what's where, and make sure everything works well together.
While Bazel's design seems not bad, could they have built it in something that doesn't require Java? That's especially important for bootstrapping. CMake at least requires only C++.
Advocating for Java as a general-purpose tooling language sort of overlooks the incredibly broad and pervasive install base of machines more than ten years old, as well as the rapidly growing install base of ARM SoC machines.
For many, a tool eating up a gigabyte or more of memory is certainly not acceptable.
Not everyone has a powerful machine to do so with; many folks are still using computers that shipped with WinXP, or even Vista or 98/Me. It's a huge world outside of California.
Personally, my primary home/non-work machines are an RPi4 and a Pinebook Pro.
No, Bazel simply fails, because of its technology choice, to satisfy the very real need of building on less powerful machines, a need served by other tools competing with it, and is therefore not a truly general, one-size-fits-all solution.
The fact that you personally deem the needs of others illegitimate doesn't preclude them from actually existing.
Travel to Africa, India and South East Asia some time. The world is a big place outside of California, and the hardware available to many is not what you'll find at a cafe in San Francisco.
The RPi is an example of a growing SOC device market, but I did mention as well the enormous install base of older and under-powered devices in general.
I don't live in California (nor the US), I've only visited twice or so.
I'm from a place similar to those you're describing.
Few people would get enthusiast gear like a Raspberry Pi. What they would get instead is an old x86 PC with pirated Windows, a crummy knock-off Chinese laptop, or a locally assembled PC (so not from the big OEMs).
Not quite. Even rather poor people in poor countries can get PCs with decent computing power, as PC performance plateaued circa 2010 and old PCs are really cheap. You don't need to get something 20-25 years old when something 5-10 years old is maybe 20% more expensive and a lot more powerful.
I'm curious what you consider "insane" to be -- and I mean this genuinely. Google has what I think most would consider to be a large monorepo, but also provides developer desktops with ample RAM. I've noticed blaze's startup time, but never really thought about its memory consumption. Where are you running into limits?
Are you using remote execution with Bazel?
If you have hour long builds, it's often worth investing in remote execution (just remote caching can help too).
I think open-source projects avoid Bazel not because of the Java issue. You can install it pretty easily and never realize that it's written in Java.
By analogy, I'd say that if Make is a hammer, Bazel is a CNC machine. Most DIYers are going to have hammers, and everyone understands how to use them, but CNC machines are becoming cheaper and more common.
The official .deb is 156MB compressed, which is prohibitive for all but the largest OSS projects. I assume much of that is the bundled JRE, but that only illustrates the point about Java.
Most tellingly, Google has gone through two major build system migrations (Android and Chrome), and neither project chose to migrate to Bazel/Blaze. If Google won't eat their own dogfood, it doesn't inspire much confidence.
> If Google won't eat their own dogfood, it doesn't inspire much confidence.
Chromium's gn started being prototyped in 2013 [1].
Android's soong started being developed in June 2015 [2].
Bazel's first open source release was in September 2015 [3].
In addition, you surely can't be serious about Google not 'dogfooding Blaze' - it's a critical build system internally at Google. And new external projects (like gVisor) are also now built using Bazel.
> The official .deb is 156MB compressed, which is prohibitive for all but the largest OSS projects.
I'm not sure what to make of this statement. Why is 156MB prohibitive? You don't need to include Bazel in the project any more than you need to include Xcode for the macOS version of a project. You can specify a version of Bazel with .bazelversion so you don't need to worry so much about people using the wrong Bazel.
There are some systems like Waf and Autotools where the build system is customarily bundled inside the source release, but this is not universal--if you use CMake, it's almost certainly not bundled either.
> Most tellingly, Google has gone through two major build system migrations (Android and Chrome), and neither project chose to migrate to Bazel/Blaze. If Google won't eat their own dogfood, it doesn't inspire much confidence.
Android is a fairly large project itself consisting of the Kernel (which has got its own custom build system) and a ton of different components. From what I understand, for NDK projects, Bazel is moving towards "the one sane way to do things", it will just take time, this stuff doesn't happen overnight. For non-NDK (pure Java or Kotlin) projects, there's not really a point.
Chromium is a bit of a special snowflake and predates Bazel's Windows support, and a codebase the size of Chromium would take a long time to migrate, but it looks like it's heading in that direction. From what I can see in the revisions to the Chromium build system, its massive custom build system is moving towards Bazel by leaps and bounds. It's already structured like a Bazel project and uses much of the same terminology; I wouldn't be surprised if a few hundred Bazel scripts appeared overnight in Chrome, because it looks like much of the groundwork has been done.
Most open-sourced Google projects use Bazel now. Of course, projects like Android and Chrome have been around for a long time and have invested in their build systems, so any change will take years.
Android and Chrome also both use git for version control. Take that how you will, but I don't think it's an indictment of Piper (https://research.google/pubs/pub45424/)
The speed and responsiveness are there, but the price is tricky management of a background daemon plus aggressive memory prefetching and caching, which can be heavy on memory resources (which are then less available to the compiler processes). This works out well when you run the Bazel daemon locally and distribute the work to a remote build farm, but less so when you're building a project like LLVM on your laptop.
To build Bazel from source in an environment where you don't want to use any binaries from the Internet, you need a Java toolchain built from source. Needing a Java is painful. You also need a Bazel to build Bazel.
And Bazel's own build, IIRC, wants to download stuff from the Internet at build time; that can be pre-fetched, but distro maintainers still don't like having to deal with that.
As far as implementation languages go, Java is mature, extremely stable, doesn't suffer from the quirks and foibles of older languages like C++, and is slower-moving than some of the newer languages. I know I've complained about running Java services but I think it is a great choice for writing a build system.
LLVM has the most complicated CMake setup I’ve ever seen. I’m sure there’s a reason for that, but I always loathe hacking around in the source for that reason. Finding which cpp file belongs to which CMake file is an annoying and often frustrating task. Their CMake scripts also make some assumptions on Windows, making it impossible to build LLVM with LLVM without editing CMake files (at least, last time I checked). To me, the whole value of CMake is to allow you to swap in different toolchains with ease.
So I wish someone smarter than me would go clean up the current CMake setup instead of using another build system altogether. But I guess any improvement to build usability is a good thing.
I think the purpose of this project isn't to change the LLVM build system, but rather to integrate LLVM better into Bazel. Bazel offers deterministic builds (which allows things like content-based caching and early stopping), but for that to work well, you often need to build the toolchain itself using Bazel as well—since the object files depend on the compiler too, not just the source files. Currently, the system is kind of broken: if you upgrade your compilers, Bazel doesn't catch the changes, and your build cache ends up potentially breaking as a result.
That said, I'm not sure why they don't just hash the compiler and library binaries instead of building them from scratch... but that's a different question.
Right, the Google monorepo is more monolithic than anyone can imagine. The compiler was part of the repo as well. I forget how the bootstrap works, but it definitely builds a compiler if needed (most often there is a cached version somewhere already, so there's no risk of small changes resulting in gigantic builds).
> Their CMake scripts also make some assumptions on Windows, making it impossible to build LLVM with LLVM without editing CMake files (at least, last time I checked).
As of recently (LLVM/Clang 11), its CMakeLists.txt still doesn't work with Clang as a compiler (and MSVC's headers). Partly because <atomic> doesn't compile in Clang's C++11 mode (which the .cmake files use when testing for atomics), and partly because it can't detect Clang's host/target (don't know) triple. I recall getting build-time failures as well as CMake setup problems.
It's also not possible to build the Go or OCaml bindings with MSYS2/MinGW because of some silly assumptions in the CMake file.[0] I have an open issue describing exactly what needs to change, as well as a patch for it, but haven't been super successful at getting any attention on the issue and the patch submission process is fairly time consuming for such a small change.[3]
>Their CMake scripts also make some assumptions on Windows, making it impossible to build LLVM with LLVM without editing CMake files (at least, last time I checked).
Any specific info on this? I build llvm and various associated projects, mostly on linux but on windows and macos sometimes, and don't remember issues of this nature.
I just kicked off a build of llvm and clang on windows using VS 2019 community and it finished just fine. I used the Ninja generator for speed but "Visual Studio 16 2019" also works. Those are the only two I do.
EDIT: ah, using llvm/clang to build llvm/clang. I have not done that, I use VS community. I'm not sure this is actually a CMake issue though.
Could be that it's harder to "clean up CMake setup" than implement Bazel setup. Google compiles everything from source, so this is probably just a repackaged variant of what they use internally in their monorepo.
The idea of having subpackages where one subpackage is used to build another breaks CMake's (and autotools', and most things') brains.
I've been talking to the Meson devs about how it would be really nice to nail this in Meson.
Compilers' build systems are usually an absolute mess, and there's no reason it needs to be this way. As a distro maintainer, compilers' terrible build systems are in fact my #1 source of issues.
I was pleasantly surprised by Bazel. It seems to solve pretty much every concern I had, with not much downside. The biggest pain point was not being able to easily put the massive build outputs on an external hard drive, but you can do it with --disk_cache and --output_base. Unfortunately, lots of scripts don't expose a direct interface to the bazel command, making it necessary to modify them.
Bazel is like git... it is elegant in a lot of ways, but it's also awful in a lot of ways. The nice thing about it is that, overall, it's just better than the other stuff you find out there, and (I think) that's because it's very composable, a property many other build systems don't have. But it has lots of rough corners once you get your hands dirty.
You can put things like --disk_cache and --output_base into your ~/.bazelrc to globally apply it to everything without having to pipe it through a script.
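For instance, something like this in `~/.bazelrc` would do it (the paths here are placeholders). Note that `--output_base` is a startup option while `--disk_cache` is a build option, so they go on different lines:

```
# ~/.bazelrc -- example paths only
startup --output_base=/mnt/external/bazel-out
build --disk_cache=/mnt/external/bazel-disk-cache
```

Scripts that shell out to `bazel` then pick these up automatically, with no need to patch them individually.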
I have never seriously used bazel. Just looking at it, I found the thought of manually mirroring all #includes into the build configuration in a big project pretty aggravating.
Is that not really an issue in practice? Do people automate that away?
Google automates it away, to the extent that if you run our "rebuild if anything changes" tool it will automatically run the BUILD cleaning tool if any layering violations happen or if any dependencies are absent.
People use globs (wildcards) and group targets together to make it more manageable. It still does present some friction moving forward, but it's much more pleasant than listing every file individually, and there's a kind of elegance to the system when you get used to it. And it does warn you if you include files that you didn't depend on. (Pro tip: never use include=...; always depend directly on a cc_library(hdrs=...).) That said, I do wish there were a way to make it easier and avoid the redundancy.
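The pro tip above looks roughly like this in a BUILD file (target and file names are made up): the library exports its headers via `hdrs`, and consumers depend on the library target rather than reaching for loose header files.

```python
# BUILD -- hypothetical targets
cc_library(
    name = "hashing",
    srcs = ["hashing.cc"],
    hdrs = ["hashing.h"],  # exported header: dependents may #include it
)

cc_binary(
    name = "tool",
    srcs = ["main.cc"],
    deps = [":hashing"],  # depend on the library, not on loose headers
)
```

With this shape, bazel can flag layering violations: if main.cc includes a header from a target not listed in `deps`, the build complains.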
By “mirroring the includes” do you just mean listing the .h files in the build files? Don’t most build systems already have that problem for cpp files? Tbh I prefer also being explicit about the headers packaged in a library as well as the source.
So far, it seems to result in a more accurate Makefile (with includes), but my project is pretty small. The Erlang compiler can also output a dependency file, with pretty similar looking arguments, so that helps too.
> By “mirroring the includes” do you just mean listing the .h files in the build files?
It's not just listing them once, so they get installed or such. You need to list them as dependencies for objects that get built.
> I just found this tutorial to get the compiler to list your dependencies for you with (GNU) Make
Yea, that works very well. A lot of projects sticking with make use that via some makefile hackery (e.g. postgres). CMake etc. do so automatically as well (at least for the common generators).
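The usual makefile hackery amounts to asking the compiler to emit a `.d` dependency file per object and then including them all, something like:

```make
# Sketch of compiler-generated header dependencies in GNU Make.
SRCS := $(wildcard *.c)
OBJS := $(SRCS:.c=.o)

# -MMD writes foo.d listing the headers foo.c includes;
# -MP adds phony targets so deleted headers don't break the build.
%.o: %.c
	$(CC) -MMD -MP -c $< -o $@

# Pull in whatever .d files exist from previous builds.
-include $(OBJS:.o=.d)
```

The key point is that the compiler, not the developer, maintains the header-level dependency graph.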
But with bazel you basically have to maintain that manually at a granular level. So if you add a new #include to some random .c file, you often also need to add the corresponding dependency. There's a bunch of error checking around that to make it easier to get right, but it still seems like a lot of work to me.
Would this allow people to host out-of-tree LLVM-based projects more easily? Last time I checked (shamefully, probably 2 or 3 years ago), LLVM still heavily preferred in-tree projects (creating a sub-directory and CMakeLists.txt file) for source-code-level distribution. If I could just `deps = ["@llvm-project//:xxx"]` to start writing new passes for LLVM bitcode, it would be very exciting.
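If this lands, I'd hope an out-of-tree pass's BUILD file could look something like the following (the labels and file names here are my guesses, not targets this work necessarily defines):

```python
# BUILD -- hypothetical out-of-tree LLVM pass; labels are guesses.
cc_library(
    name = "my_pass",
    srcs = ["MyPass.cpp"],
    deps = [
        "@llvm-project//llvm:Core",     # guessed label for IR/pass APIs
        "@llvm-project//llvm:Support",  # guessed label for support libs
    ],
)
```

No in-tree subdirectory, no CMake cache wrangling: just a workspace-level dependency on the upstream repo.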
I'm not sure exactly what you mean by synchronicity, but that's basically how e.g. package-lock.json works... it's the direction that everything is moving in, and Bazel is a target because it's imposing this on new frontiers rather than just being a new language where "things work that way".
I think both sides want to erase that difference, and I commend them for it.
> I'm not sure exactly what you mean by synchronicity, but that's basically how e.g. package-lock.json works... it's the direction that everything is moving in, and Bazel is a target because it's imposing this on new frontiers rather than just being a new language where "things work that way".
package-lock.json stuff is fine, and Nix has long required that everything be locked.
That is the good kind of synchronicity: it's OK if upstream moves faster; downstream catches up and just pins whatever they are using. You can see how upgrade waves cascade downstream, because the synchronizing is local.
The internal Google model is that everything lives in one repo and must update all at once. This is global synchronicity.
----
It's a lot like one giant lock vs. multiversion concurrency control. The MVCC aspect is that there may be a few versions in flight as things cascade down, but old versions should be retired (like committed transactions) as new ones arrive.
The consistency is basically the pinning: never mind what other transactions/development is going on elsewhere, each package sees a consistent view as structured by the pins.
If you go listen to Google people talk about dependency/version management, it's like they think it's global lock vs. inconsistency. Monorepo or chaos. This false choice irks me.
I guess their dev tools people should talk to their DB people?
> If you go listen to Google people talk about dependency/version management. It's like they think it's global lock vs inconsistency. Monorepo or chaos. This false choice irks me.
I’d say this is a much more nuanced look at dependencies and versioning. We all know these days that something like semantic versioning is not enough, because it is too easy to depend on behavior which the developers of your dependencies did not expect you to depend on. So you need something more powerful than just having a bunch of developers get together and agree not to break each other’s code.
Bazel is part of that solution: if you can analyze and understand the dependencies in a project, and do so in a very automatic and reliable way, you can provide people with tools for updating downstream code when upstream code changes.
I think people are focusing too much on Bazel as only a build system and not as part of a platform for building developer tooling, and people are focusing too much on Bazel’s original incarnation as a build system for a massive monorepo.
I think there’s also a reality we must deal with: as software gets larger and more complex, and as we have more dependencies, relying on version pinning for stability is losing ground as a technique for making our software more reliable. We have many more dependencies in our software now, so just subscribing to a mail feed of all the version bumps in the libraries you use would be enough to drown you, and the longer you stay on older versions, the more painful the upgrade becomes.
My bad experiences have been trying to package bazel and bazel-built things in Nixpkgs.
Bazel and Nix both want to control everything... and it gets ugly. Nix is pretty well layered, and while it has plenty of UX problems, it's quite easy to predict what it will do.
Bazel has the classic problem of new, well-funded software: it's more polished than architected. It feels like working with an amorphous blob where all that polish turns on you when you do something weird, and it oozes unforeseen complexity and interactions.
I’m interested in the specific issues you hit, not just additional metaphors. Bazel, while there’s a learning curve, exists to provide hermetic builds, so I’m surprised that “what it will do” is not predictable.
I ran into an issue where bazel broke things pretty badly. Turns out that bazel does PID namespacing, and the build scripts assumed that no two programs with the same PID would share TMPDIR which caused some rather odd behaviors.
What does it mean by "bazel" wants to control everything? AFAIK, it simply needs ways to get to the source of another package - whether it's http_archive, or who knows what, and how to build it.