Code's Worst Enemy [2007]

wccrawford · on July 5, 2011

Read all of it before I looked at the date. (The date was added to the title just before I posted this.) I thought it very interesting that Clojure and Scala feature very prominently in the comments, and yet I just heard about them a few months ago. I feel like I've had my head stuck in the ground.

As for codebase size... Yes, Java is going to have more lines due to the nature of the language. He noted that experienced Java programmers don't see all that excess any more... He claims that's a bad thing, but I think that's what makes the language tolerable for them. They just look over them.

I also have to wonder how Mirah would stack up for him. I had high hopes for the language, but it uses the built-in Java types so all that verbosity is still there for anything that uses them. It doesn't really have all the syntactic sugar that Ruby does.

PaulHoule · on July 5, 2011

It's an important part of the Java problem that Java programmers aren't allowed to admit that verbosity is a vice and concision is a virtue. It goes hand and hand with a respect for "architecture" that too often can't say that the Emperor often has no clothes, and that a thick encrustation of barnacles often masquerades as an "architecture".

Just to take an example, I wasted a few hours of my life trying to change the number of worker threads on a certain server written in Java. In a normal language with a healthy culture I should have been able to write

server.setWorkerThreads(16);

and I'd be done. But no, this is the land of the architecture astronauts, and if I want to set the worker threads I have to go find the undocumented classes so I can set the BlockingQueue and the ThreadPoolFactory and the fifteen other parameters that I don't care about (and will either set them in a way that is incorrect or suboptimal -- unless I spent two days researching each and every one of those parameters, checking the source code and doing experiments)

Although the Java language is to blame for starting this fire, the Java culture perpetuates it. When you complain about this sort of thing, people don't listen, they just say you're a bad developer because you can't see the Emperor's clothes.

jbooth · on July 5, 2011

Actually, a setter on an instantiated server object would prob be a bad idea there, unless you're proposing logic to scale up/down the number of threads dynamically. You probably want a constructor parameter.

PaulHoule · on July 5, 2011

the trouble with the constructor parameter is you need one for each and every parameter. this might be a good answer if the language supports named parameters for the constructor, but Java doesn't.

In this case, there's a ThreadPoolConfig object which you can set these things on. It turns out you can pass this to the constructor^h^h^h^h^h^h^h^h^h^h^h^h factory method AND you can pass it to a reconfigure() method on the ThreadPool -- so the (normal) static case and the (slightly odd) dynamic case are both covered.

It wouldn't be so bad if there were reasonable defaults for the ThreadPoolConfig, but there aren't. If you take the defaults, the server complains that you've got the wrong ThreadFactory when you try to start it.

In the end I was able to solve the problem by getting a copy of the thread pool configuration of the server (having to do two casts), changing the thread count, and then reconfiguring.

The final answer wasn't that bad, but it wasn't that good either. And that's the trouble w/ the Java culture. The basic design is tolerable, but it's complex, which makes it hard to understand and hard to implement correctly. A little bit of attention to (i) sensible defaults, and (ii) documentation would go a long way.

Given a fixed budget, complexity is the enemy of quality, since it creates more surface area for barnacles to accumulate.

jbooth · on July 5, 2011

Well, it does sound like a bad design, because they should just be using an ExecutorService.. better implementation and industry-standard interface. Plus you could just do new Server(Executors.newCachingThreadPool(10)).

Smart job skipping past my initial point to the deeper point, that constructor values would just accumulate, which creates a need for the builder pattern in the absence of named parameters (which would be sweet.. java 7 supposedly will give us map literals which can substitute).

PaulHoule · on July 5, 2011

It ~is~ an ExecutorService, but it's funky because the Thread(s) that this system needs are special subclasses of Thread... Which in turn means you need a special ThreadFactory and such.

Which of course points out more ugliness in Java; if there was a better generics implementation we could have the type system enforcing the correct choice of a ThreadFactory.

jbooth · on July 5, 2011

Eh, it's an ugliness in this particular program IMO. You should not be subclassing Thread, pretty much ever, and every competent Java programmer will tell you so.

That's like me writing Perl or Python code that uses globals for everything and then complaining that they're unmaintainable languages.

Out of curiosity, which system are we talking about here? Jboss or something? Java protip, avoid any library involving the letters "EE".

PaulHoule · on July 5, 2011

Grizzly, a part of Glassfish. Officially sponsored by Sun^h^h^h^h Oracle.

jbooth · on July 5, 2011

Yeah, there's your problem right there :)

Never even looked at Glassfish but it was pretty much panned from the moment it came out. Sorry you have to work with it. Recommend Jetty or Netty if you want an embedded server to work with.

JoelMcCracken · on July 5, 2011

I have read this essay a few times, and I think that steve is missing an insight into modularity. We should be writing libraries for ourselves, not just piling code on top of other code. As an argument, I first offer a few propositions.

Any code upon which your project relies can be considered "part of the code base". As an example from Rails, it has become a somewhat common/accepted practice to notice a deficiency in a gem, fork it on github, fix it for your needs, and use it. This is especially common if your project depends on abandoned gems that you still need to use, for whatever reason. Even if this isn't a problem, I often need to look into internals to figure out where something is breaking.

Then, if code that your code depends on is considered part of the project, if steve was correct, then it would be very hard to use anything more advanced.

However, practical wisdom shows that higher level such as Ruby are more productive than lower level languages, such as C. Thus, the original premise is shown false.

If the original premise is false, then what is it that steve is noticing? I think it is about leaky abstractions. If you write libraries for yourself, with good, clean interfaces, you do not need to "notice" the lines of code from other parts of the project.

tl;dr It's about creating abstractions for yourself, not pure LOC count.

seles · on July 5, 2011

I realize it is in bad taste to comment before reading the whole article (only 1/3 through yet).

But I cannot help but see the irony in taking so many words to say that size is bad for programs.

Conradaroma · on July 5, 2011

great comment

helmut_hed · on July 5, 2011

I was disappointed to see this line:

design patterns – at least most of the patterns in the "Gang of Four" book – make code bases get bigger.

This hasn't been my experience. In fact, learning about and applying design patterns has been a crucial part of reducing code size in previous projects by eliminating copy-paste code. For example, a previous company had a lot of code devoted to traversing a particular data structure and taking various actions at each node. There were multiple variants of this traversal code, each with different actions but also slightly different (for no reason) in the traversal code. Applying the Visitor pattern allowed me to write the traversal only once, reducing code duplication and potential future bugs. I also believe it was more readable.

This was in C++ though - I'm not sure how much of his comments are directed solely at Java.

gmartres · on July 5, 2011

If you had experience in a functional programming language, you'd have recognized this as a fold and written the foldMyStructure function. If your Visitor pattern solution is similar to the example code in https://secure.wikimedia.org/wikipedia/en/wiki/Visitor_patte... then it is certainly longer than the natural solution in functional programming.

Auguste · on July 5, 2011

I just realised that the game that Steve is referring to in his blog is Wyvern. It was one of the earliest multiplayer games I played. I wouldn't mind playing it again, but it seems I can't reach the website (www.cabochon.com) right now.

sdfjkl · on July 5, 2011

Game info (Wyvern): http://en.wikipedia.org/wiki/Wyvern_(video_game)

scythe · on July 5, 2011

Saying that a codebase needs to be small is like saying an airplane needs to be light: yes, it's true, in fact it's plain-and-simple obvious, but it doesn't reveal any of the hidden secrets and planning and so forth that are basically necessary to keep a codebase small and manageable. Even high-powered polymorphic languages a la Lisp will happily hand you enough rope to drown yourself in.

If you look at code as basically a description of what a program does, like, a formal description, it's unsurprising that conciseness and clarity are not entirely unrelated and that redundancy is annoying for anyone who has to read it. What's not obvious is the most clear or concise way to describe what the program does, and this is just as true in a computer language as in say a natural language.

jbooth · on July 5, 2011

Yeah, and if I went through the codebase for my current project and added a bunch of line breaks, I'd double the code "size" according to the LOC metric without changing anything about the complexity at all.

Macros kinda terrify me because they mean I can't glance at a random 5-line fragment and know exactly what's happening. I'm not convinced that they add to clarity, even while doing a lot for conciseness. To be fair, I've never used them on a project that was bigger than "tiny", so maybe I'm just casting aspersions on something I don't understand.

After macros, then what? Is the slightly redundant Foo foo = foos.next() in Java really hurting clarity that bad? I mean, yeah, the type should be inferred there but is that the reason big projects get unmanageable? Or is it higher-level architectural bloat?

EDIT: Wow. Any downvoters want to explain why this comment is so offensive to them? Perhaps they could contribute their own thoughts about complexity and code bloat?

bad_user · on July 5, 2011

     Macros kinda terrify me because they mean I can't 
     glance at a random 5-line fragment and know exactly
     what's happening

Higher-level languages like Java running on top of advanced VMs like the Hotspot JVM -- are high-level enough such that you NEVER know what's happening.

Even dropping to JVM bytecode doesn't help you much, as that doesn't show you how data moves in and out of the registers, how decisions on compiling hotspots / inlining functions are made, or how allocated objects move through GC-generations. You also lose control a lot. When you drop down to assembly, tail-calls are trivial for instance, not to mention dealing with cache pressure becomes a lot simpler.

What high-level languages give you (versus assembly) is a way to express intent, rather than details that you don't care about in general.

Which is no different from having macros in your language, or other high-level abstractions, in which you invent your own language for expressing intent, rather than deal with useless details that you don't care about 99% of the time.

    Foo foo = foos.next()

But that line isn't really clear. It's saying stuff like - give me a Foo right now. But what happens when there are no more Foos to be given? Does it return a Null? Does it trigger an exception? That line of code also doesn't express intent. So you've got a Foo. Now what? And have I mentioned that objects in "foos" may not be Foo instances?

Basically that whole line of code is totally worthless boilerplate.

jbooth · on July 5, 2011

On your first point, I meant logically what's happening, not the exact memory and heap layout. I'm pretty sure you can take that for granted in any discussion about programming ever unless the person says "in regards to the exact stack and heap layout".

You don't know the heap layout in C either, at least not without some extreme measures like reading the values of pointers, and you don't usually care.

You can't just say "oh this is sort of like this so we'll just collapse all the differences". Macros change your program logically (as in not just perf improvements) prior to compilation. Not-macros do not. That's a big difference.

That line is crystal clear (as shorthand) to a Java developer because "foos" is most likely an Iterator<Foo> which will throw an exception when there are no more foos. In an actual program it will be very explicit that it's an Iterator<Foo>. That's a context thing that's specific to Java where all Iterators behave the same way. The difference between the standard java.util stuff and macros is that each shop or team has their own macros and in my limited experience with macros, it is not always easy to tell where they're in effect.

And, the line isn't boilerplate. We're creating a reference to a Foo and assigning a value to the destination of that reference. Presumably, we're going to do some stuff with it later. Otherwise, yes, we would not be bothering to pop one out of the iterator. Is that clear now? What actual point are you trying to prove in this comment? My comment was sort of a request for actual constructive solutions. You've provided non-sequitors.

gloob · on July 5, 2011

If you haven't seen Trusting Trust already, you should read it: http://cm.bell-labs.com/who/ken/trust.html . It's short and smart.

The brief summary is, of course, you can almost never really know "logically what's happening", even in nice reliable languages like Java or Python or Haskell or Ruby or C.[0]

The easy solution (and the one that every working programmer makes every day) is to say: well, let's assume that the compiler is non-malicious and has relatively few bugs that will impact me. That's a fair response.

Next we run into the fact that just because you call a function, and that function happens to be called add(x,y), doesn't actually mean that the function will return the sum of x and y. In the absence of further evidence, it could be an HTTP server or fill your hard drive with infinite copies of Das Kapital or anything at all. The normal solutions proposed to this are (1) read the documentation, or (2) read the code. Those are both workable solutions.

Then macros come into the picture. I fail to see how the already-heavily-used techniques of (1) reading documentation and (2) reading the implementation of the macro, which seem to be at least tolerable and mostly workable with functions suddenly fall apart with macros.

[0] C and C++ have the preprocessor and macros anyway, which "change your program logically...prior to compilation", so I won't mention them further in this comment.

jbooth · on July 5, 2011

Yeah, my distrust for macros is mostly informed by my experience with C/C++ preprocessors, which have led to really annoying facepalms in my personal history.

But, as I said, I haven't actually used macros for a large project so maybe they're more workable than I give them credit for.

bad_user · on July 5, 2011

When I think of macros, I'm generally thinking about getting a syntax tree at runtime and doing transformations on it.

The closest thing related to this in a mainstream language is LINQ in .NET. When passing a lambda expression, you can either get a method reference that you can execute, or you can get a syntax tree of that expression.

It is effectively used in Linq to Sql to query databases using the same syntax you would use for querying in-memory collections of objects. You can also teach it to do pretty sweat things, like doing data transformations using the GPU. Check out this project for instance: http://brahma.ananthonline.net/

Also, macros are not necessarily executed at compile-time as in C/C++ - in Lisp a macro has no boundaries to when or how it gets executed. A macro is just a piece of code that can call normal functions and that returns a piece of code that will then get executed.

One common use-case is for achieving lazy evaluation in a language that isn't lazy and so most macros are trivial to understand. But you can have other use-cases, like in the above example with Linq, where the piece of code getting generated depends on runtime conditions.

I do agree that macros can make code harder to understand, but then again, I've seen horribly over-engineered, poorly implemented and unreadable pieces of Java code and so I think it all comes down to having a culture for code readability than about having a language that forces you to do things in a certain way.

pnathan · on July 5, 2011

Yes, but that (specific) example is a great example of a waste of space in the vast, vast, vast majority of cases. Let me show you examples of code that hides that whole iterator groadyness:

Python

for foo in Foos: baz(foo)

Common Lisp:

(loop for foo in Foos do (baz foo))

Perl:

foreach my $foo (@Foos) { baz(foo); }

C++ has a foreach construct as well, but it's a bit shoehorned. I'm relatively certain C# has some shorthand here for the same thing, and IIRC, it flows nicely.

There is no "Most likely", no "exception", it's very straightforward.

Rarely - very rarely - I care about the mechanisms of initialization, nexting, and termination. The rest of the time, (loop for var in list do) gets me by just fine.

* Macros change your program logically (as in not just perf improvements) prior to compilation. Not-macros do not. That's a big difference.*

It's no different than hunting through framework documentation.

jbooth · on July 5, 2011

Uh. Java has a foreach operator. Has had it for almost a decade now. Almost the same as the C# operator. You might want to consider dialing down your self-assuredness if you don't know the basic commands in the language being discussed.

And macros are vastly different from hunting through framework documentation, that's not the same ballpark or even the same sport. If you want macro-like behavior in Java, you can always break out AspectJ, which I strongly recommend against doing due to my biases against code that changes in between me typing it and it running.

pnathan · on July 5, 2011

No, it's exactly the same. It's reading the things you are using to understand what they do.

peteretep · on July 9, 2011

In Perl you'd be more likely to write:

bar($_) for @Foos;

PaulHoule · on July 5, 2011

In my mind excessive line breaks ARE a problem.

A key skill in programming is holding a large number of concepts together at once and thinking about them together. It's like holding up a big stack of bowls, plates, silverware, glasses and other stuff.

Large screen area makes it possible to scan this stuff with your eyes, which increases your effective working memory.

If your code takes up 3x the space, you can fit less on your screen, so your effective working memory is smaller. Practically, it makes you less intelligent.

bronson · on July 5, 2011

Care to point out a serious project with a line break problem?

jimbokun · on July 5, 2011

The "braces on separate lines" style contributes to a line break problem, in my opinion. If you use this style and have nice, concise methods and/or functions, the extra lines added for each start brace can take up a noticeable amount of your code window's real estate.

There are other code style choices that increase line count which I dislike for the same reason, this is just the first one that came to mind.

radu_floricica · on July 5, 2011

You don't really want to know everything that's going on. At some point, and over a certain project size, you just have to trust some stuff and use higher abstractions when reading/writing code. Macros are one of the tools you have for doing this in an orderly, not-want-to-kill-yourself way.

So yeah, they hide stuff, but when you have no choice but to hide stuff they can do it quite well.