Hacker News | lukev's comments

I like this framing, but it does seem to imply that a whole dev shop, or a whole product, can or should be built at the same level.

The fact is, I think the art of building well with AI (and I'm not saying it's easy) is to have a heterogeneously vibe-coded app.

For example, in the app I'm working on now, certain algorithmically novel parts are level 0 (I started at level 1, but this was a tremendously difficult problem and the AI actually introduced more confusion than it provided ideas.)

And other parts of the app (mostly the UI in this case) are level 7. And most of the middleware (state management, data model) is somewhere in between.

Identifying the appropriate level for a given part of the codebase is IMO the whole game.


100% agree. Velocity at level 8 or even 7 is a whole order of magnitude faster than even level 5. Like you said, identifying the core and letting everything else move fast is most of the game. The other part is finding ways to up the level at which you’re building the core, which is a harder problem.

Disagree, I don't particularly want to up the level at which I'm building the core. Core is where I want to prioritize quality over speed, and (at least with today's models) what I build by hand is much, much higher quality.

I think that's too easy an analogy, though.

Calculators are deterministically correct given the right input. They do not require expert judgement on whether an answer they gave is reasonable or not.

As someone who uses LLMs all day for coding, and who regularly bumps against the boundaries of what they're capable of, that's very much not the case. The only reason I can use them effectively is because I know what good software looks like and when to drop down to more explicit instructions.


> Calculators are deterministically correct

Calculators are deterministic, but they are not necessarily correct. Consider 32-bit integer arithmetic:

  30000000 * 1000 / 1000
  30000000 / 1000 * 1000
Mathematically, the two expressions are identical, and each computation is deterministic. Even so, a computer doing 32-bit arithmetic will produce different results for them, because the first one's intermediate product overflows. There are many other cases where the expected result is different from what a computer calculates.
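
To make the overflow concrete, here's a sketch in Python that emulates two's-complement 32-bit arithmetic (the helper names are mine; C-style truncating division is assumed):

```python
def wrap32(x):
    """Emulate two's-complement 32-bit integer wraparound."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x

def idiv(a, b):
    """C-style integer division (truncates toward zero)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q

# 30000000 * 1000 overflows a 32-bit int before the division...
a = idiv(wrap32(30000000 * 1000), 1000)
# ...while dividing first keeps every intermediate value in range.
b = wrap32(idiv(30000000, 1000) * 1000)
print(a, b)  # the two "identical" expressions disagree
```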

A good calculator will, however, do this correctly (as in: the way anyone would expect). Small cheap calculators resort to confusing syntax, but if you pay $30 for a decent handheld calculator or use something decent like wolframalpha on your phone/laptop/desktop you won't run into precision issues for reasonable numbers.

He’s not talking about order of operations; he’s talking about floating point error, which will accumulate in different ways in each case, because floating point is an imperfect representation of real numbers.
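
A tiny illustration with IEEE-754 doubles (Python; at magnitude 1e16 adjacent doubles are 2.0 apart, so an added 1.0 can vanish depending on evaluation order):

```python
# Same three terms, different association:
lost = (1e16 + 1.0) - 1e16   # the 1.0 is absorbed into 1e16 before subtracting
kept = (1e16 - 1e16) + 1.0   # cancellation happens first, so the 1.0 survives
print(lost, kept)  # 0.0 1.0
```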

Yep, the specific example wasn't important. I chose an example involving the order of operations and an integer overflow simply because it would be easy to discuss. (I have been out of the field for nearly 20 years now.) Your example of floating point errors is another. I also encountered artifacts from approximations of transcendental functions.

Choosing a "better" language was not always an option, at least at the time. I was working with grad students who were managing huge datasets, sometimes for large simulations and sometimes from large surveys. They were using C. Some of the faculty may have used Fortran. C exposes you to the vagaries of the hardware, and I'm fairly certain Fortran does as well. They weren't going to use a calculator for those tasks, nor an interpreted language. Even if they wanted to choose another language, the choice of languages was limited by the machines they used. I've long since forgotten what the high performance cluster was running, but it wasn't Linux and it wasn't on Intel. They may have been able to license something like Mathematica for it, but that wasn't the type of computation they were doing.


I didn't consider it an order of operations issue. Order of operations doesn't matter in the above example unless you have bad precision. What I was trying to say is that good calculators have plenty of precision.

But floating point errors manifest in different ways. Most people only care about 2 to 4 decimals, which even the cheapest calculators can handle well across a good number of consecutive typical computations. Anyone who cares about better precision will choose a better calculator. So floating point error is remediable.

Good languages with proper number towers will deal with both cases in equal terms.

Determinism just means you don't have to use statistics to approach the right answer. It's not some silver bullet that magically makes things understandable and it's not true that if it's missing from a system you can't possibly understand it.

That's not what I mean.

If I use a calculator to find a logarithm, and I know what a logarithm is, then the answer the calculator gives me is perfectly useful and 100% substitutable for what I would have found if I'd calculated the logarithm myself.

If I use Claude to "build a login page", it will definitely build me a login page. But there's a very real chance that what it generated contains a security issue. If I'm an experienced engineer I can take a quick look and validate whether it does or whether it doesn't, but if I'm not, I've introduced real risk to my application.


Those two tasks are just very different. In one world you have provided a complete specification, such as 1 + 1, for which the calculator responds with some answer, and both you and the machine have a decidable procedure for judging answers. In another world you have engaged in a declaration for which there are many right and wrong answers, and thus even the boundaries of error are in question.

It's equivalent to asking your friend to pick you up, and they arrive in a big vs small car. Maybe you needed a big car because you were going to move furniture, or maybe you don't care, oops either way.


Yes. That is the point I was making.

Calculators provide a deterministic solution to a well-defined task. LLMs don't.


Furthermore, it is possible to build a precise mathematical formula to produce a desired solution

It is not possible to be nearly as precise when describing a desired solution to an LLM, because natural languages are simply not capable of that level of precision... Which is the entire reason coding languages exist in the first place


> Calculators are deterministically correct given the right input. It does not require expert judgement on whether an answer they gave is reasonable or not.

That's not actually true. The HP-12C calculator is still the dominant calculator in business schools 45 years later precisely because it did take expert judgement to determine whether certain interest and amortization calculations were reasonable.


So, isn't this a rather longwinded way to say that a signature only extends to the scope of the message it contains?

It doesn't matter if I sign the word "yes" if you don't know what question is being asked. The signature needs to include the necessary context for the signature to be meaningful.

Lots of ways of doing that, and you definitely need to be thoughtful about redundant data and storage overhead, but the concept isn't tricky.
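
A toy sketch of the idea in Python (all names hypothetical; HMAC stands in for a real signature scheme, and the length prefix keeps the domain/payload boundary unambiguous):

```python
import hashlib
import hmac

def sign_in_context(key: bytes, domain: bytes, payload: bytes) -> bytes:
    # Length-prefix the domain so ("ab", "c") can never collide with ("a", "bc").
    msg = len(domain).to_bytes(4, "big") + domain + payload
    return hmac.new(key, msg, hashlib.sha256).digest()

key = b"shared-secret"
# The same payload signed in two contexts yields unrelated signatures,
# so a "yes" signed for one question can't be replayed for another.
assert sign_in_context(key, b"question:deploy?", b"yes") != \
       sign_in_context(key, b"question:delete-all?", b"yes")
```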


Hi, post author here. Agree that the idea isn't tricky, but it seems like many systems still get it wrong, and there wasn't an available system that had all the necessary features. I've tried many of them over the years -- XDR, JSON, Msgpack, Protobufs. When I sat down to write FOKS using protobufs, I found myself writing down "Context Strings" in a separate text file. There was no place for them to go in the IDL. I had worked on other systems where the same strategy was employed. I got to thinking, whenever you need to write down important program details in something that isn't compiled into the program (in this case, the list of "context strings"), you are inviting potentially serious bugs due to the code and documentation drifting apart, and it means the libraries or tools are inadequate.

I think this system is nice because it gives you compile-time guarantees that you can't sign without a domain separator, and you can't reuse a domain separator by accident. Also, I like the idea of generating these things randomly, since it's faster and scales better than any other alternative I could think of. And it even scales into some world where lots of different projects are using this system and sharing the same private keys (not a very likely world, I grant you).


Also, the defining feature of capitalism is that it encloses what was previously common.

Land used not to be owned (feudal lordship was functionally different than private ownership.) Then, society shifted, land became private, and that was the beginning of rent. This is enclosure.

The whole concept of IP is to explicitly extend this process to ideas -- they are not free, they are owned, and I have to pay you to use them. This is also enclosure, precisely.



I could not possibly enumerate all the possible things that have been enclosed. Human beings obviously being the most morally egregious.


Rent is charging money for access to an asset or property.

The P in IP is Property.


The "rent" in "rent-seeking" does not refer to ordinary rent; it refers to "economic rent."

Totally different concept. But don't take my word for it:

> "Rent-seeking" is an attempt to obtain economic rent (i.e., the portion of income paid to a factor of production in excess of what is needed to keep it employed in its current use) by manipulating the social or political environment in which economic activities occur, rather than by creating new wealth.[0]

> In economics, economic rent is any payment to the owner of a factor of production in excess of the costs needed to bring that factor into production. [1]

[0] https://en.wikipedia.org/wiki/Rent-seeking

[1] https://en.wikipedia.org/wiki/Economic_rent


I'm not sure how this relates to AGI.

This measures the ability of an LLM to succeed in a certain class of games. Sure, that could be a valuable metric of how powerful (or even how generally powerful) an LLM is.

Humans may or may not be good at the same class of games.

We know there exists a class of games (including most human games, like checkers/chess/go) at which computers (not LLMs!) already vastly outpace humans.

So the argument for whether an LLM is "AGI" or not should not be whether it does well on any given class of games, but whether that class of games is representative of "AGI" (however you define that.)

Seems unlikely that this set of games is a definition meaningful for any practical, philosophical or business application?


It's to do with how the creators of ARC-AGI defined intelligence. Chollet has said he thinks intelligence is how well you can operate in situations you have not encountered before. ARC-AGI measures how well LLMs operate in those exact situations.


To an extent, yes. Discovering interdependent variables, and then hopefully modeling and navigating the system they form. If that's the case, then this is a simplistic version of it. How long until tests will involve playing a modern Zelda, with quests and sidequests?


"AGI" is a marketing term, and benchmarks like this only serve to promote relative performance improvements of "AI" tools. It doesn't mean that performance in common tasks actually improves, let alone that achieving 100% in this benchmark means that we've reached "AGI".

So there is a business application, but no practical or philosophical one.


This is bad in tech. But at least we are (relatively) well equipped to deal with it.

My partner teaches at a small college. These people are absolutely lost, with administration totally sold on the idea that "AI is the future" while lacking any kind of coherent theory about how to apply it to pedagogy.

Administrators are typically uncritically buying into the hype, professors are a mix of compliant and (understandably) completely belligerent to the idea.

Students are being told conflicting information -- in one class that "ChatGPT is cheating" and in the very next class that using AI is mandatory for a good grade.

It's an absolute disaster.


My only gripe is how myopic the AI discussion on HN is. We barely talk about how it hits everyone else.

In the relocation industry, it's losing translators, relocation consultants and immigration lawyers a lot of work. Their cases are also getting tougher because people are getting false information from ChatGPT and arguing with them.

This problem is compounded by the lack of training data for that topic. I spent years surfacing that sort of information and putting it online, but with AI overviews killing the economics of running a website, it feels pointless.

I see such stories everywhere. People being replaced by something half as good but a tenth of the cost. It's putting everyone out of work and making everything worse.


Half as good at a tenth of the cost is a good replacement. I constantly make those choices myself when purchasing things.


That entirely depends on what you are buying. If you’re in need of a lawyer to keep you out of the bottom bunk, I’d happily spend a lot more for a little better.


It's fine to an extent, but it kills what happens in the other half.

You can feel it with AI-generated content and responses, in AI-generated art, customer service bots and vibe-coded software. This gradual worsening of everything won't lead to lower prices or a better experience, so it's not really a tradeoff.


I think GPs point is that you'll soon not even be able to make those choices.

Now every toilet on the market only flushes number one. But hey, they're so much cheaper.


I've been telling my curious/adrift relatives that it's a machine that takes a document and guesses what "usually" comes next based on other documents. You're not "chatting with it" so much as helping it construct a chat document.

The closer they can map their real problems to make-document-bigger, the better their results will be.

Alas, that alignment is nearly 100% when it comes to academic cheating.


The wild part is they’re having this reaction while using the most rigid and limited interfaces to the LLMs. Imagine when the capabilities of coding agents surface up to these professions. It’s already starting to happen with Claude Cowork. I swear if I see another presentation with that default theme…


This. As annoying as all sorts of 'safety features' are, the sheer amount of effort that goes into further restricting things on the corporate wrapper side makes the LLM nigh unusable. How can those kids even begin to get an idea of what it can do, when it's so severely locked down?


Could you provide an example of such a thing that is prevented?


Sure. In the instance I am aware of, SQL (and XML and a few others) files are explicitly verboten, but you can upload them as text and reference them that way; references to personal information like DOB immediately stop the inference with no clear error as to why, but referencing the same info any other way allows it to go on.

It is all small things, but none of those small things are captured anywhere so whoever is on the other end has to 'discover' through trial and error.


By my understanding, the administrators at small colleges are among the least capable professionals one might find anywhere in the economy.


A friend and I have a contract with a local university here in Canada.

They paid for custom on-prem software, and in over a year they have not fully provided the access and infrastructure needed to install it.

We have been paid already, but they paid for a tool they can’t get their shit together enough to let us install.


This is true even at large colleges. Better cut faculty jobs to deal with budget shortfall. Never mind the football program can raise $200m with a dozen phone calls.


> These people are absolutely lost, with administration totally sold on the idea that "AI is the future" ...

Doesn't sound that different from my tech job


When industrialization was taking root, yes indeed, the factory jobs sucked AND it was the future. Two things can be true


You left out the part that the non-factory jobs sucked more (or were just non-existent).

This is the opposite.


This is really interesting. I've been out of education for a long time, but I was wondering how they were dealing with the advent of AI. Are exams still a thing? Do people do coursework now that you can spew out competent sounding stuff in seconds?


I teach CS at a university in Spain. Most people here are in denial. It is obvious to me that we need to go back to grading based on in-person exams, but in our last university reform (which tried to copy the US/UK in many aspects) there was so much political posturing and indoctrination about exams being evil and coursework having to take the fore that now most people just can't admit the truth before their own eyes. And for those of us that do admit it, we have a limited range of maneuver because grading coursework is often a requirement that emanates from above and we can't fundamentally change it.

So in most courses nothing has changed in the way we grade. Suddenly coursework grades have gone up sharply. Anyone with working neurons knows why, but in the best case, nothing of consequence is done. In the worst case (fortunately uncommon), there are people trusting snake oil detectors and probably unfairly failing some students. Oh, and I forgot: there are also some people who are increasing the difficulty of the coursework in line with LLMs. Which I guess more or less makes sense... Except that if a student wants to learn without using them, they will suddenly find the assignments to be out of their league.

So yeah, it's a mess.


> Except that if a student wants to learn without using them

My son is a freshman at a major university in NYC. When he told his freshman English professor that he wanted to write his papers without using AI, he was told that this was "too advanced for a freshman English class" and that using AI was a requirement.


Now colleges will have to try and detect if you didn't use AI!


I don't understand what they think it is they're teaching? Will we teach kids to "read" by taking a photo of their bedtime story and hitting a button next?


I'm afraid you're decades late for something similar: https://en.wikipedia.org/wiki/Whole_language

One of the teaching methods is "look at the context, like pictures, and guess what the word is." One example I remember was a child reading "pony" as "horse" through association, without being able to sound the word out.


Meh, today I opened twenty PRs and felt great. That's worth it to me. (/s)

https://twentyprsaday.github.io/


That's right -- the best way to succeed within a system is to hustle as hard as you can, and definitely don't stop to question the system itself.


Better than those who just want to burn the system down with no real plan for what comes next, and unable to comprehend the inevitable bloodshed of the 'glorious revolution' that they crave.


You think you are describing the Bolsheviks, but your description is equally fitting for those who want to abolish human labor without providing people alternative ways to make a living.

And no, hand waving about "UBI" doesn't count unless they start actually doing the politics required to implement UBI.


There's a lot of bloodshed going on under the status quo. Why do you think people are 'unable to comprehend' it? Maybe they just want to reallocate it and aren't especially sympathetic to those who have avoided it up to now.


Do you comprehend the scale of the inevitable bloodshed that maintaining the status quo is bound to lead to? You don't do so any better than those you're chastising.


Most of them fried their brains with stimulants long ago. Thankfully for them, they no longer have to think. An LLM does it for them.

But it’s just the same idiots who were rabidly cheering the latest JavaScript framework a decade ago, NFTs, and all manner of ridiculous things anyone with 2 working brain cells saw transparently through.


Not sure if you're being sarcastic or not, but I think this is actually good advice. It's great to be a free-thinker and question things, but I do think there is some (monetary) value in just not asking too many questions, but optimizing to be the best at whatever you're doing.

Edit: to give an example, I probably would have done better in school had I spent less time questioning the education system and more time just accepting it and trying to get good grades.


Yeah, succeed in the system, fuck everybody else. If the system is making the world a worse place, all the better, you can take advantage since you’re in the system. All that until you find yourself spat out by the system and get to experience what you’ve been part of with no recourse.


Your interpretation of the comment in this way says more about you than anything else. Because that's not what I or the parent comment said.


and your conclusion on this situation says a lot about your current state of economic privilege and/or ignorance


The trick is to compartmentalize


I'd go a step further and say that in business software, named parameters are preferable for all but the smallest functions.

Using curried OR tuple arg lists requires remembering which argument goes in which position. This saves room on the screen but adds mental overhead.

The fact is that arguments do always have names anyway and you always have to know what they are.


I want to agree, but there is the tension that in business code, what you pass as arguments is very often already named like the parameter, so having to indicate the parameter name in the call leads to a lot of redundancy. And if you’re using domain types judiciously, the types are typically also different, hence (in a statically-typed language) there is already a reduced risk of passing the wrong parameter.

Maybe there could be a rule that parameters have to be named only if their type doesn’t already disambiguate them and if there isn’t some concordance between the naming in the argument expression and the parameter, or something along those lines. But the ergonomics of that might be annoying as well.


This is an issue in Python but less so in languages like JavaScript that support "field name punning", where you pass named arguments via lightweight record construction syntax, and you don't need to duplicate a field name if it's the same as the local variable name you're using for that field's value.


That forces you to name the variable identically to the parameter. For example, you may want to call your variable `loggedInUser` when the fact that the user is logged in is important for the code’s logic, but then you can’t pass it as-is for a field that is only called `user`. Having to name the parameter leads to routinely having to write `foo: blaFoo` because just `blaFoo` wouldn’t match, or else to drop the informative `bla`. That’s part of the tension I was referring to.


I write plenty of business code, and I do not like even the possibility of a mistake like:

    fn compute_thing(cost: whatever, num_widgets: whatever) -> Whatever;

    let cost = …;
    let num_widgets = …;
    let result = compute_thing(num_widgets, cost);
(This could be most any language, including Haskell or Lean, with slightly different syntax.)

One can prevent this very verbosely with the Builder pattern. Or one can use named parameters in languages that support them.
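
For comparison, a hypothetical Python version of the same function can use a bare `*` to force keyword-only parameters, so the swapped positional call fails outright:

```python
def compute_thing(*, cost: float, num_widgets: int) -> float:
    """Everything after the bare * must be passed by name."""
    return cost * num_widgets

cost = 2.5
num_widgets = 4

# Positional calls are rejected, so arguments can't be silently swapped:
try:
    compute_thing(num_widgets, cost)
except TypeError:
    pass  # "takes 0 positional arguments but 2 were given"

result = compute_thing(cost=cost, num_widgets=num_widgets)
```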

An interesting analogue is tensor math. In Einstein’s work, there were generally four dimensions and you probably wouldn’t lose track of which letter was which. In linear algebra, at least at the high school or early undergrad level, there are usually vectors and matrices and, well, that’s it. But in data crunching or modern ML, tensors have all kinds of cool axes, and for some reason we usually just identify them by the slot they happen to occupy in the input tensor. Some people try to creatively make this “type safe” by specializing on the length of the dimension, which is an incomplete solution at best. I would love to see adoption of some solution that gives these things explicit names and does not ever guess which axis is being referenced.

(I find 95% of ML code and a respectable fraction of papers and descriptions to be locally incomprehensible, because you need to look somewhere else to figure out what on Earth A • B' actually means.)

OCaml has a neat little feature where it elides the parameter and variable name if they're the same:

  let warn_user ~message = ... (* the ~ makes this a named parameter *)

  let error = "fatal error!!" in
  warn_user ~message:error; (* different names, have to specify both *)

  let message = "fatal error!!" in
  warn_user ~message; (* same names, elided *)
The elision doesn't always kick in, because sometimes you want the variable to have a different name, but in practice it kicks in a lot, and makes a real difference. In a way, cases when it doesn't kick in are also telling you something, because you're crossing some sort of context boundary where some value is called different things on either side.


I agree, but absence of evidence is not evidence of absence, and we currently have a lot of developers who feel very productive right now.

We are very much in need of an actual way to measure real economic impact of AI-assisted coding, over both shorter and longer time horizons.

There's been an absolute rash of vibecoded startups. Are we seeing better success rates or sales across the industry?


> "absence of evidence is not evidence of absence"

That's the same false argument that the religious have offered for their beliefs and was debunked by Bertrand Russell's teapot argument: https://en.wikipedia.org/wiki/Russell%27s_teapot


I'm not speaking of burdens of proof about unfalsifiable statements.

I'm saying that I think this is an important enough question that I think we should seek real evidence in either direction, especially since apparently everyone already has a strong opinion (warranted or not.)

