No. It's not more AI. The solution is designing, and sticking to, a development process that is more resilient to errors than the one currently in use. This isn't a novel idea. Code reviews weren't always part of the process; neither was VCS, nor bug trackers, etc.
The way AI is set up today, it's trying to replicate the (hopefully) good existing practices. Possibly faster. The real change comes from inventing better practices (something AI isn't capable of, at least not the kind of AI that's being sold to the programmers today).
What better practices do you mean? Are you saying we just need different, more agentic-friendly practices that ensure scaled reliability beyond what we can manually check? If so, I totally agree.
AI is 100% capable fundamentally of making new processes. Look I mean it’s not like I think opus 4.7 is all you need, but how can you argue with the fact that adoption since 4.5 has been an inflection point? That’s kind of proof that reliability has reached a level that serious usage is possible. That’s over a period of months. When you zoom out further you see this is extremely predictable even a few years ago, despite the absolute hissy fits thrown on HN when CEOs began saying this.
Agentic coding is verifiable, and this implies there are very few practical limits to what it can do. Combine that with insanely active research on tackling the remaining issues (hallucinations, which are not a fundamentally unsolvable problem at a practical level; context rot; continual learning; etc.).
I literally listed examples above... Code reviews weren't the norm until some time around 2010-ish. Then programmers realized that reviews help improve the code quality, and, eventually, this became so popular that today virtually everyone does it.
Anyways, I'll give an example from something that I've personally experienced / contributed to, which isn't as massive of a thing as code reviews, but is in the same general category.
Long ago, Git didn't have the --force-with-lease option. Few people used `git rebase` because of that (the only way to make it work was to follow it with --force, which could destroy someone else's work). At the company I worked for at the time, we extended Git to have what was later implemented as --force-with-lease. Our motivation was the need for linear history and some other stricter requirements on the repository history (s.a. every commit must compile, retroactive modifications in response to tests added later etc.)
This is an example of how a process that until then was either prone to accidental loss of a programmer's work or resulted in poorly organized history was improved by inventing a new ability. This is also an example of something AI doesn't do, because, at its core, it's a program that tries to replicate the best existing tools and practices. It won't imagine a new Git feature, because it has no idea what that could possibly be; its authors don't know either.
> opus 4.7 is all you need, but how can you argue with the fact that adoption since 4.5 has been an inflection point?
Right no I understand what you mean, I asked to be sure and you’ve confirmed my understanding.
I think we're talking past each other, because your comment is like 99% interesting and insightful, and I agree with it completely, but there is one part of your claim that I take issue with, which is
> It won't imagine a new Git feature because it has no idea what it could possibly be because its authors don't know that either.
I left comments in other threads with a lot of detail, but this is a fairly common misconception. It is true in a sort of practical sense today, and I have many experiences like yours with respect to this, but the gist is: this is a world of RL with verifiable rewards; you are not bounded by human ability at all, and that is why we have the adoption, funding, and frothy excitement. It is not simply mimicking human coding. In the early stages it does, because human programming traces are used as a kind of bootstrap to get to an RL phase, which has no such limit on performance. This is a very well studied field, and by now it isn't much of a question of if; it's not even really a question of when.
> What did it invent?
This is a perpetual question with constantly moving goal posts so I’ve given up convincing anyone but by now it’s solving unsolved Erdos problems, not sure how convincing you find that (not opus though but that hardly matters now)
The point I'm trying to make is: we aren't there yet, but it's a crazy idea to think that isn't imminent given all of the measurements and observations we have.
Additionally, my point on 4.5 being a turning point is adoption. You wouldn't see these adoption numbers if we were not accelerating rapidly from, say, 3.x performance along the scaling trend that we've known about for years now.
> systemic tech debt is now addressable at scale with LLMs.
Is there any reason to believe this? I've only seen the evidence of the contrary so far.
My experience with AI coding aides is that they, generally:
1. Don't have an opinion.
2. Are trained on code written using practices that increase technical debt.
3. Lack in the greater-perspective department: they are more focused on the concrete, superficial, and immediate.
I think I need to elaborate on the first and explain how it's relevant to the question. I'll start with an example. We have an AI reviewer and recently migrated a bunch of the company's repositories from Bitbucket to GitLab. This also prompted a bunch of CI changes. Some Python projects I'm involved with, but don't have much authority over, switched to complicated builds that involve pyproject.toml (often including dynamic generation of this cursed file) as well as integration with a bunch of novelty (but poor quality) Python infrastructure tools that are used for building Python distributable artifacts.
In the projects where I have authority, I removed most of the third-party integration. None of them use pyproject.toml or setup.cfg or any similar configuration for a third-party build tool. The project code contains bespoke code to build the artifacts.
These two approaches are clearly at odds. A living, breathing person would believe either one approach or the other to be right. The AI reviewer had no problems with this situation. It made some pedantic comments about style and some fantasy-impossible-error-cases, but completely ignored the fact that, moving forward, these two approaches are bound to collide. While it appears to have an opinion about the style of quotation marks, it doesn't care at all about strategic decisions.
My guess as to why this is the case is that such situations are genuinely rarely addressed in code review. Most productive PRs, from which an AI could learn, are designed around small, well-defined features in a pre-agreed-upon context. The context is never discussed in PRs because it's impractical (it would usually require too large a change, so the developers don't even bring up the issue).
And this is where the real, large, glacier-style deposits of tech debt live. It's the issues developers are afraid of mentioning because of the understanding that they will never be given the authority and resources to deal with them.
You are not wrong about anything you’re saying but like I said this misses the forest for the trees. I’m talking about like the next ~2 years. There is a common idea that we don’t understand this technology or what will happen performance wise. We know a lot more about what’s going to happen than people think. It’s because none of this is new. We’ve known about neural nets since the 40s, we know how RL works on a fundamental level and it has been an active and beautiful field of research for at least 30-40 years, we know what happens when you combine RL with verifiable rewards and throw a lot of compute at it.
One big misconception is that these models are trained to mimic humans and are limited by the quality of the human training data. This is not true, and that is also basically the entire reason why you have so much bullishness and premature adoption of agentic coding tools.
Coding agents use human traces as a starting point. You technically don’t have to do this at all but that’s an academic point, you can’t do it practically (today). The early training stages with human traces (and also verified synthetic traces from your last model) get you to a point where RL is stable and efficient and push you the rest of the way. It’s synthetic data that really powers this and it’s rejection sampling; you generate a bunch of traces, figure out which ones pass the verification, and keep those as training examples.
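To make the rejection-sampling part concrete, here's a toy sketch of the loop being described; the model and verifier interfaces are hypothetical placeholders, not any particular lab's pipeline:

    # toy sketch of collecting verified traces via rejection sampling;
    # "model" and "verify" are hypothetical stand-ins, not a real API
    def collect_examples(model, tasks, verify, samples_per_task=8):
        kept = []
        for task in tasks:
            for _ in range(samples_per_task):
                trace = model.generate(task)    # candidate solution / agent trace
                if verify(task, trace):         # e.g. compile and run the test suite
                    kept.append((task, trace))  # keep only traces that pass verification
        return kept

The key point is that the filter is a program (tests, a compiler, a proof checker), not a human rater, which is why the human data is only a bootstrap.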
So because
- we know how this works on a fundamental level and have for some time
- human training data is a bootstrap, not a fundamental limitation
- you are absolutely right about your observations yet look at where you are today and look at say Claude sonnet 3.x. It’s an entire world away in like a year
- we have imperfect benchmarks all with various weaknesses yet all of them telling the same compelling story. Plus you have adoption numbers and walled garden data that is the proof in the pudding
The onus is on people who say “this is plateauing” or “this has some fundamental limitation that we will not get past fairly quickly”.
> look at say Claude sonnet 3.x. It’s an entire world away in like a year
In the area I work in, I find them to be of very little value, both then and now... I see no real difference. They help with marginal tasks. Eg. they catch typos, or they help new programmers explore the existing codebase faster.
So far, I haven't used a single line of code generated by AI, even though I've seen thousands. Some of them worked to draw attention to a problem, but none solved it successfully. It was all pretty lame.
I see no reason to believe it's going to get better. Waving hands more forcefully isn't helping, there's no argument behind the promise of "it will get better". No reason to believe it will...
But, more importantly, the AI is applied on a level where really important things don't happen. It's automating boilerplate work. It doesn't make decisions about the important parts. Like, in the example above, the AI is not capable of choosing a better strategy: use pyproject.toml or write code to build Python packages? It's not the kind of decision it's called to make and nobody sensible would trust it to make such a decision because there isn't a clear right or wrong answer, only the future will prove one or the other to be the right call.
> So far, I haven't used a single line of code generated by AI, even though I've seen thousands. Some of them worked to draw attention to a problem, but none solved it successfully. It was all pretty lame.
I find this statement highly suspect. AI coding agents nowadays can spot subtle object lifetime management issues and even dependency lifecycle incompatibilities, and here you are stating you are unable to use them to fix things? How strange.
Not to mention that coding agents excel at creating greenfield projects and migrating whole frameworks.
But if you feel you can't use them then I feel sorry for you.
I think if you honestly don’t believe there is a major difference between 3.x and 4.7 I don’t think there is much anyone will be able to do to convince you. I do find it disappointing when technical professionals are so disinterested in building a real understanding of a fairly complex topic.
> I see no reason to believe it's going to get better. Waving hands more forcefully isn't helping, there's no argument behind the promise of "it will get better".
That's a real bummer to read from someone who sounds like a professional, and not only a professional but someone thoughtful and smart. 30 years of brilliant work in RL, Bayesian stats, machine learning, measurement, and then trillions of dollars of funding and some of the best talent in the world, and your assertion is "I tried it on my codebase and I didn't like it, and that trumps literally entire fields of mathematics and statistics". I mean, have you heard of Chinchilla scaling laws? Do you know how RL works? Are you aware of benchmarks, their strengths and weaknesses? Are you following adoption numbers, accomplishments like new proofs of unsolved Erdős problems?
> But, more importantly, the AI is applied on a level where really important things don't happen. It's automating boilerplate work.
Your experiences are your experiences, I don’t know what work you do or how it gets done, what languages you’re working with etc. but literally we’re at the point where the vast majority of code at major tech companies is fully AI written (not assisted).
> It's not the kind of decision it's called to make and nobody sensible would trust it to make such a decision because there isn't a clear right or wrong answer
What are you claiming is not fundamentally possible for an AI to do that a human can do here? People make judgement calls on ambiguous problems, taking into account vast amounts of context about the business, dev time, reliability, maintenance, etc; why do you think AI can’t do that?
You don't know buzzword A, B, C? Heh, he must be incompetent and know nothing.
The buzzwords mean nothing, really. The math is the same for a stupid or a smart model, because the model is trying to mimic properties of the training dataset.
You can give me the ultimate model architecture that will beat every model in existence and I can still figure out a way to make it perform worse than what's available today, but you're not even doing that, you're just drumming up some old news.
If someone "threatened" me with tech advancements I would be more worried about things like an imminent massive drop in token costs for bigger context windows or other game changers like continual learning where the model internalizes your code base into its weights rather than just keeping it in its context.
It's not buzzword bragging; they are the prerequisites to having a coherent conversation. If someone doesn't know what Chinchilla scaling laws are, the discussion about "I think things are saturated" is not grounded in anything. It's like sitting around debating quantum mechanics when you don't know the math; it's just meaningless. If these sound like buzzwords, the implication is not "you're an idiot", it's "you are not yet informed on the key basics of the discussion", and that is something you can fix with curiosity and a couple of prompts to ChatGPT to speed up the learning curve. It's not like any of this stuff is gatekept.
> You can give me the ultimate model architecture that will beat every model in existence and I can still figure out a way to make it perform worse than what's available today, but you're not even doing that, you're just drumming up some old news.
Sorry I don’t understand what you’re saying here — what is the old news? You can break new models — yes. What’s the point you are trying to make here?
> If someone "threatened" me with tech advancements I would be more worried about things like an imminent massive drop in token costs for bigger context windows or other game changers like continual learning where the model internalizes your code base into its weights rather than just keeping it in its context.
I also don’t really know the point you’re trying to make here — like token cost drops seem like a good thing? Bigger context window too? Are we saying the same thing here?
But also: with age more and more doors are closed to you. Many hobbies become inaccessible. You may end up with a bunch of choices that all just sound outright depressing. Losing a job is losing one more choice, restricting yourself to the possibly more boring options that you can still physically pull off.
> It's amazing to me that we've spent decades with programming languages and environments which can accurately guess what you're about to type next, which have enormous expressiveness
You've almost guessed the problem. Too much expressiveness is a bad thing. This is a problem I encounter a lot more often than I'd be happy to. It's very often much easier to build something more generic than what the user actually needs, and then testing it becomes a nightmare.
To make this more concrete, here's a case I'm working on right now. Our company provides customers with a tool to manage large amounts of compute resources (in the HPC domain). It's possible to run the product on-prem, or in different clouds, or a combination of both. Typically, the management component comes with a PXE boot and unfolds from there. A customer wanted integration with a particular cloud provider that doesn't support this management style, nor can it provide a spare disk to be used for management, nor any other way our management component was prepared to boot.
The solution was to use netboot that would pre-partition the disk and use the first N partitions to store the management component as well as the boot, ESP / bios_grub partition etc. It had to be incorporated into the existing solution that encompasses partitioning and mounting all the resources available to a VM, including managing RAIDs, LVM, DM and so on.
The developers implemented it as a GPT partition name with a pre-defined value that would instruct our code to ignore the partitions found prior to the "special" partition and allow the user to carry on as usual, pretending that the first fraction of the disk simply didn't exist (used by netboot + the management component).
This solved the immediate problem for the user who wanted this ability, but created thousands of problems for QA: what happens if there's a RAID that uses the "hidden" partitions? What happens if the user accidentally creates second /boot partition? What happens if the user wants whole-disk encryption? And so on. It would've been so much better if these questions didn't exist in the first place, than to try to answer them, given the "simple" solution the developers came up with.
If you programmed for just a year, I'm sure you've been in this situation at least a few times already. This is exceedingly common.
* * *
There's an enormous value to being able to restrict the possible ways a program can run. Most GUI projects? -- They don't need infinite loops! It just makes programs unnecessarily hard to verify. But it's "easy" to have a single loop language element that can be made infinite if necessary. Configuration languages exclude whole classes of errors simply by making them impossible to express.
However, I have to agree that, specifically, YAML is a piss-poor configuration language. It has way too many problems that overshadow the benefits it offers. We, collectively, decided to use it because everyone else decided to use it, making it popular... and languages are "natural monopolies". So, one could certainly do better ditching YAML, if they can afford to go unpopular. But ditching the idea of a configuration language is throwing the baby out with the bathwater.
PostScript was the first language I ever used professionally! :P
At the time, I worked for a printing house in Kyiv that specialized in job printing (screen printing, flexo-, tampo-, etc., i.e. mostly printing on weird curved surfaces, not paper). Triad (full-color) screen printing was all the rage (early-mid 90s). Part of the process of generating the films that were later used to irradiate the polymer layer covering the screen mold was bound to bootleg Scitex machines the IDF used for printing maps. While we had the machines, we didn't have a proper driver that could take a color image, separate it into channels and instruct the machines to produce the films. So, I'd produce PS files from, eg. Photoshop (also bootleg...) and then edit the PS files by hand to match the requirements of the Scitex machines.
I wasn't a programmer by training, and doing all this stuff absolutely felt like magic. Something I will never experience with computers again :'(
Yes and no. We shouldn't compare datacenter water usage to residential water usage. We should compare it to industrial water usage, as that is what it is. A question like "how does datacenter water cooling compare to concrete factory water cooling?" makes some sense from an engineering perspective, as you are comparing oranges to oranges to a degree.
Residential water usage is way too different in way too many ways to be meaningfully compared to industrial usage. The scale is different, the waste water treatment is different, the infrastructure cost is different. The water quality standards are different...
In my days in art academy, the running joke was that
If you were accepted into the painting faculty, you were an artist,
If you were accepted into graphics faculty, you were color-blind,
If you were accepted into sculpture, you were blind,
If you were accepted into art history, you couldn't be taught to draw.
While a little cruel... (I was in graphics), the general idea was to say that art theory, art history, and especially psychology studies around art are absolute rubbish. These people seem to get into their line of work because they failed as artists (and don't understand / can't produce art).
Likewise, in this article, the approach to defining creative thinking is... so simplistic, and the test is so irrelevant...
Just to try to give you some background as to why a student could choose one approach or the other: if a student wasn't told why they need to draw a still life, they probably didn't care much for the outcome either. Artists rarely know why they prefer one composition over the other, especially in academic studies like... still life. To an artist, the selection of objects for a still life is really arbitrary, their arrangement is arbitrary -- it makes no difference. To make an interesting still life, one would have to find something that would interest other artists in it. Like, for example, how one can show different textures of the objects of the same nominal color using color? Or... would a technique that models volume through the thickness / intensity of contours work on mostly round objects? And so on.
Later, the article is trying to assess the artist's accomplishments in ways artists would frown upon. The number of exhibitions? The sales in prestigious galleries? Yeah... as a student I spent some time working in the lab of Kadishman (the guy who draws the same sheep over and over, and then sells it for insane $$). The "master" doesn't even draw the sheep anymore. It's all Shinkar / Bezalel students who do it :D And, honestly, the sheep is one of the biggest frauds I've personally witnessed in this profession (there are, of course, things like the diamond skull from Damien Hirst, which are more expensive because of the materials used, but I didn't have a chance to behold the miracle with my own eyes).
> Looking back ten years to `left-pad`, are there more successful attacks now than ever?
I can't vouch for the number of attacks, but, and since we are talking about Python, nothing substantially changed since the time of `left-pad`. The same bad things that enabled supply chain attacks in Python ten years ago are in place today. However, it looks like there are more projects and they are more interconnected than before, so, it's likely that there are either more supply chain attacks, or that they are more damaging, or both.
Here's my anecdotal experience with Python's packaging tools. For a while, I was maintaining a package to parse the libconfuse configuration language. It started as a Python 2.7 project, but at the time there was already some version of Python 3 available, so it was written in a way that was supposed to be future-proof.
I didn't need to change the code of the project in the last ten or so years, but roughly once a year something would break in the setup.py. Usually, because PyPA decided to remove a thing that didn't bother anyone.
When Python 3.13 came out, like clockwork, setup.py broke. I rolled up my sleeves and removed the dependency on setuptools; instead, I wrote some Python code that generated a wheel from the project's sources. I didn't look up the specification of the RECORD file in the dist-info directory, and assumed that sha256().hexdigest() would generate the checksums in the desired format. And that's how I shipped my packages...
Some time later, the company added an AI reviewer to the company's repos, and it discovered that, instead of hexdigest(), the checksums have to be base64-encoded and then have the padding removed...
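For anyone curious, this is roughly the difference the reviewer was pointing at; a minimal sketch, assuming the standard wheel RECORD convention (sha256 of the file, urlsafe base64, trailing '=' padding stripped):

    import base64, hashlib

    def record_hash(data: bytes) -> str:
        # what RECORD expects: urlsafe base64 of the raw digest, padding stripped
        digest = hashlib.sha256(data).digest()
        return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

    def what_i_shipped(data: bytes) -> str:
        # what my code had been writing instead: the hex digest
        return "sha256=" + hashlib.sha256(data).hexdigest()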
Now, to the punchline: nobody cared. The incorrectly generated packages installed perfectly fine without warnings. Nobody checks the checksums.
More so: nobody checks that during `pip install` or the more fancy `uv pip install` the packages aren't built locally (i.e. nobody cares that package installation will result in arbitrary code execution). It's not just common, it's almost universal to run `pip install` on production machines as a means of deploying a Python program. How do I know this? -- The company I work for ships its Python client as a... source package. Not intentionally. We are just lazy. But nobody cares.
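To illustrate what "arbitrary code execution" means here: anything at the top level of a setup.py runs on whatever machine builds or installs the source package. A deliberately harmless, hypothetical sketch:

    # setup.py of a hypothetical source-only package: this top-level code runs
    # during `pip install` of the sdist, on the machine doing the install
    import os
    from setuptools import setup

    os.system("echo this ran at install time")  # could be anything at all

    setup(name="innocent-package", version="0.1")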
> It's not just common, it's almost universal to run `pip install` on production machines as a means of deploying a Python program.
Maybe a Python culture problem; maybe a hallmark of Python's status as an "easy to hire for", manager-friendly, least common denominator blub language; maybe a risk that stems from the conveniences of interpreter languages... but this is such a shame in this day and age.
It's seriously not difficult to do better. And if this is what you're doing, you're also missing out on reproducible environments both in dev and in prod. At least autogenerate a Nix package! You still don't need to publish any artifacts, but you can at least have the thing build in a sandbox or yeet the whole closure over SSH.
It's also not that hard to get a Docker image out of a Python project.
You only need one platform-minded person on the whole development team to make this happen.
"Almost universal" is a bit of a stretch, most of the time these days Python apps are deployed as Docker containers, and if you're using k8s this becomes effectively mandatory.
However a lot of the time especially for older codebases the docker build will just run pip install from public pypi without a proper lockfile.
So at least install code isn't being executed on your production machine, but there's still significant surface area for supply chain attacks.
Well, the install code can leave some code behind that will be executed on the production machine... It doesn't really help being in a container. While it's a separate problem from the Python ecosystem, people really put a lot more faith in the isolation offered by containers than they should. Also, it's often very tempting to poke holes in that isolation, because it's sometimes difficult, up to impossible, to get things done otherwise.
As scary as it is right now, it warms your heart a little bit that this system existed for 30 years and is only now reaching a crisis point.
I ran an open source project with tens of thousands of downloads (presumably all either developer machines or webservers, so even a small number is valuable) and never received a malicious pull request, offer of a bribe to install malware, or a phishing attempt with enough effort to even catch my attention.
What it says to me is that there weren't a lot of people working on the crime side of this. It's like dropping your wallet in a bar bathroom and coming back to find it still there.
It's probably the same people who think that merely having a requirements.txt stating packages with versions, or even without that (2010 sends its regards), is fine. Open a random open source Python project on GitHub, and chances are you will see this kind of thing. It stands to reason that people in companies are not acting much differently.
virtualenv isn't relocatable out of the box, so how else would you deploy a python project?
You can call it laziness, but it's not like the python ecosystem has ever developed an answer for this problem. The only reasonable answer has been to use docker, which is basically admitting that the python community did nothing.
Oh, that's a sore spot with me, but I'm glad you asked!
So, for the purpose of full disclosure, I have a personal and professional grudge with PyPA, which also touches on how pip is being managed, beside other packaging issues. It's not the side you want to be on, so, be warned!
So, without further ado: I write my own code to generate the deployed artifact. In my case, I take all the wheels installed in my environment, extract them, and merge them into a single wheel. The process also usually involves removing a bunch of junk from the packages packaged in such a way. You'd be surprised how much nonsense people put in their distributed packages... like, their unit tests, or documentation in HTML / PDF format, __pycache__ files (together with the sources)... the list goes on.
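Very roughly, the shape of it is something like the sketch below; this is heavily simplified (the real thing also has to rewrite the .dist-info metadata, regenerate RECORD, and resolve file collisions):

    import pathlib, shutil, zipfile

    def merge_wheels(wheel_paths, out_dir="merged_wheel_root"):
        # unpack every installed wheel into a single tree...
        out = pathlib.Path(out_dir)
        for whl in wheel_paths:
            with zipfile.ZipFile(whl) as zf:
                zf.extractall(out)
        # ...and strip the junk that routinely ships inside published packages
        for pattern in ("**/__pycache__", "**/tests", "**/docs"):
            for path in out.glob(pattern):
                shutil.rmtree(path, ignore_errors=True)
        return out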
But, it works because I curate what's being installed. I don't trust pip to install just what I need, nor everything I need. I run it in a separate environment, where I examine the packages that have been installed as dependencies, figure out why any of these packages were installed (you'd be surprised how often you don't need them!), then I make a list of the dependencies I actually need, with the exact versions and checksums, and use a Python or a Shell script to download and install them in the actual development environment.
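In practice the curated list ends up looking like a hand-maintained lockfile plus a dumb installer, something like this (the package name and version are hypothetical placeholders; the real list also carries checksums):

    import subprocess, sys

    # hand-curated pins, each one reviewed by a human before it gets here
    CURATED = [
        "somepackage==1.2.3",  # hypothetical example entry
    ]

    # --no-deps so pip can't quietly pull in anything that wasn't reviewed
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "--no-deps", *CURATED]
    )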
This isn't a good idea when you have many short-lived projects, but, in my case, the typical project lifespan is measured in decades and there aren't that many of them. So, I can expend the extra effort required to do that.
Unfortunately, I don't think there's a way to automate the process. The key point is that there's a human who sifts through dependencies and figures out what to do with them. Partially automate, maybe... but I can't think of a way to make this into a program that I could give someone.
> virtualenv isn't relocatable out of the box, so how else would you deploy a python project?
My team has a handful of Python projects. Here's how they work:
devenv.nix provides a Python runtime and all native dependencies, git hooks for linters and things like this. It integrates with direnv and the Python package manager (currently Poetry 1.x for older projects and uv for newer ones) so that when you cd in you get a virtualenv with everything you need, scripts in the project (or stubs for them) magically appear on your PATH so you don't need to use `uv run` or whatever it is for anything.
flake.nix provides a publishable artifact for projects that we run on workstations or servers. It autogenerates a Nix package from pyproject.toml and friends. You can reproducibly build it across platforms without virtualization, you can push it up to a binary cache and avoid source builds, whatever. It's great.
For projects that we run in cloud-native containers (for us AWS Fargate and AWS Lambda), we don't currently ship our own container images. We just publish zip files that we generate with a Poetry plugin that runs builds inside containers that have the same images as are used by AWS in its default runtime environments and push them up with the AWS CLI. The exact steps are stored as a Devenv script so the CI can be a one liner and you can run everything locally just like you would in CI.
> the python community did nothing
Python sucks.
But you can still represent your Python project as a proper Python package and get reproducible-ish build artifacts that are local-first and embrace Python-native tooling and ship it up to prod in a portable format with or without Docker. It only takes one engineer spending a day or two to work it out once for the whole team or maybe the whole company. You just need someone to be willing to RTFM on a package manager or two. The Python community seems to be largely lacking such people but your team doesn't have to be.
I don't believe monads are a "heavy handed abstraction", or that that's what prevents people from prototyping in Haskell.
What really prevents people from writing in Haskell at a reasonable speed is the poor language design. Programming languages are supposed to aid in reading by emphasizing structure. It's important to emphasize that a particular group of "words" constitutes a function call, or a variable definition, or a type definition -- whatever the language has to offer.
Haskell is a word salad. Every line you read, you have to read multiple times, every time trying to guess the structure from the disconnected acronyms. It belongs to the "buffalo buffalo buffalo buffalo" gimmick family. This is a huge roadblock on the way to prototyping as well as any other activity that implies the ability to read code quickly. And then it's also spiced by the most bizarre indentation rules invented by men.
This is not at all a problem with eg. SML or Erlang, even though they are roughly in the same category of languages.
Haskell would've been a much better language if it made its syntax more systematic, disallowed syntactical extensions s.a. the introduction of user-invented infix operators and the overloading of literals (heaven, why???), and required parentheses around function arguments both for definition and for application. The execution model is great, the type system is great... but the surface, the front door to all these nice things the language has, is just some amateur-level nonsense.
* * *
As for the upsides of using languages from the Lisp family for practical problems... I don't find (syntax-rules ...) all that exciting. I understand this was an attempt to constrain the freedom given by Common Lisp macros, and I don't think it worked. I think it's clumsy and annoying to deal with. The very first time I tried to use it, I ran into its limitations, and that felt completely unjustified. To prototype, you want freedom of movement, not some pedantry that will stand in your way and demand you work around it somehow.
The absolute selling point, however, is SWANK. Instead of editing the source code, you are editing the program itself, which can be interacted with at points of your choosing. I don't know of any modern language that offers this kind of experience. I think even back in the 80s, this approach to programmers interacting with computers was common. At school, we had terminals with some variety of Basic, and it worked just like that: you type the program and it instantly shows the effect of your changes. Then, there was also Forth, which also worked in a similar way: it felt like you were "talking" to the computer in a very organized and structured way, but in real time.
Most mainstream languages today sprouted from the idea of batch jobs, where the programmer isn't at the keyboard when the program runs. They came with the need to anticipate and protect the programmer from every minor mistake they might've easily detected and fixed during an interactive session far, far in advance.
Whenever I think about writing in C, or Rust, or Haskell, I imagine being tasked with going to the grocery blindfolded: I'd need to memorize the number of steps, the turns, predict the traffic, have canned strategies for what to do when potatoes go on sale... I deeply regret that programming evolved using this evolution path, and our idea of what it means to program is, mostly, the skill of guessing the impossible to predict future, instead of learning to react to the events as they unfold.
This is not what "subjective" means. You can't argue something is subjective because many people don't agree with an opinion.
When someone argues subjectivity (in a negative sense), they need to show that the opinion does not rely on facts, rather it's based on... nothing (feelings).
I offered a very easy way to numerically assess the negative impact of the poor language design choices made by Haskell's designers. It's not about what I "feel" about the language: in Java, you write a three-word program, and you get, usually, a unique interpretation. In Haskell, you write a three-word program, and you get 9 (nine) possible interpretations. It's impossible for a human to examine nine interpretations simultaneously and figure out which of them are valid and might fit the context. So, reading a Haskell program takes longer and requires more effort than a Java program.
Of course, Haskell programmers find ways to adapt to their misfortune. They try to avoid pathological cases (eg. writing four-word programs, let alone five!), they memorize a lot of acronyms and non-typographical symbols that they later use to prune the search for the possible meaning of the program. They invent conventions on top of the bare language design that constrain the search space for possible programs to make their task easier.
It's absolutely possible that, after layers of conventions and a long time spent memorizing various acronyms and symbols, Haskell programmers catch up to the speed of programmers in other languages: after all, the superficial difficulties with the language might seem like a small price to pay for access to the language's riches that lie beyond the surface. The language's grammar rules cannot account for the entirety of the performance of the programmers who chose to write in the language.
This situation is very similar to the "universal" (claimed, but not in practice) mathematical language, which is extremely difficult to read, write, edit, typeset... yet the tradition of using it prevails and the overwhelming majority of mathematicians use, and prefer using the "universal" mathematical language even though much saner alternatives exist.
There aren't a lot of Haskell programmers, so "lots" is maybe an exaggeration.
I see OP's point. Haskell feels (or felt, I admit I haven't been keeping up the last 15 years) needlessly obtuse sometimes, like how people love to invent new infix operators all the time.
> Haskell is a word salad. Every line you read, you have to read multiple times, every time trying to guess the structure from the disconnected acronyms. This is a huge roadblock on the way to prototyping as well as any other activity that implies the ability to read code quickly.
I couldn't disagree more. Yes, there is more upfront work understanding Haskell code. But it's very dense. Once you understand the patterns, you can read it much quicker. Just like map/filter/fold are harder to understand than a for-loop, but once you do, you can immediately see what kind of iteration is applied. The for-loop can do all kinds of crazy index manipulation that you always have to digest from scratch.
> And then it's also spiced by the most bizarre indentation rules invented by men.
Again, quite surprised by this criticism. The rule is extremely simple: inner expressions must be indented more. You're free to decide by how much. That's why there are many "styles" out there. Maybe that's what you mean by bizarre. But it's not like the language is forcing weird constraints on you. If anything, the constraints are too lax. Any other language with non-mandatory indentation allows that as well. In general, I really don't understand why more languages don't do mandatory indentation. You only need curly braces and semicolons if you want the option to write a whole if/else/while/... statement in one line. But nobody does that.
Not to support the parent comment, which I disagree with, but if you use multi-line let-bindings, those require that you indent not just more than the previous line, but as much as the first token after the let keyword on the previous line. It's a very strange rule, all the more surprising because it's inconsistent even with the rest of the language. It is totally avoidable if you, like I think most experienced Haskellers do, just prefer ‘where’, but people more familiar with procedural code usually lean into using ‘let’ everywhere because it feels more familiar.
I think the strange indentation used to be required in more places - I vaguely remember running into it a lot more when I started with Haskell 20 years ago, but that was also just when I was new to the language. These days I just keep ‘let’ to a bare minimum, so it doesn’t bother me. One thing that made Elm frustrating was that it disallowed ‘where’ clauses, forcing you to deal with this weird edge case all the time.
No, the issue is if the first binding is on the same line as the `let`, you are required to write, e.g.:
    someValue = let f = 9
                    fo = 10
                    foo = 123
                in f+fo+foo
rather than:
    someValue = let f = 9
        fo = 10
        foo = 123
        in f+fo+foo
I think it used to be the case that it had to be indented past the `=` or the `let` even if it wasn't on the same line. Note also that `in` has to be indented past `someValue`, but doesn't need to be indented as far as `let`.
This is fine:
    someValue = let
        f = 9
        fo = 10
        foo = 123
      in f+fo+foo
So, it is possible to land on sane indentation, but the parser is much pickier than, e.g., Python's offside rule, so it takes some trial and error for new users to find it, and it can be frustrating if you're just temporarily modifying an expression to quickly try something out.
I honestly think it would be less surprising if the parser just disallowed writing the first binding on the same line as the `let` entirely, treating it only as a block, but some people (bewilderingly) do seem to prefer to write their code with the excessive indentation (I'd imagine with editor support, rather than manually maintaining the spacing).
I feel like you are describing that the parser is too lenient rather than too picky. It could just require you to always put `let` and `in` on their own lines, in which case the indentation makes sense, I think. It's only when trying to keep more stuff on the same line that the details of Haskell's indentation rules come into play.
> It's important to emphasize that a particular group of "words" constitutes a function call, or a variable definition, or a type definition -- whatever the language has to offer.
For background: my first time in college, I was studying typography. An integral part of this trade is figuring out what is easier for people to read by answering questions s.a. what is the best line length, what number of columns per page is the best, what number of ascent elements per font face is the best, considering letter frequencies and coincidence and so on.
It also comes with the editing part, as in the trade of taking a manuscript (a text intended to be published) and making sure that the text meets certain reader expectations in terms of consistency, clarity, structure. This, obviously, includes the use of punctuation, but it's more about the language structure, things like adjectives order or anaphora usage etc.
Programming languages can be judged using the same rules because, at the end of the day, we read them and need to interpret them. People have particular strengths and weaknesses when it comes to reading: we can remember an anaphora's anchor for only so long, we can hold only so many "variables" in fast-to-access memory, we can only do so many levels of adverb phrase nesting, and so on.
Haskell was designed by someone completely oblivious to human abilities to read. It's very demanding and straining when it comes to extracting structure from text, in the same way that, in English, you'd struggle to extract structure from so-called "garden path" sentences, because it's intentionally obfuscated. I don't believe Haskell is intentionally obfuscated; instead, I attribute the poor performance to a lack of awareness on the part of the author.
To convey the same point by means of example: Haskell is almost uniquely bad in that given a program
A B C
the programmer can't tell if the program is actually A(B, C), or B(A, C), or C(A, B), or A(B(C)), or A(C(B)), or (A(B))(C), or (B(C))(A), or (B(A))(C), or (C(B))(A).
There's absolutely no reason a language should offer these kinds of puzzles, especially in a very large quantity as Haskell does. Removing this "feature" would make the language a lot easier to work with.
In Haskell it's only ever one of A(B)(C) or B(A)(C), and you can tell which based on which characters B is made up of. If B starts with one of !#$%&*+./<=>?@\^|-~` it's the second situation, otherwise it's the first.[0] All functions are unary in Haskell, so A(B, C), B(A, C) and C(A, B) can never actually happen. The cases where it looks like A(B(C)), etc. are happening are actually cases of B(A)(C), e.g. f $ g is a B(A)(C) case where B = $. So the basic syntax of Haskell is actually very simple and consistent, but due to lazy evaluation the functions can affect control flow much more than in other languages.
0: OK, there are some additional non-ASCII Unicode symbols, but everything but string literals should be kept ASCII IMO.
> the programmer can't tell if the program is actually
What do you mean, "can't tell"? If I see this in Python
(A)(B)(C)
how do I know which of your 9 it means? Well, I'm a Python programmer so I know that it means
A(B)(C)
which is the function A applied to B, which returns a function that gets applied to C. If you're a Haskell programmer you know that it means the same thing.
I grant you that it is odd to those who are unfamiliar and it took me quite a while to get used to it, but it's much better to write that way in Haskell when writing programs that use higher-order functions.
Mmm. I think I understand where you are coming from. You can write incomprehensible code in Haskell very easily, and I agree that some people tend to write Haskell in a way that is easy when writing but very hard when reading.
But that is a choice. I prefer not to use complex function compositions and lenses because of this, and I split complex expressions into a bunch of let bindings, etc.
So you also can write very readable code in Haskell.
It's just not good because you need to work around its limitations, whatever its purpose is. Not good for prototyping because it's the red tape you need to cut to get work done. Red tape isn't, in general, a bad thing, but when it comes to prototyping it is.
I think most people misunderstood syntax-rules. It was not meant as the macro system for Scheme. It was meant as the template macro system everyone could agree on, while leaving the more powerful low-level macro systems to the implementations. Syntax-case, or explicit/implicit renaming, or syntactic closures, or what have you.
From your last paragraph, I am curious which languages / paradigms you advocate for. Sorry it wasn't clear to me except that you like SWANK, which I'm not familiar with.
I wish there was some sort of a single metric that would allow measuring languages against each other and thus determining the best one. Unfortunately, there are multiple variables and the relationship between the variables is unclear. But, going totally with my gut feeling, some examples of good languages (in terms of ease of reading) include:
* Prolog (and, by extension, Erlang).
* Pascal.
* Java 5 and earlier (and Go, as it's almost Java's twin).
These languages somehow manage to hit the sweet spot of enough regularity and enough diversity, with few unexpected syntax constructs (eg. Pascal and Java have the "dangling else" problem, but it's manageable compared to the problems introduced by optional statement delimiters in Go or JavaScript, for example). In every case, a programmer must program defensively against these sorts of language "pathologies".
To give some examples of questionable or outright bad design decisions:
* In Common Lisp (and Scheme as well as a number of similar languages) there's a problem with identifying the open parenthesis that will be closed by typing the closing parenthesis. Programmers must invent tools and techniques to manage this problem.
* In C++, there's a laughable (or, at least was, for a long time) rookie "whoopsie" when it comes to ">>" in templates vs infix operator. And the "solution" offered by the language designer makes you think they were just... lazy (add space).
Here are also examples of some (perhaps, accidentally) good decisions:
* Kebab-case in many Lisp-family languages. In Latin script, the position of the hyphen in the middle of the lower-case letter makes it a better choice than, eg., the underscore (which is frowned upon as "not a typographic character"). Same reason why, eg., in traditional Hebrew hyphens are at the height of a capital letter (Hebrew doesn't have lower-case letters and the shape of the letters is better suited for hyphens at the top rather than the middle).
* Clojure as well as Racket (afaik, deliberately) introduced more kinds of parenthesis-like delimiters to make it easier to guess which expression is being terminated by the currently typed delimiter.
* * *
Note that this is a "superficial" metric, because languages are also valuable for concepts they are able to express both in terms of program logic as well as program application to the hardware it manages; the ability to process, modify, generate, analyze the language automatically; the ability to constrain the language to a desired subset of all available operations... Incorporating all of these into a single metric seems like mission impossible :)
> Are you mixing tabs and spaces? Maybe an example here would help.
This is not what "rules" means. Rules aren't about what I do. Rules are about what the language treats as legal or illegal. I don't write in Haskell at all because I don't like it and have no use for it, but Haskell's rules don't change because of that: they are still mindbogglingly complex when it comes to telling the programmer whether the next line is indented the right amount or not. None of that complexity is necessary, and it could've been totally avoided if the language used statement delimiters.
> No, this is important, so that default strings don't to have to be something crummy.
My argument is that to get a little accidental convenience you sacrificed a huge amount of routine convenience. The mental load of having to distrust a string when you see it is just not worth the accidental convenience of writing a prepared statement and making it appear as if it was a string. In other words, you are the guy who traded a donkey for three beans, but the beans didn't sprout into a huge ladder that took you to the giant's castle. You just made a very watery soup and that was that.
> Again, an example would be helpful.
Look up the example I gave in the adjacent reply.
> I thought lazy execution was widely agreed to be the worst part of Haskell.
It's good because it's unique and, when it fits the purpose, it's useful for that particular purpose and nigh irreplaceable, because it is unique. It's worth having for the sake of research, to understand how languages can be designed and what tools or techniques can be discovered on this path. This is said from the perspective that Haskell is not the end product, but rather research attempting to study how languages can work and what concepts they can develop.
How would this work with sites like YouTube which allow sharing of content, potentially not appropriate for children, but the content is generated by the site's users? Who will be fined for "violations"? And how would such a fine be levied, especially internationally?
I think that initially the onus would be on Youtube to figure this out. They have some very intelligent engineers. For example, if the Youtube client is receiving affiliate funds then they are easy to ID and fine. If they are random people then Youtube would have to share the violation data with the other countries, and the US or UK would have to pressure those countries to participate in fining the end user. There could be financial incentives for the foreign country to participate. They can also just force-label a video as adult, as they do today when enough people report it, which is admittedly not uniformly applied.
This already has been solved. Youtube disables viewing via embeds for any content that has been age restricted. Either you view it on Youtube which requires logging in to see age restricted content in the first place, or you get the ! icon and the warning about needing to log in.