LLMs could help if they were specifically applied to the task [1], however peopl...

godelski · on April 12, 2025

I'm highly skeptical. The current ML paradigm relies heavily on aggregating data, but the issue we're discussing is about distinguishing subtle details over an extremely large search space. Sure, you can probably scale your way there, but even accounting for superposition we're talking about an extremely large number of parameters, because you aren't performing search, you're performing compression. You also need to remember the curse of dimensionality: as the dimension increases, the ability to distinguish the nearest neighbor from the farthest neighbor decreases. Effectively, the notion of distance becomes meaningless. (The dimensionality increases as parameters increase.) So now you have to perform search over your compression.
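
To make the nearest-versus-farthest point concrete, here is a rough sketch of distance concentration. It assumes points drawn uniformly from the unit hypercube and plain Euclidean distance, which is a toy setup and not a model of any real network; the function name and its parameters are made up for illustration.

```python
import math
import random


def nearest_to_farthest_ratio(dim, n_points=200, seed=0):
    """Ratio of the nearest to the farthest Euclidean distance from one
    random query point to n_points random points in the unit hypercube.

    A ratio close to 1 means the nearest and farthest neighbors are almost
    equally far away, so ranking by distance carries little information.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    return min(distances) / max(distances)


if __name__ == "__main__":
    # The ratio should climb toward 1 as the dimension grows.
    for dim in (2, 10, 100, 1000):
        print(dim, round(nearest_to_farthest_ratio(dim), 3))
```

In low dimensions the ratio is small (the nearest point is much closer than the farthest); in high dimensions it drifts toward 1, which is the sense in which distance stops being useful.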

This is why ML is so fucking cool, but it's also why these models are really bad at details, and why you have to really wrestle with them to handle nuance. It's easiest to see in image generators, but they're much smaller. Do remember that these things are specifically trained so that their outputs are preferred by humans. The result is that errors lean in the direction of being difficult for human evaluators to detect. Deciding if that's a feature or a bug requires careful consideration.

This is not to say that LLMs and ML are useless or trash. They are impressive and powerful machines, but they are neither magic nor the answer to everything. We've got to understand the limitations if we're to move forward. I mean, that's our job here as researchers, engineers, and developers: using a keen eye to find limits and then solve them (easier said than done lol).

ezst · on April 11, 2025

Again, someone mistaking LLMs for knowledge bases. Must be a day ending in `y`.

PaulHoule · on April 11, 2025

The original misunderstanding behind "knowledge base" was that, in the 1980s, it was an idea in symbolic AI that you'd develop a set of facts against an ontology designed for accurate inference and somehow by the 1990s it became a text repository with a search engine that may or may not work. Occasionally useful, sometimes hard to distinguish from a trash can. See Confluence.

Prompt engineers with their decoder models are always going to wonder why they are always a bridesmaid and never a bride. With encoder models you can attain the holy grail: a system where you put text in one side and get, within calibrated accuracy, facts to put into the first kind of knowledge base. Or, for that matter, a good search engine for the second kind of knowledge base, which could raise it above the "trash can" level.

ezst · on April 12, 2025

"Funny" how that is reminiscent of the whole blockchain discussion. If the need is fully satisfied by a "boring" and cost-effective "facts" database, why would an adequate engineer push for (blockchain/)LLM instead?

PaulHoule · on April 12, 2025

There were several reasons why "expert systems" were rejected in the 1980s, including competition with programmable calculators and spreadsheets and no correct paradigm for reasoning with uncertainty, but the one most quoted was that the creation of that kind of database is not cost-effective.

I spent about 10 years working (sometimes for myself, sometimes for employers, sometimes part time, sometimes as a software developer, sometimes as a business developer) on the problem of turning a mass of text into facts into text to solve problems like:

- Doctors write copious medical notes from which facts would be useful for themselves, payers, researchers, regulators.

- An accounting or legal firm may need to scan vast numbers of documents and extract facts for an audit or lawsuit.

- An aerospace manufacturer has a vast database of documentation and maintenance notes (even from the teams at the airports) that it needs to keep on top of

- A fashion retailer wants to keep track of social media chatter to understand how it connects and fails to connect with customers and answer questions like "should we endorse sports star A or B?"

- Police and soldiers chat with each other over XMPP chat about encounters with "the other" which again are rich with entities, attributes, events, etc.

Tasks like this need an interactive system, but you face the problem that people have an upper limit of 2000 or so simple decisions [1] in a sustainable day. The problem is large but it is not "boil the ocean", because you can set requirements for what gets extracted and use the techniques of statistical quality control, as in Deming, to know that accuracy is within bounds.
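
One way to picture the "accuracy is within bounds" check is to audit a random sample of extracted facts by hand and put a confidence interval on the observed accuracy. The sketch below uses a Wilson score interval, which is my choice for illustration and not necessarily the method meant above; the function names, the sample size of 300, and the thresholds are all hypothetical.

```python
import math


def wilson_interval(correct, sample_size, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = correct / sample_size
    z2 = z * z
    denom = 1 + z2 / sample_size
    center = (p + z2 / (2 * sample_size)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / sample_size + z2 / (4 * sample_size**2))
    ) / denom
    return center - half_width, center + half_width


def accuracy_in_bounds(correct, sample_size, required_accuracy, z=1.96):
    """True if even the lower end of the interval meets the requirement."""
    lower, _ = wilson_interval(correct, sample_size, z)
    return lower >= required_accuracy


if __name__ == "__main__":
    # Hypothetical audit: a reviewer checks 300 extracted facts
    # (a small slice of a ~2000-decision day) and finds 291 correct.
    low, high = wilson_interval(291, 300)
    print(f"accuracy is plausibly between {low:.3f} and {high:.3f}")
    print("meets 0.93 requirement:", accuracy_in_bounds(291, 300, 0.93))
    print("meets 0.95 requirement:", accuracy_in_bounds(291, 300, 0.95))
```

The design point is that the requirement is tested against the lower bound, not the raw sample accuracy, so a small sample cannot pass on luck alone.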

You can give people tools to tag things in bulk, you can apply rules, and you can give people tools to create the rules. I worked on RNN- and CNN-based models, SVM, logistic regression, autoencoder, and other models, and before BERT they all sucked. If you have the interactive framework you can put encoder or decoder LLMs in, and it is a revolution that makes systems like that much cheaper to develop and run, with better results.

[1] hot dog/not hot dog