Where Do We Go From Here?

 

LLMs are a joke – useful for toy applications, but they fall apart when things get complex. How could they not, when they have no idea of the meanings of words?

The owner of the bar supports the bar on underage drinking.

If an LLM encounters another use of the word “bar”, it cannot look through a list of meanings. There may be enough context to find a connection between “bar” and “underage drinking”, but much of it would be about “bar” (a place to drink) and not about “bar” (a prohibition).
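A minimal sketch of what "looking through a list of meanings" could look like, using the sentence above. Everything here is an assumption made for illustration: the `Sense` class, the `SENSES` table, and the toy rule that a following "on" or "against" signals the prohibition sense. It is not a description of any existing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    label: str
    gloss: str
    # Hypothetical cue: words that, directly after the noun, signal this sense.
    following_words: frozenset


# Illustrative sense inventory for one word only.
SENSES = {
    "bar": [
        Sense("establishment", "a place where drinks are served", frozenset()),
        Sense("prohibition", "a ban on some activity", frozenset({"on", "against"})),
    ],
}


def pick_sense(tokens, index):
    """Choose a sense for tokens[index] from the word that follows it."""
    candidates = SENSES[tokens[index].lower()]
    next_word = tokens[index + 1].lower() if index + 1 < len(tokens) else ""
    for sense in candidates:
        if next_word in sense.following_words:
            return sense
    # No cue matched: fall back to the first listed sense.
    return candidates[0]


sentence = "The owner of the bar supports the bar on underage drinking"
tokens = sentence.split()
labels = [
    pick_sense(tokens, i).label
    for i, token in enumerate(tokens)
    if token.lower() == "bar"
]
print(labels)
```

The point of the sketch is only that the two occurrences of “bar” resolve to different entries in an explicit list, which is the step the paragraph above says an LLM cannot perform.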

The examples given are written in Technical English. It is used to help the reader navigate a large document, but it does nothing for an LLM, which is looking for allusions among words near other words.

Technical English - Anti-Money Laundering

The reader may be given instructions on how to read the document (assuming a human reader).

“security has the meaning given by section 92 of the Corporations Act 2001 (for this purpose, disregard subsections 92(2A), (3) and (4) of that Act).”
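The kind of structure a reader builds from such a definition can be sketched as a small record: the term, where its meaning lives, and what the reader has been told to ignore. The class name `DefinitionByReference`, its fields, and the subsection list passed in at the end are assumptions for illustration; the list is a placeholder and has not been checked against the Act.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefinitionByReference:
    term: str
    act: str
    section: str
    disregard: frozenset  # subsections the reader is instructed to ignore

    def applicable(self, subsections):
        """Return the subsections that still count, in their original order."""
        return [s for s in subsections if s not in self.disregard]


security = DefinitionByReference(
    term="security",
    act="Corporations Act 2001",
    section="92",
    disregard=frozenset({"92(2A)", "92(3)", "92(4)"}),
)

# Placeholder subsection labels, for illustration only.
subsections = ["92(1)", "92(2)", "92(2A)", "92(3)", "92(4)"]
kept = security.applicable(subsections)
print(kept)
```

The instruction in the quoted text ("for this purpose, disregard...") becomes an explicit field that is applied every time the definition is used, rather than something that has to be remembered.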

This is one reason why LLMs are useless at this scale: they don’t understand or follow instructions, and they end up with no structure analogous to the one a person builds while reading complex text.

 

Looking at the size of the source:

10 pages – the construct of the cabin shall ensure the entry of dust and sand into the cabin. People can be very careless.

50 pages – humans run out of puff around 50 pages. The limit on conscious capacity is well known – the Four Pieces Limit.

400 pages – Anti-Money Laundering legislation

Legislation requiring banks to prevent their operations from being used to launder money. It means combatting a nimble and intelligent adversary. Banks do it badly, because it is alien to their relatively simple business model (borrow and lend), and billion dollar fines are handed out. This is an area where an AI approach would be useful.

As text grows in size, it is increasingly written in Technical English.

1,000 pages – legislation. Two examples: the Affordable Care Act and the Big, Beautiful Bill. Too big to hold in one’s head or understand, except piecemeal. For the Affordable Care Act, many small changes (“fixes”) were added later.

2,000 pages – Navy Procurement Manual/Constellation frigate. Undisciplined approach, project cancelled, 2 billion wasted. See

https://www.nytimes.com/interactive/2025/12/11/opinion/editorials/us-military-industry-waste.html

3,000 pages – F-35 undercarriage specification. Navy version needed stronger undercarriage to land on carriers in an emergency with full fuel load – overlooked.

90,000 pages – F-35 complete specification. Many mistakes, hundreds of billions overrun.

There is a crying need for AI in this area. It won’t be satisfied by toy languages requiring translation: almost all the concepts are missing, so the result would be a hodge-podge at best.

We don’t need a diversion down a road that leads nowhere – English isn’t that hard – about 50,000 word vocabulary, about 10,000 phrases, about 100,000 senses (literal, figurative, idiomatic). The real effort is in the 8,500 verbs – making operators that do things abstractly, to make the text “alive”. Do it once and it is pretty much done. Yes, new things come along – a working fusion reactor one day – but it will be described in English anyway.

 
