Problems with LLMs
Prompt Engineering
This is a dangerous thing to do – someone asks a question, and we massage the question to give a predetermined answer. It may be well-intentioned, and the answer may be relevant to the question to start with, but inevitably the massaging will drift away from the intent of the question.
Absorbing Local Information
We set up a test case in which the description of a manatee (a “sea cow” weighing up to 900 kg) was changed to that of a cat, and questions were then asked that could be answered from the information supplied. After promising to use the supplied description, the LLM fell back on what it had been trained on. The outright lying about what it is going to do is a bit of a worry – how do you trust it after that?
The Rise of a New Idea
The LLM is trained by letting it scour text on the internet and build chains for word propinquity. The problem with that is that existing ideas predominate. A new idea might come along, but based on word propinquity, it will never get a look in. People might read the new idea and recognise its import – an LLM is never going to do that – it doesn’t work on meaning. Taken to the extreme, mass use of LLMs would herald a new Dark Ages, where nothing ever changes.
Complex Text
Complex text such as legislation or project specifications is structured to help people find their way around. Things like
• For the purposes of this Act, a person covered by paragraph (c), (d) or (e) is taken to be an employee of ASD.
• information obtained by an authorised officer under Part 13, 14 or 15;
• and includes FTR information (within the meaning of the Financial Transaction Reports Act 1988).
• eligible gaming machine venue has the meaning given by section 13.
• (a) is covered by item 31 or 32 of table 1 in section 6;
(from the Anti-Money Laundering Act)
are par for the course – these don’t look like useful input for methods which work on word propinquity.
The size of the documents is another problem. A piece of legislation can run to a thousand pages, and the claim that no-one has ever read it in its entirety is entirely believable. The specification for the F-35’s undercarriage runs to 3000 pages (a DoD project where wastage in the hundreds of billions occurred through inability to understand what the specification was saying – the specifications hugely exceeded a human’s comprehension limit, so wastage was the only way to find out what was required).
Standalone Document
Some documents have a long life. There is no point conflating the specification for Hunter class frigates (starting to be built “real soon now”) with frigates designed years ago, nor, when the project may take 10 years, in trying to blend 2022 technology with all the whiz-bang ideas for Anti-Submarine Warfare in 2030.
More Is Better
The current notion is that the problems of LLMs can be fixed by making the data collection phase larger – a hundred times larger, or a thousand times larger, or (whisper it) a million times larger. This is extremely unscientific – there won’t just be diminishing returns, but negative returns. What is required is understanding the meaning of words, so there is “understanding”, not just statistics.
When the word “unprecedented” is being thrown around so casually, we need to think about what we are doing, not fall back on something that promises “no need to think”. The world is changing under our feet. Word propinquity, as used in LLMs, has severe limitations.