Fair Go


Gary Marcus has criticised LLMs on the basis that they have the rules of chess in their data, but cannot play chess. Is this a fair comment?

The rules of chess explain the legal moves for the pieces, and a few other specialised moves, such as castling and en passant. The rules do not explain how to play chess – the start, the midgame, the endgame, the shifting strategy.

It is unlikely that a person could learn to play chess by reading about millions of possible games (though a machine can do the reading easily). Instead, people absorb strategy by losing, drawing, and eventually winning games – they learn to make moves mostly unconsciously, while reading about a limited set of gambits and manoeuvres. LLMs have no comparable mechanism.

It might be thought that Machine Learning could do this, but ML consists of a programmer “training” a network of directed resistors, and that network is completely inadequate to handle the tens of thousands of states that arise in the course of a game. ML has no ability to extend its own structure.

But the board is only an eight by eight array – where do the tens of thousands come from? A piece (not a pawn) can occupy any square on the board, and does so in company with other pieces, all working toward a strategy, which may cease to be tenable if one piece is taken.
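The question of where the numbers come from on an eight by eight board can be checked with a little arithmetic. The sketch below is only an illustration, not a count of real chess positions: it assumes the pieces are all distinguishable and ignores whether an arrangement is legal. The function name `placements` is made up for the example.

```python
from math import perm

SQUARES = 64  # an eight by eight board


def placements(pieces: int) -> int:
    """Ways to put `pieces` distinguishable pieces on distinct squares.

    Assumption: every arrangement counts, whether or not it could
    arise in a legal game.
    """
    return perm(SQUARES, pieces)


if __name__ == "__main__":
    for n in range(1, 5):
        print(f"{n} piece(s): {placements(n):,} arrangements")
```

Under these assumptions two pieces give 4,032 arrangements and three give 249,984, so a handful of pieces is enough to pass the tens of thousands.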

As training for military strategy, chess is very weak – the best military strategists break the rules (the classic example is the Maginot Line, which the German army simply went around in 1940). The point remains: reading about chess being played at a high level will not help someone to play at a high level, other than to give a glimpse of the sorts of mental structures required.

How much is language like chess – do you need to speak or read a language to learn it? If a dictionary is the source of a machine’s knowledge about English, it has an obvious limitation in the brevity of its entries. How will a machine become competent in handling dense text?

By observing how a person breaks down complex text into the objects that the machine uses – words, phrases, clauses. Sometimes there is insufficient information to be certain, so a decision is left until more information is available or, if needs must, is made on whatever information there is (the machine can use its existing structure to simulate what the outcome might be, so the decision can be far from static).

No, we are not offering something right out of the box that knows everything it will ever need – new words appear, and the meanings of existing words change, for a start. What we are offering is something that can grow into whatever role is chosen for it. The important thing is it does not have the Four Pieces Limit, which plagues humanity.

