How Far Will Generative AI Take You Towards AGI?

 

Generative AI seems like a side road. At its heart, it is an indexing method for articles on the internet. Someone writes a fairly short, non-technical article on a subject. You type in a few words that point to that article, and it returns the article.

First problem – popularity. If there are several or many matches, it will return the most popular article, not the most valid one – it has no idea about validity. This means that new, more valid ideas will struggle to break through – a return to the Dark Ages, where nothing changes. A good example is the idea that bacteria can cause ulcers. Doctors were strongly resistant to the notion. If a new idea is never shown to them, because it is less popular than the current view, it will never be accepted.
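The popularity problem can be shown with a toy sketch. It assumes a retrieval step that ranks matches purely by popularity; the Article class, the titles, and every number are made up for illustration, and nothing here describes how any real system is built.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    popularity: int   # hypothetical count of links or views
    validity: float   # hypothetical score; the retriever never looks at it


def matches(article, query):
    """True if every word in the query appears in the article title."""
    title_words = set(article.title.lower().split())
    return all(word in title_words for word in query.lower().split())


def retrieve(articles, query):
    """Return the most popular matching article, ignoring validity."""
    candidates = [a for a in articles if matches(a, query)]
    return max(candidates, key=lambda a: a.popularity, default=None)


# Made-up data: the newer idea is more valid but far less popular.
articles = [
    Article("established view of ulcers", popularity=9000, validity=0.2),
    Article("bacterial theory of ulcers", popularity=40, validity=0.9),
]

best = retrieve(articles, "ulcers")
print(best.title)
```

Because the ranking key is popularity alone, the more valid article is never returned, however high its validity score is.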

Second problem – the prompt may not be obvious, and it may take the skills of a “prompt engineer” to craft a suitable one. In a fast-moving technical field, how long will the prompt remain valid? If the more esoteric topics need specially crafted prompts, does this invalidate the hope of having a huge database of valuable information at your fingertips? Having someone change the question from the one you asked to the one they think you want has obvious pitfalls.

Third problem – “a picture is worth a thousand words”. If LLMs are restricted to text (which they don’t understand), how can the text be usefully linked to an explanatory image? And, given that words can have many meanings, how do we annotate the image in a way that is not confusing? Pointing to an area on the wing of an aircraft and saying “flap” leaves no doubt, but pointing to a picture of a metal box and saying “container” is not adequate if we mean an ISO container, and the box picture has not brought context with it. This is where the Unconscious Mind brings 20-40 years of observing objects to the fore, filling in details so a mountain isn’t left hanging in the air as a triangle because we didn’t show its base or its three-dimensionality.

AGI will require the specific meaning of words, and the ability to seamlessly integrate text, images and video, in the same way that our memory can dig up a small video of a memorable event in our past. An LLM-based solution doesn’t come within a bull’s roar of what is needed.

 
