I’ve read a number of articles lately that point out a potentially fatal flaw in large language models (LLMs) that drive the output of generative transformers like ChatGPT. And now that I think about it, it’s so obvious. Way, way back in the dawn of computing we had a saying: garbage in, garbage out, or GIGO. A computer program’s output can only be as good as the information it is given as input.
GPT-3 and GPT-4 are amazing – they exhibit surprising, lifelike “behavior”. They can do things you wouldn’t expect. They can produce polished prose – essays, articles, technical and creative documents. But that polished English output is based on the only thing the software knows: the billions of documents used to train the underlying LLM.
But what happens when some or many of those documents are slanted in a certain political or religious direction? Or some of those documents contain mistakes or outright lies? With the right inputs, ChatGPT could become a completely convincing Holocaust denier. Or flat-Earther. Or a very convincing pick-your-party political propagandist.
The problem now is that most people have no idea what they’re dealing with, and have no real interest in finding out. ChatGPT is the hot new tech thing, and it’s going to be everywhere. People just like having a simple, natural-language interface to the mysterious world of computers. But they could well be conversing with a convincing liar, or a marginally insane piece of software. This is a *serious* problem, not just for Joe Average, but for everyone. How do we keep powerful generative transformers from becoming powerful propaganda tools?
The genie is pretty much out of the bottle – I don’t think there’s any going back now. How do you fact-check an entity that is faster than you, works 24×7, and has a much larger (though almost certainly flawed) knowledge base? Who will the public believe – an educated person, or the all-powerful software machine?
Maybe, just maybe, we’ll find a way to curate and certify the inputs to generative transformers and be able to trust their output. But until we do that, no one should trust what they get from a ChatGPT-type machine. GIGO.