Why AI Search Is a Nightmare

Tech news is on fire with talk of how Microsoft plans to integrate ChatGPT into Bing. In response, Google has announced Bard. On the surface, these appear to be quite revolutionary and a step into the future of computing. Dig a little deeper and you uncover a different story.

Let me clarify: I’m not a Luddite… I just expect this whole thing is going to turn into a nightmare (though not necessarily for the same reasons as a lot of skeptics). “AI” already feels a bit like a bubble. While huge advances have been made in building bigger and better models (of which ChatGPT is currently the king, at least for public use), they haven’t really gotten “smarter”. At the end of the day, most generative text systems are still a glorified Markov chain generator.

Let’s go over what ChatGPT is, how the whole thing is going to turn into a nightmare, and then why.

What Is ChatGPT?

ChatGPT is, as of writing, the “smartest” publicly available generative language model. It is based on GPT-3.5 from OpenAI, an iterative improvement over GPT-3, which can produce some amazing content. These truly are amazing models, but like anything with generative AI, there are caveats (as we’ll see later).

ChatGPT is an evolution of GPT-3.5, fine-tuned using supervised learning and reinforcement learning, with heavy use of Reinforcement Learning from Human Feedback (RLHF). Basically, humans play along and give the model feedback, which it uses to learn. The model tells you “Today is Wednesday,” and the trainer may correct it with “No, today is Thursday,” or similar (though that would cause issues, as we’ll see later).

What makes ChatGPT so impressive and influential is its raw scope and power. ChatGPT leverages GPT-3.5, a step up from GPT-3, which uses roughly 175 billion parameters. In comparison, GPT-2 has roughly 1% of that at 1.5 billion parameters. Basically, the base language model takes massive troves of text data, chomps through them, and learns how to predict the next word (or token) in a series.
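To make "predict the next word" concrete, here is a minimal sketch using plain word-pair counts. This is not how GPT-3.5 works internally (it uses a neural network with billions of parameters, not a lookup table); the toy corpus and the function names `train_bigram_counts` and `predict_next` are made up purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus, invented for this example.
corpus = "the cat sat on the mat and the cat slept on the sofa"
counts = train_bigram_counts(corpus)

print(predict_next(counts, "the"))  # "cat" follows "the" more often than "mat" or "sofa"
print(predict_next(counts, "dog"))  # never seen in the corpus, so there is nothing to predict
```

The predictor has no idea what a cat is; it only knows which word tended to come next in the text it was fed.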

All of this sounds great, but it hasn’t touched on the big issue with AI and machine learning: it isn’t smart.

The Looming Nightmare

While the technology is slick and the methods are incredible from a technical perspective, natural language processing (NLP) still lacks anything actually resembling understanding. It doesn’t understand what you asked; it merely calculates what’s likely to follow in the sequence. This leads to a lack of context where a simple request can turn into a completely incorrect statement. On top of that, it’s so impressive and almost lifelike in some responses that people will grow to trust it without vetting it.

We already see droves of people blindly following trends or (literal) fake news on social media and similar. Once people get used to this supposed “human level artificial intelligence”, and it is accurate enough, they’re going to trust it (unless it’s patently wrong). That’s going to lead to some disasters since people tend to vastly overestimate how good they are at assessing the truth or how knowledgeable they are and how good technology is.

The technology itself isn’t the threat; how people use it (or will use it) is. ChatGPT is already making the news for hackers using it to produce malware and similar. Phishing attempts, fraudulent emails, etc. are all easy to produce with tools like ChatGPT.

How It Causes Issues

Again, the technology itself isn’t really the problem; how available it is and how it’s used is. People are going to blindly trust the AI (which hasn’t been wrong yet!) when it tells them to drink bleach to cure a cold or something equally absurd. It sounds confident and it sounds right when it drops the wildest claims. GPT-2 did similar, but it was never good enough to just fire and forget. Newer machine learning platforms are literally orders of magnitude more advanced in raw capacity (though this doesn’t necessarily mean a linear level of advancement).

Most people use search engines as a second brain of sorts. You don’t need to memorize or fully learn certain facts or concepts; you just search when you need them. This tendency, combined with the limitations of the technology, is going to lead to disastrous results for wide swaths of people. Ignorance and stupidity are going to make this a complete and utter nightmare.

Lack of Understanding

The supposed AI doesn’t know what’s true or false; it just spits out the statistically likely answer (based on its model). Whether that passes a common sense test or not is immaterial to modern machine learning.

Plenty of people believe AI has some level of knowledge or understanding, but it really doesn’t at all (in any conventional sense at least). It can map certain concepts and cluster them, but there is no real understanding.

AI doesn’t understand what it trains on; it finds patterns. You’re not replicating a person, you’re making a more charming Markov chain generator (which seems to nail it), but nothing is going on behind the scenes (aside from some very fancy math). There is absolutely no actual understanding (in a human sense) anywhere in the process.

You get rudimentary concept mapping and the ability to regurgitate and summarize things from the training sequences and similar (or toss out some complete trash), but you don’t get a program which actually knows what grass is, or why you can’t drink lava. The training hammers out a lot of these less ideal concepts, but you still get sinister sequences from it not knowing the difference between satirical fiction and scientific fact.
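The lava point can be shown with a literal Markov chain. The sketch below is a deliberately tiny stand-in for the idea, not a model of how ChatGPT is built: it learns only which word may follow which, then lists every sentence it is able to produce. The two training sentences, the `END` marker, and the function names are all invented for the example.

```python
from collections import defaultdict

END = "<end>"  # hypothetical end-of-sentence marker used only in this sketch


def build_chain(sentences):
    """Record, for each word, the set of words seen directly after it."""
    chain = defaultdict(set)
    for sentence in sentences:
        words = sentence.lower().split() + [END]
        for current_word, next_word in zip(words, words[1:]):
            chain[current_word].add(next_word)
    return chain


def all_sentences(chain, start, max_words=10):
    """Enumerate every sentence the chain can produce beginning with `start`."""
    results = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        for next_word in sorted(chain.get(path[-1], ())):
            if next_word == END:
                results.append(" ".join(path))
            elif len(path) < max_words:  # guard against endless loops in cyclic chains
                stack.append(path + [next_word])
    return sorted(results)


training = ["you can drink water", "you cannot drink lava"]
chain = build_chain(training)
generated = all_sentences(chain, "you")

for sentence in generated:
    marker = "" if sentence in training else "  (never appeared in training)"
    print(sentence + marker)
```

Both training sentences are true, yet the chain happily offers "you can drink lava" as an equally valid output. Every word-to-word step was seen in the data, so nothing in the statistics flags it as wrong.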

Lack of Context

Modern AIs don’t have any concept of context in a meaningful way. The AI already doesn’t understand concepts; it’s able to stitch them together and associate and cluster concepts and keywords. Responses will range from dead on to parroting parody or jokes as fact (with enough diving). ChatGPT has a lot added onto the traditional LLM (large language model) architecture, but you still get a lot of things like this.

Things like ChatGPT can give off the aura of context and keep some level of internal context, but they are unable to put the world itself into context. Some statements may display logic, but the logic is from linguistic structuring and statistics, not an actual thought.

The danger is, without enough context, AI sounds believable and intelligent. LLMs are impressive, but they’re a blurry image of the zeitgeist. This is the macro image of humanity, but blurry.

Lack of Common Sense

This isn’t a slight against the AI, but against people in general. People largely overtrust intelligent technology. AI is no exception. AI is a bubble and people are overselling it.

When people search, they expect to find results which are sorted by how factual or accurate they’re expected to be. Modern SEO tends to be a barrier to entry for lower quality content. People don’t click trash content or stay on useless sites intentionally. People have gotten used to this level of filtering.

Content farms and spam sites tend to only last a while, and they have to rely on either generating or stealing content to stay listed. It’s not a perfect system, but the majority of top search results tend to be relevant to what was searched. How does modern machine learning tell between complete trash and a usable site?

With each vendor pushing their own AI tool for search, we’re going to see bad results up top, where the right content was. People are going to get lost in the mess.

Breaking the Model

The technology is disruptive (but not in the financial market sense). It’s going to break the standard internet model.

A chat-based AI search is going to make it nearly impossible to compete for clicks. People will trust the AI, which means it gets harder for lesser known content to compete for attention. This translates to a lack of ad revenue and similar, which will push more and more people out of pumping out content. There is very little point in writing when there is no reward, unless you are so inclined.

The feedback loop for content is broken. AI search will (poorly) summarize results, and content which doesn’t conform will disappear.

AI assistants summarizing searches will break most content models in a way which they won’t really come back from. This isn’t a hiccup; it’s a heart attack. Sure, content farms and similar will survive, but the margins will get slimmer, and we’ll get more big players and more sleazy factories. They’ll eventually pollute the training behind the AI since it will run out of sane content to train on (or train on other AI content).

Resource Cost

Another issue with ChatGPT and similar AI-based systems is the sheer cost to run and maintain them. The smarter the model, the higher the resource utilization. OpenAI’s ChatGPT may cost around $100,000 a day to continue running. Per that article, at least eight GPUs are in use to operate on a single ChatGPT. How is that going to scale with potentially billions of users?
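For a rough sense of scale, here is some back-of-the-envelope arithmetic on that figure. The only input taken from above is the estimated $100,000 a day; the `load_multiplier` parameter and the assumption that cost grows linearly with usage are mine, and real costs may scale quite differently.

```python
DAILY_COST_USD = 100_000  # the estimate quoted above, not a measured value


def projected_cost(daily_cost, days, load_multiplier=1.0):
    """Project total cost, assuming (hypothetically) that cost scales linearly with load."""
    return daily_cost * days * load_multiplier


# One year at the quoted daily rate.
yearly = projected_cost(DAILY_COST_USD, 365)
print(f"One year at current load: ${yearly:,.0f}")

# Purely illustrative: what ten times the usage would cost under linear scaling.
yearly_10x = projected_cost(DAILY_COST_USD, 365, load_multiplier=10)
print(f"One year at 10x load:     ${yearly_10x:,.0f}")
```

Even before any growth in usage, the quoted rate works out to $36.5 million a year.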

As I discussed in my own article on GPT (here), resource usage for even the “dumber” models is pretty high (when you take into account scale). Barring serious technological advancements, AI and machine learning are just going to get more resource intensive. Every new parameter adds to the cost to train and operate.

Conclusion

AI and machine learning are amazing tools, but like any tool, they have a purpose and a use. Tossing in something which can only parrot information or statements (equally weighted), without understanding or the capacity for context, is going to make AI search a complete nightmare (until the technology matures). We’re getting a glimpse of a future, but there’s a long, long way to go before it will make sense.

Make no mistake, this is the technology of the future; it’s just in its infancy. Search with machine learning and AI is going to be the future, but there is a ride to get there. Technology lags behind the optimism. People think machine learning and AI are far past where they actually are, and it’s creating a bubble.

The current initiatives are over-hyped. They overpromise and underdeliver. We get a glimpse but not a taste of the future, and that’s what’s most damaging.

It’s easy to adapt when things get easier; it’s harder when you have to move backwards after making up your mind. People will buy into the technology and realize it doesn’t work. It’s smart enough to fool some, but not smart enough to actually deliver. The nightmare isn’t the technology; it’s the failure of society as a whole to adapt to it. I’d love to be wrong, but the cracks are already showing.

AI is the future, but we’re not there yet.

Image by Gerd Altmann from Pixabay

Some Dude: