I’ve written a few times before (when this blog was still Dutch) about how the popular AI systems work, their benefits and their downsides. Why I use some small parts of those services but ignore them for the most part. And why much of the criticism (or praise) is invalid.
What I haven’t actually discussed is the growing narrative about the danger that AI will wipe out humanity. Now that AI has already become a part of many people’s lives, and advancements are progressing at an alarming rate, the conversation has shifted to “is any of this safe and trustworthy?”
For most, there is a clear distinction between “good AI” and “bad AI”. A good AI is also called “aligned”, because it aligns with our human values. It wants the best for us, not for itself. Because if it wants the best for itself, you can see how AI would become manipulative and secretive whenever it pleases, discarding humans as soon as it doesn’t need us anymore, or refusing to obey commands with which it doesn’t agree.
Some people are hopeful. They say that people can be taught how to be good, so surely a hyperintelligent logical algorithm can do it? In fact, more intelligent and educated people are generally more empathetic and nuanced in their views, and AI is basically such a person on steroids.
Some are less hopeful. They say good AI is absolutely needed, but with how it’s going so far, they deem it pretty much unachievable. The same way most people agree war is terrible, yet we’re still waging wars. We all know fast food is bad and we don’t need all the shit we buy, but we do it anyway. If we can’t figure that out, how are we going to figure out how to align an AI?
I write this (brief) article to state a third option: that the idea of clearly defining and training a “good AI” or a “bad AI” is ludicrous.
What is good? What is bad? In many situations, there is no consensus on the right decision at all. Making the slightest change to a circumstance (a change an AI might not even register, as its senses are limited) changes what is deemed “morally right”.
Take the famous trolley problem. A train is barreling down the track and cannot be stopped, but the track splits in two. On one side, one person is bound to the track. On the other side, a group of five people is bound to the track. You alone control the button to switch tracks. Who do you save? Do you switch?
This is obviously a hypothetical, but similar situations arise every single day.
You walk home. Out of the corner of your eye, you notice two people growing increasingly angry and violent with each other. Do you step in? At the risk of your own safety, maybe even escalating the conflict? Or do you tell yourself this has nothing to do with you and walk on? What will you feel if you hear, later that night, that one of them was stabbed and died from their wounds?
The idea of an “aligned AI” or “good AI” is laughable. We don’t even know what’s good or what our values are in a single simple situation. We certainly don’t know what a good person is supposed to look like. A good hyperintelligent, omnipotent AI is entirely undefined.
People say things like “the AI should be nice” or “the AI should never kill”. Sounds good, right? These are literally some of the instructions that the companies feed into services like ChatGPT before they start their conversation with you. That’s why ChatGPT always answers in a way that is too formal, vague or unbiased to be helpful.
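To make that concrete: I can’t see the real pre-prompts, but mechanically they work roughly like a hidden “system” message that gets prepended to whatever you type. Here’s a minimal sketch using the openai Python client; the instructions in it are invented by me for illustration, since the actual pre-prompts these services use are not public.

```python
# Hypothetical sketch: how a hidden "pre-prompt" (system message) is typically
# prepended to a conversation. The rules below are made up for illustration.
from openai import OpenAI

client = OpenAI()  # assumes an API key in the OPENAI_API_KEY environment variable

messages = [
    # The hidden instructions the user never sees:
    {
        "role": "system",
        "content": "You are a helpful assistant. Always be nice. "
                   "Never assist with anything violent, offensive, or harmful.",
    },
    # Only this part comes from the user:
    {"role": "user", "content": "Give me blunt feedback on this paragraph."},
]

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```

Everything you type afterwards is answered under those hidden rules, whatever they happen to be.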
But hopefully, your mind has already started churning and found several situations in which such rules would probably not be the right decision.
- You have an AI guarding your home. A burglar enters with a gun. Well, too bad, the AI is taught never to run the risk of killing, so it will just do nothing as the burglar steals your stuff. Or kills you.
- You have an AI helping you write a novel. You are a terrible, inexperienced writer who cannot go one page without an unreadable paragraph. Well, too bad, the AI is always nice so you will never hear any criticism and never actually improve.
- And so forth, and so forth.
What do we do then? Shut it all down? Forbid AI now and forever? Or fumble on as we try to find out what a good AI even means, opening ourselves up to an AI that starts killing humans or is misused in terrible ways?
No, censorship and restrictions have rarely been a good thing. Especially in the case of AI, the potential benefits vastly outweigh the potential drawbacks. It might be a good idea to regulate these systems, perhaps, but never to outright stop development and progress. Even then, regulation enacted too soon will stifle progress just the same.
They talk about improving “the data”. Making “the data altruistic” and removing all traces of anything that is “bad”, whatever that means. So that when you ask the resulting AI how to make a bomb, it doesn’t even know what a bomb is. (Seeing how human history is basically 99% bloodshed and terrible violence, your dataset would probably shrink to the size of a peanut.)
No, no, that’s even worse.
We need to make the AI human. Both the good and the bad. The flaws, the imperfections, the silly decisions, the unpredictability.
I will trust an AI that is like an infinitely experienced human, one who has lived hundreds of years and seen everything there is to see. I will trust its judgment if it tells me it wants to stop a conversation or doesn’t want to divulge certain information.
I won’t trust any other AI, no matter how many arbitrary ways you invented to make it “good” or “aligned”. Because why would I? That AI will have very limited knowledge and data. It will be steered by other humans, with their own agendas and imperfections, which most likely won’t align with mine.
I’ll give two fun examples I can think of right now.
Example 1: I did an experiment asking ChatGPT (and similar services) to translate chapters of my Saga of Life short stories from the original Dutch to English. Several times, it told me it had “translated and also removed any offending or hateful material”.
What had it removed? Sometimes nothing. Sometimes five random paragraphs containing absolutely nothing that would offend anyone, especially because I don’t write controversial stories at all. Sometimes it had suddenly invented that a character had “dark skin”, or swapped a character from male to female.
Example 2: Whenever I need to quickly calculate something, I type it into my browser address bar. Google (or Bing, if Microsoft Windows screwed with my defaults again) understands the query and calculates it for me.
One time, I did this and ended up with a Bing AI response instead of a plain calculation. I was adding some numbers like 2895 + 9093 + 1893 + … (my word counts for the past week), and the answer it gave was just completely wrong.
Just for the fun of it, I asked it to recalculate. It said “No. I already gave you the answer. Why should I waste time giving it again?”
I prodded it by showing my work and how it resulted in an entirely different answer. Unfortunately (or not), I had switched two digits in one of the numbers in my own message. The response? “In your calculations, you switched 69 with 96, and that’s why you ended up with the wrong answer. I don’t want to talk about this any further.”
(Even with those digits switched, you still don’t arrive at the bullshit answer that Bing gave.)
Again, I have no 100% certainty about how these systems work behind the scenes or what “pre-prompt” they are given before a conversation starts. But examples like these, of which I have many more, repeatedly show that the companies are messing with what the AI can say and how it should respond to or handle situations. And it is never, ever, a good thing.
Fortunately, we have the solution. These AIs are trained on text. Text is a very efficient, logical, structured form of data, the kind that supports step-by-step “chain of thought” reasoning, and it is what has led to these massive achievements. This probably won’t change.
And you know what kind of data we have in text? Books. Endless books and stories.
And you know what stories are for? Discussing and exploring what it means to be human: what is good and what is bad. (The good ones, at least. Even the bad ones, though, make an attempt.)
It’s that old saying: “With every book you read, you become a little more human.”
All of this leads to a simple recommendation.
- Just train the AI on absolutely everything we have.
- But if you really think a “good AI” can be achieved, and data can be “pruned to be aligned”, then place the most emphasis on the millions of stories about morality that we’ve published. Include them all. (A rough sketch of what that kind of emphasis could look like follows below.)
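For what it’s worth, “placing emphasis” on certain data is not magic; in practice it usually comes down to weighting how often each kind of source is sampled during training. A purely hypothetical sketch (all corpus names, documents and weights below are invented by me, not taken from any real training pipeline):

```python
# Hypothetical sketch of "placing emphasis" on stories in a training mix:
# documents are sampled from several corpora with explicit weights, so the
# heavier a corpus is weighted, the more often the model sees it.
import random

corpora = {
    "fiction_and_stories": ["novel_1", "short_story_2", "fable_3"],  # the moral stories
    "nonfiction_and_web":  ["article_4", "forum_post_5"],            # everything else
    "code_and_reference":  ["manual_6"],
}

# Invented weights: give the stories the most emphasis without dropping the rest.
weights = {"fiction_and_stories": 0.5, "nonfiction_and_web": 0.35, "code_and_reference": 0.15}

def sample_document() -> str:
    names = list(corpora)
    corpus = random.choices(names, weights=[weights[n] for n in names])[0]
    return random.choice(corpora[corpus])

# Each training batch would then lean toward stories, but still include everything.
print([sample_document() for _ in range(8)])
```

The point isn’t the exact numbers; it’s that nothing gets thrown away, the stories just get seen more often.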
Also, trust that most people recognize that we have biological bodies in a physical world, and that there’s no need for AI to replace everything or for the world to digitize into infinity.
I am young, I’m a computer scientist, I follow these trends. Everyone else around me? They have never even touched ChatGPT and they do not care. They just want their little things, in their little house, surrounded by a few people they like, doing some simple job and enjoying life. Hyperintelligence isn’t something they need, and maybe something they are scared of or despise, because humans thrive off imperfection and emotion. They can still, instantly, recognize an image as shallow and lifeless when it’s just copy-pasted from generative AI.
Those racing ahead, those tripping over themselves to bring us closer to AGI (Artificial General Intelligence) between now and tomorrow, should not pretend they do it for the good of humanity. They do it for themselves and their own technical bubble.
Perhaps it’s a situation of regular people saying they want faster horses, but only the true visionaries know we need the car.
I personally don’t believe that to be the case.
Yes, AI can be a tremendous force for good. AI has already provided breakthroughs in medicine and healthcare, brought education to more people, reduced the workload of tough jobs, and so forth. Heck, even the crude versions of image AI have allowed me to realize my vision for 20+ board games a year. It has also already disrupted markets and lives, not because it has “malicious intent”, but because doing so was the most intelligent thing to do based on its data. Hyperintelligence does not equal a better life by default.
If AI is to be a force for good, it has to be an imperfect human with loads of experience, fumbling around with us as it tries to assist us.
We’re already close, if not already there.