New artificial-intelligence (AI) products like ChatGPT have gotten a lot of attention in recent months (including here on this blog) for their ability to write articles and term papers, or even create new programs. The technology has been hailed as a momentous turning point, equally celebrated and dreaded.
Yet ChatGPT’s visual equivalents, image-generation products such as Stable Diffusion, DALL-E and Midjourney, have so far received relatively little attention. That’s surprising because, if anything, AI images have the potential to be even more disruptive. Let’s take a look at how these products work, what they can do and what they might mean for you.
Machine Learning and AI Images
ChatGPT and its competitors, like Google’s Bard, are known as “large language models,” or LLMs. That means they’re trained on huge swathes of text, so they can “learn” and replicate how humans use language. The big players in the industry are vague about exactly which source texts they’ve used, but the training data has included published books and articles, as well as plain old social-media chitchat.
AI image-generation products apply the same basic technique to images, analyzing billions of them to identify everything from the subject matter (coyote, peony, human child) to subjective descriptors (moody, beautiful, somber, tall). A fully trained model can then use what it’s learned to generate new images (hence the term “generative AI”) by combining all of those elements in different ways.
To create an image, you simply type in a text description of what you want, which can be anything imaginable, such as:
- “Joseph Stalin in a ball gown, dancing with Fred Astaire”
- “A dachshund riding a very small tandem bicycle”
- “A shark’s fin in a child’s swimming pool”
You can then tweak the output until it’s exactly what you want. The end result can be impressively realistic, looking — at its best — like the work of a professional graphic designer (to be clear, graphic designers’ skills are hard to replicate, and most AI output will fall short of that mark).
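If you’re curious what that looks like in practice, here’s a minimal sketch that runs Stable Diffusion locally through Hugging Face’s open-source diffusers library. (This assumes a machine with a capable GPU; the model name shown is one common choice, and versions change over time.)

```python
# Minimal text-to-image sketch using the open-source diffusers library.
# Assumes a CUDA-capable GPU; the model name is one common choice and
# may change over time.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# The prompt is plain English; tweak it and rerun until you get what you want.
image = pipe("A dachshund riding a very small tandem bicycle").images[0]
image.save("dachshund.png")
```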
So What’s the Problem With AI Images?
All of this sounds innocuous enough. It’s amusing, and it will doubtless unleash a wave of creativity in people who have good visual ideas but lack the skills (and the years of training those skills represent) to create images conventionally.
The problem is that humans are a pretty visual species, and we tend to put a lot of trust in what we see. How often have you heard people say things like:
- “Seeing is believing.”
- “I saw it with my own two eyes.”
- “A picture is worth a thousand words.”
- “The camera doesn’t lie.”
OK, people don’t say that last one very much anymore, because we all know the camera does lie. Faking photos used to rely on clumsy tricks in the darkroom (literal cutting and pasting), which were relatively easy to detect. The arrival of Photoshop and digital photography made fakery easier to do and harder to detect, and “deepfakes” in both still-image and video form are now a growing problem.
AI takes that problem to a whole new level by stripping away the barriers (costly software, skilled users) to the creation of fake imagery and putting it in the hands of anyone and everyone. Adobe’s own AI product, Firefly, can even be used by non-technical users to automate previously difficult tasks in Photoshop. Given our instinctive trust in what we see, this raises a whole host of issues with potentially troublesome outcomes. This is why Microsoft president Brad Smith has described deepfakes as his biggest concern about AI (Microsoft is a large player in the AI field).
How AI Images Can Be Problematic
Let’s take a look at one of the first AI-generated images to make headlines: the now-infamous fake photo of the Pope in a puffy white Balenciaga-style jacket. It was widely shared on social media, despite many signs that the image was faked. It was all in good fun (the creator just thought it would be funny), and the image was originally shared on a subreddit explicitly devoted to AI art, with no intent to deceive.
Other fakes are more political, like the AI-generated images of Donald Trump being arrested, or the one of Trump kneeling in prayer (which the former president himself released). Still others are designed simply to stir up trouble and confusion, like an image claiming to show an explosion at the Pentagon that briefly went viral in May 2023. While that one was quickly debunked, it did cause stock markets to dip while the rumor was spreading.
It’s a sobering reminder of the adage that “a lie can circle the world before the truth has its boots on.” There’s a very real risk of malicious (or misguided) use of the technology, with the deliberate intent of crashing markets, clouding policy debates, strengthening political divisions or targeting specific public figures. It’s something we need to be aware of and prepared for as a society.
How AI Imagery May Affect You Personally
You might well be intrigued by Stable Diffusion and its competitors, and they do hold a lot of promise for legitimate use. If you run a small business and don’t have the kind of budget that includes graphic designers, you can use these tools to create usable advertising and sales materials. They’re also a tremendously capable playground for anyone who’s artistically inclined but lacks the time or training to seriously pursue that kind of career.
Yet there are several potential downsides to this technology, as well. You’ve probably seen stories of people who were “sextorted” by criminals after their racy private images were stolen. Well, now those images don’t even have to exist in real life, because it’s trivially easy for an ill-intentioned person to create them by mashing up a few photos you’ve shared on social media with the bottomless well of pornographic images available on the internet.
Similarly, divorcing spouses could hypothetically fake images of each other cheating or engaging in dubious behavior in order to influence negotiations or court proceedings around custody and division of property. There’s even a possibility that your own photos could end up becoming raw material for others to use within the imaging model, which is a whole other issue.
AI, Privacy and Intellectual Property
There are a couple of ways your own personal photos might end up (or might already have ended up) among the images used to train one or more of these emerging AI tools. The training images were scraped from a range of sites, from Pinterest and WordPress accounts to image-sharing sites such as Flickr and DeviantArt. Some of those sites explicitly license their images as freely available for reuse, but that’s not true of all of them. So if you’ve ever hosted photos on Flickr or shared pictures of your family on Pinterest, they may be part of the data sets used to train the major AI models.
Another potential privacy issue arises from actually using the model yourself. The Terms of Service on these AI products typically give the companies the right to keep refining and training their software by learning from users’ work, which means your own efforts and your own images. If you’ve used one of these AI tools on pictures of yourself and your family, for whatever reason, you may have inadvertently added your images to the bank of material the app draws from. That could mean your face showing up in someone else’s artificially generated image.
These models also make an Application Programming Interface (API) available so that third parties can build their own applications on top of them. That means developers can come up with all manner of exciting new ways to use the underlying AI product (whether that’s Stable Diffusion, DALL-E or Midjourney), but it also means each developer applies its own Privacy Policy and Terms of Service. Not all of them will necessarily be completely ethical and aboveboard in how they handle your data.
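To give a sense of how little code that takes, here’s a hedged sketch of a typical API call. The endpoint, key and response format are hypothetical stand-ins; every real service defines its own, along with its own data-handling terms.

```python
import requests

# Hypothetical endpoint and key, for illustration only. Real services
# have their own URLs, parameters and response formats.
API_URL = "https://api.example-image-ai.com/v1/generate"
API_KEY = "your-api-key"

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"prompt": "a shark's fin in a child's swimming pool"},
    timeout=60,
)
resp.raise_for_status()

# Whatever you send (prompts, uploaded photos) is now in the hands of
# whoever runs this endpoint, governed by their terms, not yours.
with open("result.png", "wb") as f:
    f.write(resp.content)
```

The point isn’t the code itself; it’s that every developer in that chain is another party handling your data.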
A final risk — if you use an AI image product for business purposes — comes from their still-murky legal status. Many of the images used in training these products proved to be copyrighted, and their creators were neither paid nor credited for the use of their work. The thorny question of how intellectual property rights apply to AI-generated images will take years and much litigation to sort out, and in the interim there’s a remote but non-zero chance your own business might be faced with take-down notices or even a civil suit.
Is Stable Diffusion Safe?
So what’s the bottom line if you’re intrigued (or maybe frightened) by this new technology? Well, there’s definitely potential for misuse. The ability to generate and animate AI images, and even to realistically fake your voice from just a few seconds of recorded audio, means anyone with a grudge could produce convincing images, videos or audio clips that appear to show you saying or doing things that are uncharacteristic, hateful, embarrassing or downright sexually explicit. One high-profile tech site has even suggested that it might be time to take down whatever photos you’ve posted online.
We’ve previously discussed the idea of making your social media private wherever possible, but pulling down all of your photos isn’t easy or practical (how many have your friends and family members posted?). It may not even be necessary: major industry players have a vested interest in making sure AI respects users’ privacy, and they’re introducing measures such as digital “watermarks” and embedded metadata that identify their images and videos as AI creations.
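Some of that labeling already exists in rudimentary form: many AI tools write their generation settings into the image file itself, so a quick metadata check is sometimes revealing. Here’s a minimal sketch using the Pillow library; note that a missing entry proves nothing, since metadata is trivially stripped.

```python
from PIL import Image

img = Image.open("suspect.png")

# Many Stable Diffusion front ends store the prompt and settings in PNG
# text chunks (often under a key like "parameters"). Finding one suggests
# the image was machine-generated; finding nothing proves nothing.
for key, value in img.info.items():
    print(f"{key}: {value}")
```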
There should soon be consumer-ready tweaks to your cell-phone camera and social-media apps that let you proactively “immunize” your photos against misuse. Researchers at MIT, for example, have demonstrated one such technology, which they’ve dubbed “PhotoGuard.” It’s not yet ready for widespread use, but it points to where the technology is headed.
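For the technically curious, the core idea behind PhotoGuard is an adversarial perturbation: pixel changes too small for a human to notice, chosen so that an AI model’s image encoder produces a useless representation. The toy PyTorch sketch below illustrates that general idea only; it is not MIT’s actual code, and the encoder, loss and step sizes are stand-ins.

```python
import torch
import torch.nn.functional as F

# Toy illustration of adversarial "immunization" (not MIT's PhotoGuard code):
# nudge the image, within an invisible budget eps, so that a model's encoder
# maps it to a useless (here, all-zero) representation.
def immunize(image: torch.Tensor, encoder, eps=0.03, steps=40, step_size=0.005):
    delta = torch.zeros_like(image, requires_grad=True)
    target = torch.zeros_like(encoder(image))  # push the embedding toward zeros
    for _ in range(steps):
        loss = F.mse_loss(encoder(image + delta), target)
        loss.backward()
        with torch.no_grad():
            delta -= step_size * delta.grad.sign()  # signed gradient step
            delta.clamp_(-eps, eps)                 # keep the change invisible
        delta.grad.zero_()
    return (image + delta).clamp(0.0, 1.0).detach()
```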
These AI image-generation products certainly merit some wariness on your part, because of their potential for misuse, but unless you’re a prominent figure you probably don’t have a lot to worry about just yet. So go ahead and enjoy using these tools, if you want to. Just be wary about using them with your own photos.
Sources:
- Ars Technica – Adobe Photoshop’s New ‘Generative AI’ Tool Lets You Manipulate Photos with Text
- Reuters – Microsoft Chief Says Deep Fakes Are Biggest AI Concern
- The Verge – The Swagged-Out Pope is an AI Fake – and an Early Glimpse of a New Reality
- Buzzfeed News – We Spoke to the Guy Who Created the Viral AI Image of the Pope That Fooled the World
- BBC News – Fake Trump Arrest Photos: How to Spot an AI-Generated Image
- The Washington Post – A Tweet About a Pentagon Explosion Was Fake. It Still Went Viral.
- Waxy – Exploring 12 Million of the 2.3 Billion Images Used to Train Stable Diffusion’s Image Generator
- The Verge – Getty Images Sues AI Art Generator Stable Diffusion in the US for Copyright Infringement
- Ars Technica – Microsoft’s New AI Can Simulate Anyone’s Voice with 3 Seconds of Audio
- Ars Technica – AI Image Generation Tech Can Now Create Life-Wrecking Deepfakes with Ease
- Bloomberg – Google Launching Tools to Identify Misleading and AI Images
- MIT Gradient Science Blog – Raising the Cost of Malicious AI-Powered Image Editing