Once upon a time, there was a profession that thrived. Highly valued professionals provided essential services that nobody else could, and charged for them accordingly. Then, seemingly out of nowhere, something new appeared, and things changed. People who devoted many years to perfecting their craft found themselves limited to a few, mostly mundane things. The value of their work plummeted. So did their numbers. Nowadays, you can only do it if you either branch out and have other sources of income (or rich parents) or you actually want to be that starving artist.
This profession is called blacksmithing.
Before the industrial revolution, a smithy was the centre of every town and village that had one. A good sword cost a fortune; a chainmail shirt required months of producing tiny rings, then linking them by hand with tiny rivets. Today, there’s very little need for blacksmiths. A few get jobs on movie sets. Some specialise in producing expensive, intricate swords or knives for collectors. There are still farriers who shoe horses, although mostly the horseshoes come from China. Very few are actually so good and educated that in the blacksmithing world they’re basically Gods, and can amass, oooh, up to 0.001% of the Instagram followers a Kardashian has. But really, who cares if my gate is one out of 50 thousand identical ones, when it costs me $500 rather than $5000? It’s not like anybody can tell.
Look at this guy. Holy monopoly, I so would.
Unfortunately, he doesn’t exist. What exists is “wide portrait of a young tattooed man in Iceland on a rainy day, wearing open leather biker jacket, longhaired, bearded, blonde; muscular, handsome, resting on a tough day, profile picture, stormy seas, documentary, oscar winning; perfect face, anatomy, eyes; skin detail, wrinkles, 8k; sharpened; high resolution, denoise”
In the style of Greg Rutkowski
In October 2022 I started to see articles about AI stealing digital artists’ jobs. Someone called Greg Rutkowski complained about it in multiple interviews. I became very curious and googled those AI websites, and found one called Stable Diffusion Web.
I couldn’t stop obsessively using it over and over, because it was hilarious. I mean… don’t you just love this Viking warrior surrounded by fire in the style of Greg Rutkowski?
The monstrosities it created “in the style of Greg Rutkowski” made me declare digital artists really had nothing to be afraid of and that he definitely got a lot of publicity from that, so well done, Greg.
I said that on October 26, 2022. Which, as I’m writing this, is two months and eight days ago. Let’s generously round that up to 10 weeks. Boy, those words aged worse than Musk’s Twitter Blue.
Here’s a “Viking warrior surrounded by fire” I generated earlier today on my own MacBook Air. (The prompt is very long and was honed for days; so is the negative prompt, an option that didn’t exist when I produced the art above. And yes, the words “Greg Rutkowski” feature in it.) The word “earlier” is important.
Ten weeks later.
I’m not chuckling anymore. The lighting isn’t great (but this was earlier today, remember), the forearms and the flames don’t quite work together, it’s still not exactly what I want, but I am really not chuckling anymore at all.
Artists are screwed sideways with a chainsaw
As an indie fantasy author, I originally wanted to commission an illustration for the cover of Children. I looked around and found out that good artists 1) are booked many months in advance, 2) charge prices that begin at $1000 if you don’t ask for too much detail. They’re popular, pricey, and busy for a reason: they’re that good, and it did not happen overnight thanks to a magical talent-sprinkling fairy.
Digital artists don’t just create fantasy book covers, of course. They work in the gaming industry, both for computer and card/board games. There are graphic novels, record covers… That’s just off the top of my head (I had never paid a lot of attention to what digital artists actually do until a few months ago…) I didn’t pay $1000, partly because I didn’t have $1000, settling for a composition of stock images I put together/filtered/layered/transformed myself. If you don’t count the cost of a Photoshop subscription and a few weeks of my own work, I think it cost me about $100. It’s not what I originally wanted, but I’m happy with it. I have also worked as a graphic designer for over 15 years, studying my craft, practicing, learning.
Those artists have studied their craft for years, practiced, and learned too. I used to be – past tense – one of the best designers in Europe. Since my burnout ten years ago I haven’t done much apart from my own book covers and banners, but I still know what to do in order to achieve the exact result I want (unless it’s ice – you’ll see later). Similarly, those artists are in demand, because they know what to do and how to do it. A part of that $1000 is the price of the years or decades it took them to get where they are professionally.
Then, suddenly, there is a piece of software that renders (sic) them obsolete. They’re being ripped off, too. The algorithm doesn’t actually enter Greg’s brain while he is asleep, cause him to create a masterpiece and upload it online together with an invoice. The algorithms are not “artists.” They use databases of images “found” on the Internet. Rutkowski has a large portfolio to steal, I mean – learn from. My subscription to Deep Dream Generator costs $19.99 a month. Every time I produce another image Rutkowski receives $0. And so do Rembrandt, Randy Vargas, Van Gogh, and Tom Bagshaw, who apparently collaborate these days (for free!) once you add all their names to the prompt – the description of what you want.
The implications of this are vast, both ethical and legal ones. First – who is the author of my Viking warrior image above that I created earlier today? Second – who is its owner?
The Blue King
We’re going very, very far back in time (one month), to December 3, 2022.
I have a very specific vision for the cover of my sixth book (I have published three so far and I’m nowhere near finishing the fourth, but apart from that). A Black man with an ice crown on his head, sitting on a broken wooden throne, crying tears of ice, in, you might have guessed, Iceland. Here’s what DALL-E gave me.
That wasn’t exactly it. Especially the ice bit. And also the crown. But apart from that, I had the pose, which I then ran through another AI website, Deep Dream Generator.
The throne changed into fur and I suddenly discovered I loved that much better. While you might notice his hands are slightly off, just a tad, he has a rather realistic face. (It helps that his eyes are shut.) I used Photoshop to get rid of the warm colours, then ran it through DDG two or three more times.
Later I added layers of ice to obscure the, uh, extremities, used filters to change his facial expression, placed an ice tear in the corner of his right eye, did the typography, logo, etc. Here is The Blue King:
There are a few things that don’t look the way I’d like them to, for instance the ice – turns out that you can be a designer for 15 years and struggle to create something looking vaguely like ice. This, however, is THE cover. This is what I want. I would never be able to explain it with words or draw a sketch myself. The only changes I need from an actual artist are to fix the hands and make the ice look realistic. Which isn’t completely unlike blacksmithing being reduced to shoeing horses.
But.
Here is my Deep Dream Generator prompt: “hyper-detailed hypermaximalist epic fantasy digital art; accurate anatomy and eyes; weary grizzled bearded scarred blackskinned king in iron crown, leather boots, fur cloak, sitting on frozen ice throne; frozen lake, frost, glacier mountains in the background; trending on artstation; oil on canvas; fantasy book cover; greg rutkowski, randy vargas, diego gisbert” I don’t remember the DALL-E prompt that gave me the original crayon drawing.
Suppose I now ask another artist, Jane Smith, to paint exactly this for me, just fix the ice and the hands. Now it’s no longer AI-generated, right? But is it original work or plagiarism? I don’t want something in the style of, I want this exact image with some touch-ups. If Jane agrees to do it and I pay her $1000, does she contact Greg, Randy, and Diego to ask how much of their style, percentage-wise, they see in this image, so that she can pay them a percentage of that $1000? How much of it should go to artists whose works trended on Artstation on December 2, 2022 around 10pm CET?
I’ve read an article suggesting that if I work with long, elaborate prompts, adjusting them, that means I am now a creative, an artist working with tools. Who defines “long, elaborate, creative” though? Should it be a particular number of words? Versions? I’ve spent a lot of time with Photoshop, surely I deserve some credit? Or should the prompts and generators get credited? If I put “illustration by Jane Smith” on the copyright page, is that really true? Deep Dream Generator’s terms and conditions are very vague about the question of ownership and can be summed up with “we’re not responsible for anything.” Can I sell this image?
Some do.
Fractal Noise
Not long ago, the cover for Fractal Noise by Christopher Paolini was revealed by its publisher, Tor. It didn’t take The Internet long to figure out the illustration wasn’t human-made. (Mind, that was over a month ago. The Internet would need a bit more time now.)
Tor’s defence makes sense and doesn’t. They didn’t know it was made by AI when they bought this image from “a reputable stock house.” It’s hard to believe that having to add the other leg didn’t alert the uncredited “house designer” that maybe this wasn’t actually human-created artwork, but sure, Jan. “Due to production constraints, we decided to move on with our current cover […] Tor Publishing Group has championed creators in the SFF community and will continue to do so.” Some called this tweet “a weak apology,” but nowhere does it feature the words “sorry” or “apologise,” nor does it say this won’t happen again. There is no legal definition of “championing creators.”
But who is really at fault here? Tor bought a stock image from a “reputable stock house.” Legal. The stock house bought the illustration from a person named Ufuk Kaya. Unless that stock house explicitly states that AI-generated imagery is not allowed there, that is also legal. And now we enter the grey zone of the question whether Kaya is the owner of this image. What was entered in the prompt? Did it feature any artists’ names? Did he repaint something before selling it?
A member of Reddit’s forum (subreddit) /r/stablediffusion used an SD database created by another Redditor to generate illustrations for his game. When he moved to a gaming subreddit to show his game to people, he was yelled at for using AI illustrations. He could never afford to commission them all, even if he got a 50% discount and only had to pay $500 for each. Interestingly, this is the same argument “pirates” use to explain why they won’t pay $3 for a book. But that book already exists. Someone, say – me, spent lots of time and had lots of nervous breakdowns writing it. Those illustrations hadn’t existed before the Redditor generated them. But elements of them had.
It could be said that there is no such thing as 100% original art. My own writing is influenced by Michael Cunningham, Marian Keyes, Terry Pratchett, Joanna Chmielewska. In Why Odin Drinks I use lots of 80s pop culture references and the sequel features Queen Taylor – and as many Easter eggs (Taylor Swift’s song titles) as I can cram in without getting sued. Can I really say, though, that I am the sole creator of this 100% original work? I made up a few words, so those are definitely 100% mine. I’m also a human, though, so I deserve compensation (although “pirates” would disagree) for my work. However, I am also a human when I keep tweaking my prompt and switching models to get the just right image of my Viking surrounded by flames – like I got the just right image of The Blue King. I need to find the right combination of words in the right order, i.e. the prompt, and the right model.
Model headaches
Stability.ai, the creators of Stable Diffusion – the best known generator – announced model (version) 2.0 not long ago, following it almost immediately with 2.1, because in their attempts to avoid anything that could lead to producing pornography they accidentally made it nearly impossible to produce an anatomically correct, fully clad human being in possession of a non-scrambled face.
Those models are trained on databases. A database is a massive collection of images. Stability.ai’s code is open-source, meaning that anybody can modify it and train it on their selection of photos. Unsurprisingly, there are already models “specialised” in porn. I’ve only seen naked adult women so far – I won’t link to the gallery (you can probably find it at /r/stablediffusion) but they’re… photographs. Child porn models either already exist, or will appear within weeks. Or days.
I have been playing with Redditors’ altered models. There is one that creates manga, another – pencil drawings, etc. The one thing they share is that the resulting images are small – 512×512 pixels, or for version 2.1 – 768×768. However, in my quest for the perfectly lit Viking I created one that I almost liked, then googled “upscaler,” doubled the resolution, put the larger image through “face restoration” and this happened:
He’s gorgeous, the quality is shockingly good, and it’s not what I want to see.
Whereas with The Blue King I got what I wanted, and that now poses a problem, I don’t need this Viking for any particular purpose. All I want is for him to be illuminated only by a bonfire. (He’s supposed to be sitting by it, but even I have difficulties figuring out the logistics, since I want to see the face, the body, and the fire in front of him. Also, apparently “sitting” is not the word you use to make the model sit down.) I don’t want a bright studio light or daylight through trees or even a cloudy evening sky. Night and fire.
It took me a few days to find out (from a Reddit post, obviously) that in order to get rid of daylight Stable Diffusion 1.5 requires the exact phrase “outside at night” – not “starry night” or “midnight” or “dark night,” etc. But I had already moved on from Stable Diffusion to something called Protogen V2.2, which was uploaded to the model database this morning. You remember when I said “earlier” at the beginning of this post? This image was generated while I was writing the post.
He has too many fingers and that left leg is somewhat… long, but I finally got the result I wanted, which is a man at night with the only light coming from the bonfire. He’s even almost sitting! I can now boost the resolution four times, use Photoshop to remove the extra finger and shorten the leg, and run it through Deep Dream Generator, producing the perfect result. Gods know it’s taken long enough! (Two days. It’s taken two days.)
But wait. Turns out that there is also Protogen 3.4… download, install, tweak prompt, and so… this is how the quality evolved over the last ten weeks.
What will I be able to render on my little MacBook Air another month from now?!
Maybe a godsdamn anvil
Here are a few examples of what AI thinks an anvil looks like. The prompt for every single one was “photorealistic blacksmith’s anvil.” (That fairytale village is also a “photorealistic blacksmith’s anvil.”) There seems to be no combination of words that can produce a blacksmith’s hammer, an anvil, the process of forging, or the inside of the smithy. Nobody has trained a model to do that yet. I will admit it’s quite a specific request and there is a chance that more people want to see naked women than anvils, but still…
BUT.
Using a model released yesterday I suddenly produced this. The first blacksmith was supposed to be holding a sledgehammer. The second – to stand outside his smithy.
Anvil or not, sledgehammer or not, those are very clearly blacksmiths wearing leather aprons. You could even argue that the first one is simply using a gas torch (this is not what a gas torch looks like).
In part two…
…of this already disturbingly long post I’ll try to cover more problems that the AI generators pose – for authors, musicians, celebrities, politicians, photographers, and you personally. By the time I get to writing it we’ll probably live in a completely different world. One where digital artists without much experience get paid $10 to remove an extra finger or shorten a leg, and some really rich authors/companies/publishers boast that they were so invested in their product they actually splashed out on a non-AI image.