Old 2023-10-14, 16:52   Link #46
Renegade334
Sleepy Lurker
Sincere apologies for the multi-post, but there is a character limit for each post - and the prompts gobble up that allowance VERY quickly. Again, I apologize for the trouble I may have caused and the forum guidelines I may have unintentionally wrinkled.

Now, a few written notes, conclusions and opinions to get back into the mods' good graces and justify the pic-posting spree:
  • AI image generation is FAR from perfect. More often than not, especially if you're still feeling your way around, it'll take you a half-dozen tries (if not MUCH longer) to figure out what works and what doesn't in the prompts. It's easy to get what, at first sight, looks like a winner, only to realize that there are flaws here and there that will have you start SD back up and try again.
  • AI image generators are NOT chatbots and therefore lack the latter's linguistic proficiencies. Instead of full sentences, they work best with terse, comma-separated keywords and tags that may sound janky to your ear, but this is actually how AIIGs dissect and conceptualize their compositions, e.g. "dramatic lighting, film grain, anime, 4K, wallpaper, extremely detailed, ink coloring, one boy, red hair, blue shirt, tan slacks, white shoes, casually walking down a street, wet cobblestone, cars parked, blue sky, sun, lens flare, water puddle, reflections" (there's a rough code sketch after this list showing where such a prompt goes). There is nevertheless a LOT of trial and error here, because, for example, the engine could create a picture where the street is completely flooded or is essentially a canal (saw it happen several times myself). While playing with the guidance scale (see glossary at end of post), you must also make allowances for the engine and realize that your bar might be set too high. You'll be compromising a lot in the end, settling for the least bad or best-looking image you've generated so far.
  • A lot of the generated images will still need a fair deal of post-processing or Photoshopping - correcting fingers, lazy eyes and whatnot.
  • Hands, eyes and fingers are AI generators' weak spots, especially with Stable Diffusion. Such software has trouble properly forming these body parts UNLESS you add dedicated directives to the negative prompt (see glossary), like "fused fingers, bad hands, missing limbs, extra limbs, extra fingers" - the sketch after this list shows where these go. Even so, you can still get monstrosities - and sometimes you realize that less is more, that piling more interdicts into the negative prompt actually increases the chances of such aberrations showing up. It can be VERY frustrating.
  • Bigger image sizes often translate to better quality: the engine has more real estate to work with and more room to correct potential errors, and you end up with higher resolution and more detail.
  • The hunt for bigger image sizes also drives the longing for better hardware and more VRAM. There is only so much that launch parameters in Stable Diffusion's main batch file (webui-user.bat) can do (see the VRAM notes after this list). I've had countless crashes due to memory fragmentation or insufficient VRAM, but I was still able to generate some good pics.
  • The issue of copyright: curiously, I wasn't able to find Harry Potter or Daniel Radcliffe character models (LoRAs), which makes me wonder whether the actor or Warner Bros or some agency/org/gov bureau put their foot down (cease-and-desist) to protect his image. I was still able to generate Daniel Radcliffe pictures, mind you, but the absence of a LoRA is puzzling given the actor's popularity and that of the Harry Potter fanbase. Anyway, this is an extremely gray area and I understand a lot of people may be uncomfortable with it for a variety of reasons (remember those AI pics of Donald Trump trying to evade arrest?), especially artists who dread seeing randos on the Internet generating images on potato PCs good enough to be passed off as genuine articles made by said artists... For one, I know that Times Magazine is pissed at Stable Diffusion scanning their covers and other photographs for training material, and is currently threatening legal action. And that's not counting well-known people who are seeing their image used without permission, more often than not to endorse stuff they might not be okay with, and possibly for worse ends (porn deepfakes). This is where a lot of (legal) trouble awaits.
  • That said, there are some uploaders that DO specifically FORBID the use of their files for commercial purposes. Good.
  • As realistic as some models are, there are still easy giveaways that the pic isn't as authentic as it flaunts itself to be: objects that should have straight lines but don't, body horror (wrong body proportions, extra limbs, etc.), badly modelled items (like a sniper rifle or gardening tool that looks like it was made from plastic and putty), perspective incongruities, etc.
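
For those who'd rather see the prompt mechanics in code than in web UI screenshots, here's a minimal sketch using the Hugging Face diffusers library rather than the AUTOMATIC1111 web UI I've actually been running. The checkpoint name and the prompt are just placeholder examples - swap in whatever model you downloaded:

[CODE]
# Rough sketch with the Hugging Face diffusers library (NOT the web UI I used);
# the checkpoint name and the prompt below are only examples.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # any SD 1.5-class checkpoint you have
    torch_dtype=torch.float16,          # half precision to spare VRAM
).to("cuda")

# Positive prompt: comma-separated tags, not full sentences.
prompt = (
    "dramatic lighting, film grain, anime, 4K, wallpaper, extremely detailed, "
    "ink coloring, one boy, red hair, blue shirt, tan slacks, white shoes, "
    "casually walking down a street, wet cobblestone, cars parked, blue sky, "
    "sun, lens flare, water puddle, reflections"
)

# Negative prompt: everything you do NOT want (the hand/finger fixes go here).
negative_prompt = "fused fingers, bad hands, missing limbs, extra limbs, extra fingers"

image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    guidance_scale=7.5,            # how strictly the engine sticks to the prompt
    width=768, height=512,         # bigger canvas = more detail, but more VRAM
    num_inference_steps=30,
).images[0]

image.save("street_scene.png")
[/CODE]

Same knobs as in the web UI, just spelled out: prompt, negative prompt, guidance scale (CFG) and image size are the four settings I keep coming back to.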
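
And about those VRAM woes: on the web UI side the usual remedy is adding flags like --medvram (or --lowvram) to the COMMANDLINE_ARGS line in webui-user.bat, but as I said, there's only so much they can do. If you go the diffusers route from the sketch above, the rough equivalents look like this (again, just a sketch):

[CODE]
# Memory-saving switches in diffusers (rough equivalents of the web UI's
# --medvram style launch parameters); apply them to the pipe from the sketch above.
pipe.enable_attention_slicing()      # slices the attention computation: slower, but less VRAM
# pipe.enable_model_cpu_offload()    # parks idle submodels in system RAM (needs the accelerate package)
[/CODE]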

It's an interesting experience, I must say, but it's not the end-all, be-all. It still makes you do a lot of finagling afterwards. There is still a LONG way to go before we arrive at a point where genuine and 100% digitally fabricated pictures become indistinguishable from one another.


Also: I apologize if some of the prompts got mixed up. I've been alt-tabbing from Imgur to notepad++ and my Stable Diffusion folders to copy-paste links and descriptions and...I've had mishaps, which I've tried to correct as much as possible, but I fear there still might be accidental mismatches and I'm really tired right now.
TL;DR…
glossary and how to read the prompts/settings
Last edited by Renegade334; 2023-10-16 at 13:59.