AnimeSuki Forum - View Single Post

Renegade334 · 2023-10-14, 16:18

Quote:

Originally Posted by scififan

I only know stable diffusion and chatgpt, but they cost money or have some restriction.

What is the step by step instruction for making anime with free AI tools, as an amateur?

What? Not exactly.

ChatGPT is free, though it does require you to register an account at https://chat.openai.com (phone number or email required). The problem is that a free account is restricted to GPT-3.5, whose knowledge base hasn't been updated since September 2021. A paid subscription gives you access to GPT-4, but IIRC there is a cap on how many queries you can make in a given time window.

Stable Diffusion is totally free and open source, and there are several distros/UIs like Automatic1111 and Easy Diffusion (which I'll call ED to avoid confusion) that you can pull off GitHub if you know your way around the command line interface and Git. There are no restrictions, but there is a learning curve and you do need relatively good hardware (a CPU beyond quad-core, at least 8 gigs of RAM and at least 8 gigs of VRAM).

ED is the one most recommended for beginners and hobbyists (I also believe it has better hardware tolerance than full-fledged A1111), as the UI is much more comprehensible than Automatic1111's and it has better support for task queuing (which comes out of the box, whereas A1111 requires a third-party extension to be installed). However, it lacks several features and capabilities: for example, it has a smaller sampler pool and lacks the ability to leverage the ESRGAN-type 4x-UltraSharp.pth upscaler, which is the best of them all, better even than R-ESRGAN 4x+Anime6B.

Automatic1111 offers more options than ED, but the UI can be a bit finicky. It nevertheless allows the addition of extensions - like OpenPose Editor, ControlNet, Regional Prompter (a must-have if you want to compose images with multiple characters generated through different LoRAs and make sure they don't blend into one another...and become clones), etc. - many of which you'll see as godsends in your quest to create more accurate and complex visual compositions. Automatic1111 is also the distro with the most available documentation lying around, especially if you want to edit the webui-user.bat file to optimize its launch settings (such as adding "--medvram" or "--xformers"). Basically, once you've gotten comfortable with ED, try Automatic1111; I made the switch two weeks ago and found that (warning: your mileage may vary!) A1111 usually produced better images.
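For anyone who would rather script the webui-user.bat tweak than edit it by hand, here is a minimal Python sketch. It assumes the stock file carries a line starting with "set COMMANDLINE_ARGS=" and ends by calling webui.bat, which is how I understand the default file to look; the function name set_commandline_args is made up for illustration and is not part of A1111.

```python
def set_commandline_args(bat_text, flags):
    """Return the text of webui-user.bat with COMMANDLINE_ARGS set to `flags`.

    Assumption: launch flags live on a line starting with
    "set COMMANDLINE_ARGS=". If no such line exists, one is inserted
    just before the "call webui.bat" line (or appended at the end).
    """
    new_line = "set COMMANDLINE_ARGS=" + " ".join(flags)
    lines = bat_text.splitlines()

    for i, line in enumerate(lines):
        if line.strip().lower().startswith("set commandline_args="):
            lines[i] = new_line  # overwrite the existing setting
            break
    else:
        # No existing setting: it has to come before webui.bat is called.
        insert_at = next(
            (i for i, line in enumerate(lines)
             if line.strip().lower().startswith("call webui.bat")),
            len(lines),
        )
        lines.insert(insert_at, new_line)

    # Note: this writes plain "\n" line endings; convert to "\r\n"
    # yourself if your editor or shell is picky about that.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = (
        "@echo off\n\n"
        "set PYTHON=\nset GIT=\nset VENV_DIR=\nset COMMANDLINE_ARGS=\n\n"
        "call webui.bat\n"
    )
    print(set_commandline_args(sample, ["--medvram", "--xformers"]))
```

The function only transforms text, so you can eyeball the result before writing it back to the real file (and keep a backup copy of the original, just in case).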

If you're a sucker for punishment or really like to look under the hood, there's the node- and flowchart-based ComfyUI that's...well...very configurable and powerful (I've heard good things about its take on the refiner system), but a potential headache for people looking for quick image generation.

Both A1111 and ED typically need 10 gigs of SSD/HDD space, but you'll quickly find that the checkpoint models are the ones gobbling up a LOT of space and I mean, a LOT. My $:\Automatic1111\stable-diffusion-webui\models folder alone contains 133 gigabytes of checkpoint files. The checkpoint models trained on SD 1.5 (which leverages training pictures of 512x512px) weigh roughly 2 GB, but those trained on the brand new SDXL engine (which made the jump to 1024x1024px, though it still does accept 768x768px pics) weigh 6.46 GB. And this is not taking into account the LoRAs (which are what you use to impart the likeness of certain themes or characters onto items in your images), which are smaller (from 30 MB to 200-300 MB depending on how well-trained they are).
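To see where the space is going, something like the following Python sketch can total up model files by extension and do the back-of-the-envelope math on how many checkpoints fit in a given amount of space. The extension list, the function names and the folder you point it at are all assumptions for illustration, not anything A1111 or ED ships.

```python
from pathlib import Path

# Assumption: these are the extensions your model files use.
MODEL_EXTENSIONS = {".safetensors", ".ckpt", ".pt", ".pth"}


def model_disk_usage(models_dir):
    """Return {extension: total_bytes} for model files under models_dir.

    Subfolders are scanned too, so checkpoints, LoRAs and upscalers
    all get counted if they sit under the same models folder.
    """
    totals = {}
    for path in Path(models_dir).rglob("*"):
        ext = path.suffix.lower()
        if path.is_file() and ext in MODEL_EXTENSIONS:
            totals[ext] = totals.get(ext, 0) + path.stat().st_size
    return totals


def checkpoints_that_fit(free_gb, checkpoint_gb):
    """How many whole checkpoints of checkpoint_gb fit in free_gb."""
    return int(free_gb // checkpoint_gb)


if __name__ == "__main__":
    # Hypothetical path: replace with your own models folder.
    for ext, size in sorted(model_disk_usage("models").items()):
        print(f"{ext}: {size / 1024**3:.2f} GiB")
```

With the sizes quoted above, checkpoints_that_fit(133, 2.0) gives 66 and checkpoints_that_fit(133, 6.46) gives 20, so the same 133 gigs holds roughly three times fewer SDXL checkpoints than SD 1.5 ones.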

There is another well-known AI image generator named Midjourney and the general consensus is that it makes more beautiful images than SD but...it's not open source, it's a paid subscription where...the more you pay, the more features you unlock and the more pictures you can generate per month. Urgh.

I know this should belong in the fanart section, but...oh, well, here - just have a few examples of what I was able to generate with Stable Diffusion. Prompts, negative prompts and other settings will be added in spoiler tags underneath along with a commentary on the strengths and downsides of Stable Diffusion generation. Also, to keep the moderator team happy, I have pruned out images judged too NSFW (and yes there are checkpoint models that do offer that option, while others are staunchly SFW). There is still a bit of skin shown (esp. beach bikini images ;-))

I apologize to the mods in advance if they feel this diverges too much from the topic of AI and veers into fanart territory.

Also, @scififan: I unfortunately don't have the patience or the in-depth know-how to make tutorials here for Stable Diffusion. The only thing I can say: lots of trial and error, and lots of Googling. And lots of patience.