Abel@lemmy.nerdcore.social

These pictures used img2img, with a sketch of mine as the base:

AI not understanding what a portrait is

In tweaking the prompt, I noticed that my model didn’t understand what a “portrait” was at all. It fared better with “face shot”. From then on, I also removed “armor” and “helmet” from the prompt, and replaced “knight” with “paladin”, as “knight” was biased towards full-armor pictures.

Even with “borzoi” and the picture, the AI wouldn’t understand what a borzoi was at all without the “dog” keyword.

These pictures used img2img, with a photo of a borzoi as the base:

not bad, but not a borzoi

I tinkered with the prompt a bit more, adding “ear, dog ears” to the negative prompt and “fluffy fur, long fur” to the positive prompt.

fluffy!

Here we begin to see better results, but I changed the fur part to “fluffy long fur” and also put in “long thin snout”.

yeah only reptilians have snouts

I added “dragonborn” to the negative prompt (this is a D&D portrait model, after all) and immediately saw better results:

dogs are back!

Just as an experiment, I now brought the prompt outside img2img and into txt2img to see how solid the prompt was (images are at a lower resolution because I used fewer steps):

solid results!

The results are surprisingly better outside img2img.

I added “watermark” to the negative prompt and brought it to 50 steps with DPM++ 2M Karras.

photorealistic much?

I began to have some deformed snout problems, so I brought it back to Euler a. Also put the height at 768 back again so the AI would be biased towards giving me closer portraits.

nope, it gave me even bodier shots
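
For anyone who would rather script these settings than click through a UI, here is a minimal sketch of the txt2img request at this point in the experiment. Everything about the transport is an assumption: the post does not name its UI, and the field names below follow the AUTOMATIC1111 web UI's `/sdapi/v1/txt2img` endpoint. The prompt strings are an illustrative reassembly of the keywords mentioned so far, not the exact prompt that was used, and `build_txt2img_payload` is a hypothetical helper.

```python
import json


def build_txt2img_payload(prompt, negative_prompt, *, steps=50,
                          sampler_name="Euler a", cfg_scale=7,
                          width=512, height=768, seed=-1):
    """Build a JSON-serializable txt2img request body.

    Field names are assumed to match the AUTOMATIC1111 web UI API;
    seed=-1 asks for a random seed there.
    """
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,
        "sampler_name": sampler_name,
        "cfg_scale": cfg_scale,
        "width": width,
        "height": height,
        "seed": seed,
    }


# Illustrative prompt assembled from the keywords mentioned in the post.
payload = build_txt2img_payload(
    prompt="face shot of borzoi dog paladin, long thin snout, fluffy long fur",
    negative_prompt="ear, dog ears, dragonborn, watermark",
)

# The body would be POSTed to the (assumed) /sdapi/v1/txt2img endpoint.
print(json.dumps(payload, indent=2))
```

Keeping the settings in one place like this makes it easy to change a single thing at a time, for example only the sampler or only the height, and compare the results.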

Just as a treat, I switched “paladin” back to “knight”. Maybe it would give me different results outside img2img?

I got a lot of metal snout silliness, but this is the funniest one.

Back to “paladin”, I also removed some keyword smashing from the prompt (shit like “high detail” and stuff I originally copied from somewhere else).

Results got much more consistent. Don't litter your prompts!

Also told it to give me a pencil portrait, just for fun.

Not bad, but you can see the AI was struggling against the model.

And this is why I went to emaonly-safetensors-1.5. Emaonly understands what a face shot and a portrait are! It doesn’t understand the “anthropomorphic with armor” bit, though. I just got pictures of regular dogs.

Not bad, but you can see the AI was struggling against the model.

Back to D&Diffusion, I removed even more litter from the prompt. Now it contains strictly the keywords I have mentioned in this post.

blurry

For a DPM++ 2M Karras run with 50 steps, this level of blurriness is utterly unacceptable. I added some of the keywords back in. Euler a gave me this:

bad quality, but right vibe

You know what? I like this. I brought it to img2img, reduced the CFG scale from 7 to 4.5, and used the “Resize and fill” mode.

not a borzoi

The snout got shortened. The paws are awkward, but I don’t think they ever won’t be. I’m getting tired and lunch is getting cold, so I will just add “very” to the “long thin snout” part of the prompt and call it a day.

somewhat more borzoi-like

Before going to lunch, I noticed that I actually liked one of the images from earlier more. Remember him?

So I threw it into img2img with the new prompt while I went for lunch. It gave me back a lot of borzoi-face-shaped armor. I knew that meant something was wrong with my prompt. The difference between my nice dog face shots and this silliness was:

face shot of borzoi dog paladin, fluffy fur, medieval, high detail, sharp focus

face shot of borzoi dog paladin, long thin snout, fluffy long fur, medieval era, Intricate, High Detail, Sharp focus, modelshoot style, natural colors, strong shading

So - “long”, “Intricate”, “High Detail”, “modelshoot style”, “natural colors” or “strong shading”. I had a hunch it was “modelshoot style”. And, judging from the subsequent generations, I was right.
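
Spotting the culprit by eye gets tedious as prompts grow, so here is a small sketch of the same comparison done in code. It splits the two prompts above on commas and lists which keywords were dropped and which were added. The helper names `keywords` and `prompt_diff` are made up for this example, and lowercasing the keywords before comparing is my own assumption about what counts as "the same" keyword.

```python
def keywords(prompt):
    """Split a comma-separated prompt into lowercase, trimmed keywords."""
    return [part.strip().lower() for part in prompt.split(",") if part.strip()]


def prompt_diff(old, new):
    """Return (removed, added) keywords going from old to new, in prompt order."""
    old_kw, new_kw = keywords(old), keywords(new)
    removed = [k for k in old_kw if k not in new_kw]
    added = [k for k in new_kw if k not in old_kw]
    return removed, added


good = ("face shot of borzoi dog paladin, fluffy fur, medieval, "
        "high detail, sharp focus")
silly = ("face shot of borzoi dog paladin, long thin snout, fluffy long fur, "
         "medieval era, Intricate, High Detail, Sharp focus, modelshoot style, "
         "natural colors, strong shading")

removed, added = prompt_diff(good, silly)
print("removed:", removed)
print("added:  ", added)
```

Because the comparison is case-insensitive, “High Detail” and “Sharp focus” count as shared between the two prompts and do not show up in either list. The added list is the set of suspects to remove one at a time.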

I added “portrait” back as well. I didn’t like the lens effect of “sharp focus”, so I removed that keyword.

The face began to get a little silly. I noticed that the snout marks looked like a Labrador’s, so I added “labrador” to the negative prompt. It worked, in that the snout suddenly got a lot less wrinkly and longer, but I didn’t find the output aesthetically pleasing. I noticed that the base doesn’t even look that much like a borzoi: the AI tries to pull my prompt towards a more popular breed, like a poodle. We probably need a LoRA to really generate borzois. So, after an hour and 300 generations, I’ll declare that single generation from earlier the winner.

face shot of borzoi dog paladin, long thin snout, fluffy long fur, medieval era, Intricate, High Detail, Sharp focus, modelshoot style, natural colors, strong shading

Negative prompt: ear, dog ears, dragonborn

Steps: 50, Sampler: Euler a, CFG scale: 7, Seed: 245185935, Size: 512x768, Model hash: 937f4a8401, Model: D&Diffusion3.0_Protogen-fp32, Denoising strength: 0.75, Version: v1.2.1
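
If you want to reuse these settings in a script, the parameter line above can be turned into a dictionary. This is a minimal sketch that assumes the line is a flat list of `key: value` pairs separated by `, ` with no commas inside the values, which holds for this line but may not hold for every generation. `parse_infotext` is a hypothetical helper, not part of any tool.

```python
def parse_infotext(line):
    """Parse 'Key: value, Key: value, ...' into a dict of strings."""
    params = {}
    for field in line.split(", "):
        key, sep, value = field.partition(": ")
        if sep:  # skip fragments that are not key/value pairs
            params[key.strip()] = value.strip()
    return params


line = ("Steps: 50, Sampler: Euler a, CFG scale: 7, Seed: 245185935, "
        "Size: 512x768, Model hash: 937f4a8401, "
        "Model: D&Diffusion3.0_Protogen-fp32, Denoising strength: 0.75, "
        "Version: v1.2.1")

params = parse_infotext(line)

# Size is stored as WIDTHxHEIGHT; split it into two integers.
width, height = (int(n) for n in params["Size"].split("x"))

print(params["Sampler"], params["Steps"], width, height)
```

All values stay as strings, so convert the ones you need (`int(params["Steps"])`, `float(params["Denoising strength"])`) before passing them on.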

The quest for a Borzoi Knight
