• 4 Posts
  • 24 Comments
Joined 8 months ago
Cake day: July 20th, 2025

  • That's true! Since natural-language text encoders are more complex, the negative of the encoded text is rarely its opposite.

    As in, the model itself (Chroma on perchance) isn't trained to comprehend negative vectors. Lodestones (creator of Chroma) never specified, but I assume not.

    Negatives are an offshoot of training that sort of worked in CLIP-based models (SD1.5, SDXL and its variants like Pony / Illustrious) and carried over into the natural-language model releases out of habit in the community.

    It worked in CLIP models because the CLIP encoder is simple. Write 'ice cream' in CLIP and the text-encoding vector will point in roughly the same direction no matter where 'ice cream' appears in the 75-token batch.
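A toy numpy sketch of that order-insensitivity, assuming a simplified bag-of-tokens view with mean pooling (real CLIP pools through a transformer and the EOS token, so this is an illustration of the idea, not CLIP itself; the token vectors here are random placeholders):

```python
import numpy as np

# Hypothetical per-token embedding table standing in for a text encoder.
rng = np.random.default_rng(0)
vocab = {w: rng.normal(size=8) for w in ["ice", "cream", "on", "the", "beach"]}

def mean_pool(tokens):
    """Toy sentence embedding: average the token vectors."""
    return np.mean([vocab[t] for t in tokens], axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Same words, different order -> identical embedding under mean pooling,
# i.e. 'ice cream' points the same way wherever it sits in the prompt.
a = mean_pool(["ice", "cream", "on", "the", "beach"])
b = mean_pool(["on", "the", "beach", "ice", "cream"])
print(cosine(a, b))  # 1.0
```

A contextual encoder like T5 would produce different vectors for those two orderings, which is the contrast the next paragraph makes.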

    Compare that to the many different answers you can get from ChatGPT or Grok containing the word 'ice cream', and you can see how the 512-token batch encoding of the T5 in Chroma, or the Qwen encoder in Klein / Z-Image, varies drastically depending on how common words are arranged in the text.
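    For context on how negatives are actually applied at sampling time: they typically substitute the unconditional embedding in classifier-free guidance, so the sampler is pushed away from the negative prediction. A minimal numpy sketch of that guidance arithmetic (the vectors are hypothetical stand-ins for noise predictions, not real model outputs):

```python
import numpy as np

def cfg(pred_cond, pred_neg, scale):
    """Classifier-free guidance step: start from the negative-prompt
    prediction and push along (conditional - negative)."""
    return pred_neg + scale * (pred_cond - pred_neg)

# Toy 2-D 'noise predictions' for the positive and negative prompts.
pred_cond = np.array([1.0, 0.0])
pred_neg = np.array([0.0, 1.0])

guided = cfg(pred_cond, pred_neg, scale=2.0)
print(guided)  # [ 2. -1.] -- moved toward the positive, away from the negative
```

    Whether this pushing-away does anything sensible depends entirely on the encoder producing a meaningful direction for the negative text, which is the point of the comment above.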