Rebalance is a high-fidelity image generation model trained on a curated dataset comprising thousands of cosplay photographs and handpicked, high-quality real-world images. All training data was sourced exclusively from publicly accessible internet content, and the dataset explicitly excludes any NSFW material. The primary goal of Rebalance is to produce photorealistic outputs that overcome common AI artifacts—such as an oily, plastic, or overly flat appearance—delivering images with natural texture, depth, and visual authenticity.
Training was conducted in multiple stages, broadly divided into two phases:
The model was trained using two complementary caption formats: plain text and structured JSON. Each data subset employed a tailored JSON schema to guide fine-grained control during generation.
{
"caption": "...",
"image_type": "...",
"image_style": "...",
"lighting_environment": "...",
"tags_list": [...],
"brightness": number,
"brightness_name": "...",
"hpsv3_score": score,
"aesthetics": "...",
"cosplayer": "anonymous_id"
}
Note: Cosplayer names are replaced with anonymized placeholder IDs. The IDs exist solely to help the model associate multiple images of the same subject during training; no real identities are preserved.
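As a rough illustration of how a record could be serialized into the schema above, here is a minimal Python sketch. The field names come from the schema. The `anonymize` helper, its salted-hash scheme, the `cosplayer_name` input key, and the `cosplayer_` prefix are all assumptions made for this example and are not taken from the actual training pipeline.

```python
import hashlib
import json

# Field names copied from the schema above, in the same order.
SCHEMA_FIELDS = (
    "caption", "image_type", "image_style", "lighting_environment",
    "tags_list", "brightness", "brightness_name", "hpsv3_score", "aesthetics",
)


def anonymize(name: str, salt: str = "example-salt") -> str:
    """Map a name to a stable placeholder ID (hypothetical scheme)."""
    digest = hashlib.sha256((salt + name).encode("utf-8")).hexdigest()
    return f"cosplayer_{digest[:8]}"


def build_caption(record: dict) -> str:
    """Serialize one record into a JSON caption string.

    `record` is assumed to hold every schema field plus a raw
    `cosplayer_name`, which is replaced by a placeholder ID.
    """
    payload = {field: record[field] for field in SCHEMA_FIELDS}
    payload["cosplayer"] = anonymize(record["cosplayer_name"])
    return json.dumps(payload, ensure_ascii=False)


# Illustrative values only; none of these come from the real dataset.
example_record = {
    "caption": "A cosplayer in silver armor standing on a stone bridge.",
    "image_type": "photograph",
    "image_style": "cosplay",
    "lighting_environment": "overcast daylight",
    "tags_list": ["armor", "bridge", "outdoor"],
    "brightness": 0.52,
    "brightness_name": "medium",
    "hpsv3_score": 9.1,
    "aesthetics": "high",
    "cosplayer_name": "Example Name",
}

caption_json = build_caption(example_record)
```

Because the same name always hashes to the same ID, two images of one subject share a `cosplayer` value while the original name never appears in the caption.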
{
"subject": "...",
"foreground": "...",
"midground": "...",
"background": "...",
"composition": "...",
"visual_guidance": "...",
"color_tone": "...",
"lighting_mood": "...",
"caption": "..."
}
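For the scene-layout schema above, a small helper can keep hand-written prompts consistent with the field names the model saw in training. The sketch below is only an assumption about how one might assemble such a prompt: `make_scene_prompt` is a hypothetical helper, and whether the model expects every field to be present, or tolerates omitted ones, is not stated in this document.

```python
import json

# Field names copied from the schema above, in the same order.
SCENE_FIELDS = (
    "subject", "foreground", "midground", "background", "composition",
    "visual_guidance", "color_tone", "lighting_mood", "caption",
)


def make_scene_prompt(**fields: str) -> str:
    """Build a JSON prompt string from keyword arguments.

    Unknown keys are rejected so typos do not silently produce a
    field the model never saw. Omitted fields are simply left out.
    """
    unknown = set(fields) - set(SCENE_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # Emit keys in schema order regardless of argument order.
    ordered = {key: fields[key] for key in SCENE_FIELDS if key in fields}
    return json.dumps(ordered, ensure_ascii=False, indent=2)


# Illustrative values only.
prompt = make_scene_prompt(
    lighting_mood="soft morning light",
    subject="a woman reading at a cafe table",
    background="rain-streaked window with blurred street",
)
```

The resulting string can then be passed wherever a text prompt is accepted; the plain-text caption format remains an alternative when this level of control is not needed.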
In addition to structured JSON, every image was also paired with a plain-text caption, and training used randomized caption dropout (i.e., some training steps used no caption or only partial metadata). This dual approach is intended to improve both controllability and generalization.
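The caption mixing and dropout described above can be sketched as a per-step sampling function. This is a minimal illustration, not the actual T2ITrainer logic: the function name `pick_caption` and the probabilities `p_empty`, `p_partial`, and `p_plain` are assumptions chosen for the example, since the real ratios are not given here.

```python
import json
import random


def pick_caption(
    plain: str,
    structured: dict,
    rng: random.Random,
    p_empty: float = 0.10,    # assumed rate of fully dropped captions
    p_partial: float = 0.20,  # assumed rate of partial-metadata captions
    p_plain: float = 0.35,    # assumed rate of plain-text captions
) -> str:
    """Choose the conditioning text for one training step."""
    roll = rng.random()
    if roll < p_empty:
        return ""  # caption dropout: train unconditionally
    keys = list(structured)
    if roll < p_empty + p_partial and keys:
        # Keep a random, non-empty subset of fields (never all of them
        # when there is more than one field).
        count = rng.randint(1, max(1, len(keys) - 1))
        kept = set(rng.sample(keys, k=count))
        partial = {k: structured[k] for k in keys if k in kept}
        return json.dumps(partial, ensure_ascii=False)
    if roll < p_empty + p_partial + p_plain:
        return plain
    # Remaining probability mass: the full structured JSON caption.
    return json.dumps(structured, ensure_ascii=False)


# Illustrative values only.
rng = random.Random(0)
meta = {"subject": "a cat", "background": "a sofa", "color_tone": "warm"}
samples = [pick_caption("A cat on a sofa.", meta, rng) for _ in range(2000)]
```

Seeing empty and partial captions pushes the model to work without full conditioning, while the plain and JSON variants expose it to both prompt styles.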
All training was performed using lrzjason/T2ITrainer, a customized extension of the Hugging Face Diffusers DreamBooth training script. The framework supports advanced text-to-image architectures, including Qwen and Qwen-Edit (2509).
This project builds upon several prior tools developed to enhance controllability and efficiency in diffusion-based image generation and editing:
These tools collectively establish a robust ecosystem for training, editing, and deploying personalized diffusion models with high precision and flexibility.
Feel free to reach out via any of the following channels:
866612947 | fkdeai