Kling Omni Image O1
Kling Omni Image O1 is Kuaishou's multi-modal image generation model with MVL technology. It accepts up to 10 reference images to keep features consistent across outputs, and supports precise detail editing (add, remove, or modify elements), style control, and series content creation. It is suited to IP character design, comic panels, and brand merchandise
Example Output
Prompt
"A cute cartoon character with blue hair, wearing a red jacket, standing in a fantasy forest. Add magical sparkles around the character"
Generated Result
Input Parameters
Text prompt for image generation with editing instructions
Reference images for element, scene, and style consistency (maximum 10 images)
Output image aspect ratio
Image generation resolution
Number of images to generate
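The five inputs above can be sketched as a small request object. This is only an illustration under stated assumptions: the page does not show parameter names, defaults, or allowed values, so the field names (`prompt`, `reference_images`, `aspect_ratio`, `resolution`, `num_images`) and the default values are hypothetical. The only constraint taken from this page is the limit of 10 reference images. The sketch builds and validates a JSON payload and makes no network call.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

# Limit stated on this page: at most 10 reference images.
MAX_REFERENCE_IMAGES = 10


@dataclass
class OmniImageRequest:
    """Hypothetical payload; field names and defaults are assumptions,
    not the service's documented API."""

    prompt: str                                                 # text prompt, may include editing instructions
    reference_images: List[str] = field(default_factory=list)  # image URLs or paths
    aspect_ratio: str = "1:1"                                   # assumed placeholder value
    resolution: str = "1K"                                      # assumed placeholder value
    num_images: int = 1                                         # number of images to generate

    def validate(self) -> None:
        """Check the constraints that can be inferred from the page."""
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if len(self.reference_images) > MAX_REFERENCE_IMAGES:
            raise ValueError(
                f"at most {MAX_REFERENCE_IMAGES} reference images are allowed, "
                f"got {len(self.reference_images)}"
            )
        if self.num_images < 1:
            raise ValueError("num_images must be at least 1")

    def to_json(self) -> str:
        """Validate, then serialize the request to a JSON string."""
        self.validate()
        return json.dumps(asdict(self))


# Usage: the example prompt from this page plus one placeholder reference image.
request = OmniImageRequest(
    prompt=(
        "A cute cartoon character with blue hair, wearing a red jacket, "
        "standing in a fantasy forest. Add magical sparkles around the character"
    ),
    reference_images=["https://example.com/character.png"],  # placeholder URL
)
payload = request.to_json()
```

Validating on the client side before sending keeps an oversized reference list from reaching the service at all. The real endpoint, authentication, and accepted values would need to come from the provider's own API documentation.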
More Image Generation Models
Stable Cascade
Efficient image generation in a smaller latent space. Two-stage generation with separate guidance scales. Generates 1-4 images with improved quality and efficiency over standard diffusion models
Z-Image Turbo LoRA
Z-Image-Turbo LoRA (6B) enables ultra-fast text-to-image generation with external LoRA support. Generate photorealistic images with sub-second latency while applying up to 3 LoRAs for custom styles
Nano Banana Pro Text-to-Image
Google's state-of-the-art image generation model (Nano Banana 2) with exceptional realism and typography capabilities. Generate stunning images up to 4K resolution
DeepSeek Janus-Pro
DeepSeek's unified multimodal model with an autoregressive framework. Combines image understanding and generation. Temperature control for creativity. Generate 1-16 images in parallel with CFG guidance
FLUX.1 SRPO Text-to-Image
FLUX.1 SRPO is a 12-billion-parameter flow transformer that generates high-quality images from text, with enhanced aesthetic quality. Suitable for personal and commercial use
FLUX 2 Pro
Text-to-image generation with FLUX.2 [pro] from Black Forest Labs. Optimized for maximum quality, exceptional photorealism and artistic images
Hidream I1 Full
HiDream-I1 full is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds
Qwen Image
A 20B MMDiT model for next-gen text-to-image generation. SOTA text rendering with bilingual support, excels at creating stunning graphic posters with native text
OmniGen V2
Advanced unified multimodal model. Image editing, personalization, virtual try-on, and multi-person generation. Up to 3 input images. Separate text and image guidance. Euler/DPMSolver schedulers. CFG range control