Audio-driven avatar with custom image. Creates super-realistic, lip-synchronized videos with natural dynamics using your own portrait image.
Fill in the parameters below and click "Generate" to try this model.
Image to animate
Audio file to drive the avatar
Text prompt to guide video generation
Negative prompt to avoid unwanted elements
Video resolution (480p: 1 unit/sec; 720p: 4 units/sec)
Number of video segments (the first is about 5.8 s; each additional segment adds 5 s)
Number of inference steps
Text guidance scale for classifier-free guidance
Audio guidance scale (higher values exaggerate mouth movement)
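The resolution and segment parameters above imply some simple arithmetic for clip length and usage. The sketch below is only an illustration: it assumes that units are charged linearly per second of output video, and the function and constant names are hypothetical, not part of any documented API. The 5.8 s figure is approximate, as stated in the parameter description.

```python
# Rates per second of video, taken from the resolution parameter description.
UNITS_PER_SECOND = {"480p": 1, "720p": 4}

# Segment lengths from the segments parameter description.
FIRST_SEGMENT_SECONDS = 5.8   # approximate ("~5.8s")
EXTRA_SEGMENT_SECONDS = 5.0   # each additional segment


def estimate_duration(segments: int) -> float:
    """Approximate video length in seconds for a given segment count."""
    if segments < 1:
        raise ValueError("segments must be at least 1")
    return FIRST_SEGMENT_SECONDS + EXTRA_SEGMENT_SECONDS * (segments - 1)


def estimate_units(resolution: str, segments: int) -> float:
    """Approximate units used, ASSUMING units scale with output duration."""
    if resolution not in UNITS_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    return UNITS_PER_SECOND[resolution] * estimate_duration(segments)


if __name__ == "__main__":
    for res in ("480p", "720p"):
        for n in (1, 2, 3):
            print(f"{res}, {n} segment(s): "
                  f"{estimate_duration(n):.1f} s, "
                  f"{estimate_units(res, n):.1f} units")
```

Under these assumptions, three segments give roughly 15.8 s of video, and choosing 720p over 480p multiplies the estimate by four.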
Turn up to 4 images into video clips with enhanced quality.
Animate images into 1080p HD videos with professional-quality motion.
Combine multiple images into a single 5-second video with creative or precise blending.
Generate videos with consistent characters using 1 to 4 reference images.
Sync any image with audio to create talking avatar videos with humans, animals, or cartoon characters.
Turn images into talking avatars with natural lip-sync and immersive audio from text prompts.
Turn images into 1080p videos with adjustable motion intensity.
Animate images into cinematic videos with dialogue and sound effects.
Turn text and images into talking avatar videos with auto lip-sync and natural voice generation.