HUNYUAN IMAGE
EVOLUTION OF IMAGE GENERATION
Generate images from text prompts

CREATIVE SOCIAL PORTRAIT

PROFESSIONAL MOBILE PROFILE

STYLIZED MOBILE CONTENT
Hunyuan Image is a state-of-the-art text-to-image generation model, offered as part of Hunyuan Image 3.0, that turns written prompts into matching visual content. It interprets specific natural-language prompts and renders detailed scenes that convey the imagery or message in the text. The model also exposes parameters for customizing and finely controlling the generated images.
The input modality for Hunyuan Image is text. Users provide a prompt that describes the desired content, such as "200mm telephoto through crowd gaps; subject laughing, candid; creamy background compression, color pop from a single bold garment, catchlight in eyes." The model analyzes the input and produces visual outputs as images, alongside structured JSON output containing links or data references to the generated images. Outputs can be formatted as PNG or JPEG files, depending on the user's specification.
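As a rough illustration of consuming the structured JSON output, the sketch below pulls image links out of a response body. The response shape (`images` as a list of objects with a `url` field) and the function name `extract_image_urls` are assumptions made for this example; the text above only says the JSON contains links or data references.

```python
import json
from typing import List


def extract_image_urls(response_text: str) -> List[str]:
    """Collect image links from a JSON response body.

    Assumed (hypothetical) shape: {"images": [{"url": "..."}, ...]}.
    The real field names may differ.
    """
    body = json.loads(response_text)
    images = body.get("images", [])
    # Skip entries that are not objects or that carry no link.
    return [item["url"] for item in images
            if isinstance(item, dict) and "url" in item]
```

If the actual response uses different keys, only the two string literals need to change.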
Several configurable parameters allow fine-tuning of results. Users can set the image size with an explicit width and height (from 1 to 14142 pixels for each dimension) or by selecting a predefined aspect ratio such as 'square_hd', 'portrait_4_3', 'portrait_16_9', 'landscape_4_3', or 'landscape_16_9'. This makes it possible to generate images suited to varied applications, including publications, presentations, creative design, and social media.
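The size rules above can be checked on the client before a request is sent. The following is a minimal sketch, assuming the size is passed either as a preset name or as a `{"width": ..., "height": ...}` mapping; that shape and the helper name `resolve_image_size` are hypothetical, and the preset set contains only the names quoted above.

```python
from typing import Union

# Preset names quoted in the text; the service may accept others.
KNOWN_PRESETS = {
    "square_hd", "portrait_4_3", "portrait_16_9",
    "landscape_4_3", "landscape_16_9",
}
MIN_DIM, MAX_DIM = 1, 14142  # per-dimension pixel bounds stated in the text


def resolve_image_size(size: Union[str, dict]) -> Union[str, dict]:
    """Validate a preset name or an explicit width/height pair."""
    if isinstance(size, str):
        if size not in KNOWN_PRESETS:
            raise ValueError(f"unknown preset: {size!r}")
        return size
    result = {}
    for name in ("width", "height"):
        value = size.get(name)
        # bool is a subclass of int, so reject it explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not MIN_DIM <= value <= MAX_DIM:
            raise ValueError(f"{name} must be an integer in [{MIN_DIM}, {MAX_DIM}]")
        result[name] = value
    return result
```

For example, `resolve_image_size("landscape_16_9")` returns the preset unchanged, while a width of 0 raises `ValueError`.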
Hunyuan Image supports prompt expansion: when enabled, a large language model refines and elaborates on the user's prompt while preserving its intended meaning. This helps users obtain richer, more detailed images from their original text. The model also includes a safety checker for screening out inappropriate content; it can be toggled on or off and is enabled by default.
Control over the degree of adherence to the input prompt is available via the 'guidance_scale' parameter. Higher values result in images that more strictly follow user instructions, with the permissible range spanning from 1 to 20. The number of inference steps, which influences the granularity and detail of denoising during generation, is adjustable between 1 and 50, with a default of 28 steps. This enables a balance between image detail and speed, depending on user priorities.
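The ranges above translate into a small validation step. This sketch assumes the step count is sent under the name `num_inference_steps` (a hypothetical name; only `guidance_scale` is named in the text) and omits `guidance_scale` when it is not given, since no default for it is stated.

```python
from typing import Optional


def sampling_params(guidance_scale: Optional[float] = None,
                    num_inference_steps: int = 28) -> dict:
    """Build the sampling part of a request, enforcing the documented ranges."""
    params = {}
    if guidance_scale is not None:
        if isinstance(guidance_scale, bool) or not isinstance(guidance_scale, (int, float)):
            raise ValueError("guidance_scale must be a number")
        if not 1 <= guidance_scale <= 20:
            raise ValueError("guidance_scale must be between 1 and 20")
        params["guidance_scale"] = float(guidance_scale)
    if isinstance(num_inference_steps, bool) or not isinstance(num_inference_steps, int):
        raise ValueError("num_inference_steps must be an integer")
    if not 1 <= num_inference_steps <= 50:
        raise ValueError("num_inference_steps must be between 1 and 50")
    # "num_inference_steps" is an assumed field name; 28 is the documented default.
    params["num_inference_steps"] = num_inference_steps
    return params
```

Fewer steps trade detail for speed, so a quick draft might pass a low step count and a final render a value nearer the upper bound.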
A single request can generate between 1 and 4 images from one prompt. Users can set a fixed seed for deterministic, reproducible results, or let the model pick a random seed automatically. Negative prompts are also supported, allowing users to tell the model which visual artifacts or content to avoid (for example: "blurry, low quality, watermark, signature").
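Putting these options together, a request body might be assembled as in the sketch below. The field names `prompt`, `num_images`, `seed`, and `negative_prompt`, and the default of one image, are assumptions for illustration; the text gives the behavior and the 1 to 4 range but not the exact names or default.

```python
from typing import Optional


def build_request(prompt: str,
                  num_images: int = 1,
                  seed: Optional[int] = None,
                  negative_prompt: Optional[str] = None) -> dict:
    """Assemble a request body (hypothetical field names)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 1 <= num_images <= 4:
        raise ValueError("num_images must be between 1 and 4")
    payload = {"prompt": prompt, "num_images": num_images}
    if seed is not None:
        # A fixed seed makes the generation reproducible.
        payload["seed"] = seed
    if negative_prompt:
        payload["negative_prompt"] = negative_prompt
    return payload
```

Leaving `seed` out of the payload corresponds to letting the model choose a random seed.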
Image output is available for download as direct links in PNG or JPEG format. There is also a 'sync_mode' option, which, when enabled, delivers the generated media as a data URI and omits it from the request history.
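When 'sync_mode' returns the image as a data URI, the client has to decode it rather than download a link. A minimal sketch using only the Python standard library follows; the helper name `decode_data_uri` is hypothetical, and how the URI is embedded in the response is not specified in the text.

```python
import base64
from typing import Tuple
from urllib.parse import unquote_to_bytes


def decode_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a data URI into its media type and raw bytes."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separator")
    parts = header.split(";")
    mime = parts[0] or "text/plain"  # default media type for data URIs
    if "base64" in parts[1:]:
        data = base64.b64decode(payload)
    else:
        # Non-base64 payloads are percent-encoded.
        data = unquote_to_bytes(payload)
    return mime, data
```

The returned bytes can be written straight to a `.png` or `.jpeg` file, chosen according to the media type.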
The available documentation does not report performance benchmarks, known limitations, or intended user profiles. It presents the model as a commercial solution for generating visual content that matches the intent and detail of the supplied text, and it gives clear guidance on customizing output and on the safety controls available in deployment.
Generate with a state-of-the-art image model
A woman kneeling in darkness, illuminated by a warm, radiant beam of light emerging from her raised hand.
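Describe your scenario
Enter a prompt describing the image you want, including details of style, lighting, and composition
AI generates
The model understands the scene's physics, rules, lighting, and emotional intent
Start sharing
Click to create the final output and download professional-quality images
Beyond the prompt: a new level of control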
CINEMATIC PRESENTATION VISUAL
Exploits Hunyuan's strength in wide, detailed environments, making it apt for presentation backdrops or science fiction worldbuilding.

IMMERSIVE STORY ILLUSTRATION
Showcases the model's proficiency in rendering dynamic, story-driven action scenes with intricate environmental detail for books, games, or cinematic pre-visualization.

PREMIUM PRODUCT SHOWCASE
Highlights technical fidelity and precise lighting control—ideal for advertising and brand visuals needing immaculate realism and composition.

Compare with similar models
“High-end studio product photography of premium wireless over-ear headphones in matte black finish. Dramatic three-point lighting with soft key light from upper left, rim light highlighting the ear cup contours, and subtle fill. Clean white seamless backdrop with soft gradient. Sharp focus on texture details of the leather headband and brushed metal accents. Professional advertising quality, 8K resolution, photorealistic rendering.”

Experience perfection with Hunyuan Image
Switch to inference-guided synthesis today
Frequently asked questions
Similar models

Vidu
Prompt-driven creative image generation
0.2 credits

Longcat Image
Fast, multilingual, photorealistic image generation
1.6 credits

Flux 2 Pro
Professional sequential image editing tool
0.2 credits

Wan v2.6 Text to Image
Flexible multilingual image generation model
0.3 credits

Nano Banana Pro
State-of-the-art image generation
0.15 credits

Imagineart 1.5 Preview
Superior realism and readable text
0.2 credits

Bytedance
Unified image generation and editing
1 credit

Reve
Detailed images, accurate text rendering
0.4 credits

Piflow
Fast, high-quality image generation
1.2 credits