INTRODUCING HUNYUAN IMAGE

HUNYUAN IMAGE

EVOLUTION OF IMAGE GENERATION

Generate images from text prompts

CREATIVE SOCIAL PORTRAIT

PROFESSIONAL MOBILE PROFILE

STYLIZED MOBILE CONTENT

Hunyuan Image is a state-of-the-art text-to-image generation model offered as part of Hunyuan Image 3.0. It turns written prompts into images that closely match the text, interpreting specific natural-language descriptions and rendering them as detailed scenes. The model builds on recent advances in generative image technology and supports customization and fine-grained control over the resulting images.

The input modality for Hunyuan Image is text. Users provide a prompt that describes the desired content, such as "200mm telephoto through crowd gaps; subject laughing, candid; creamy background compression, color pop from a single bold garment, catchlight in eyes." The model analyzes the input and produces visual outputs as images, alongside structured JSON output containing links or data references to the generated images. Outputs can be formatted as PNG or JPEG files, depending on the user's specification.

Several configurable parameters allow fine-tuning of results. Users can set the image size either with an explicit width and height (each between 1 and 14142 pixels) or by selecting one of the predefined size presets: 'square_hd', 'portrait_4_3', 'portrait_16_9', 'landscape_4_3', and 'landscape_16_9'. This makes it possible to generate images suited to varied applications, including publications, presentations, creative design, and social media.
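The size rules above can be checked on the client before a request is sent. The following is a minimal sketch, not the service's own validation: the helper name resolve_image_size and the returned shapes (a preset string, or a width/height mapping) are assumptions made for illustration. Only the preset names and the 1 to 14142 pixel bounds come from the description above.

```python
from typing import Dict, Optional, Union

# Preset names as listed in the description above.
PRESETS = {
    "square_hd",
    "portrait_4_3",
    "portrait_16_9",
    "landscape_4_3",
    "landscape_16_9",
}
MIN_DIM, MAX_DIM = 1, 14142  # per-dimension pixel bounds stated above


def resolve_image_size(
    preset: Optional[str] = None,
    width: Optional[int] = None,
    height: Optional[int] = None,
) -> Union[str, Dict[str, int]]:
    """Return a preset name or an explicit width/height mapping.

    Hypothetical helper: the return shape is an assumption, not a
    documented request format.
    """
    if preset is not None:
        if width is not None or height is not None:
            raise ValueError("pass either a preset or width/height, not both")
        if preset not in PRESETS:
            raise ValueError(f"unknown preset: {preset!r}")
        return preset
    if width is None or height is None:
        raise ValueError("width and height must both be given")
    for name, value in (("width", width), ("height", height)):
        if not MIN_DIM <= value <= MAX_DIM:
            raise ValueError(f"{name} must be between {MIN_DIM} and {MAX_DIM}")
    return {"width": width, "height": height}
```

For example, resolve_image_size(preset="landscape_16_9") returns the preset unchanged, while resolve_image_size(width=0, height=512) raises ValueError because 0 is below the stated minimum.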

Hunyuan Image supports prompt expansion. When it is enabled, a large language model refines and elaborates on the user's prompt while preserving its intended meaning, which helps users get richer, more detailed images from their original text. The model also includes a safety checker that screens for inappropriate content. It can be toggled on or off and is enabled by default.

The 'guidance_scale' parameter controls how closely the image adheres to the input prompt. It accepts values from 1 to 20, and higher values produce images that follow the prompt more strictly. The number of inference steps, which determines how many denoising passes are run during generation, is adjustable between 1 and 50 and defaults to 28. Fewer steps favor speed and more steps favor detail, so users can trade one against the other depending on their priorities.
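As a rough illustration of these two settings, the sketch below validates them against the ranges given above. The function name sampling_options and the key num_inference_steps are assumptions (only 'guidance_scale' is named in the text). No default guidance scale is stated, so the sketch requires the caller to supply one.

```python
from typing import Dict, Union

GUIDANCE_MIN, GUIDANCE_MAX = 1.0, 20.0  # range stated above
STEPS_MIN, STEPS_MAX = 1, 50            # range stated above
DEFAULT_STEPS = 28                      # default stated above


def sampling_options(
    guidance_scale: float,
    num_inference_steps: int = DEFAULT_STEPS,
) -> Dict[str, Union[int, float]]:
    """Validate the two sampling settings and return them as a mapping.

    "num_inference_steps" is an assumed key name used for illustration.
    """
    if not GUIDANCE_MIN <= guidance_scale <= GUIDANCE_MAX:
        raise ValueError(
            f"guidance_scale must be between {GUIDANCE_MIN} and {GUIDANCE_MAX}"
        )
    if not STEPS_MIN <= num_inference_steps <= STEPS_MAX:
        raise ValueError(
            f"num_inference_steps must be between {STEPS_MIN} and {STEPS_MAX}"
        )
    return {
        "guidance_scale": guidance_scale,
        "num_inference_steps": num_inference_steps,
    }
```

A quick draft might use a low step count, such as sampling_options(7.5, num_inference_steps=12), and a final render might raise it toward 50. Both values are examples rather than recommendations.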

A single request can generate between 1 and 4 images from one prompt. Users can fix the seed to get deterministic, reproducible results, or leave it unset and let the model choose a random seed automatically. Negative prompts are also supported, allowing users to tell the model which visual artifacts or content to avoid (for example: "blurry, low quality, watermark, signature").
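To show how these options might fit together, here is a hedged sketch that assembles a request body as a plain dictionary. The field names num_images, seed, and negative_prompt, the default of one image, and the helper build_request are all assumptions chosen for illustration. Only the 1 to 4 image limit and the optional seed and negative prompt are taken from the text.

```python
import json
from typing import Any, Dict, Optional

MAX_IMAGES = 4  # upper bound per request stated above


def build_request(
    prompt: str,
    num_images: int = 1,  # assumed default, not stated in the text
    seed: Optional[int] = None,
    negative_prompt: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a hypothetical request body; field names are illustrative."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 1 <= num_images <= MAX_IMAGES:
        raise ValueError(f"num_images must be between 1 and {MAX_IMAGES}")
    payload: Dict[str, Any] = {"prompt": prompt, "num_images": num_images}
    if seed is not None:
        # A fixed seed is what makes repeated requests reproducible.
        payload["seed"] = seed
    if negative_prompt:
        payload["negative_prompt"] = negative_prompt
    return payload


if __name__ == "__main__":
    body = build_request(
        "subject laughing, candid, creamy background compression",
        num_images=2,
        seed=1234,
        negative_prompt="blurry, low quality, watermark, signature",
    )
    print(json.dumps(body, indent=2))
```

Omitting seed leaves the key out of the payload entirely, which matches the described behavior of letting the model pick a seed on its own.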

Image output is available for download as direct links in PNG or JPEG format. There is also a 'sync_mode' option, which, when enabled, delivers the generated media as a data URI and omits it from the request history.
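When 'sync_mode' returns a data URI instead of a link, the image bytes have to be decoded on the client. The sketch below shows one way to do that using only the Python standard library. It assumes the URI follows the usual data: scheme layout (media type, optional ";base64" flag, comma, payload). The text does not say how the service encodes its URIs, so treat that layout as an assumption.

```python
import base64
from typing import Tuple
from urllib.parse import unquote_to_bytes


def decode_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a data URI into (media_type, raw_bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, data = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("malformed data URI: missing comma")
    parts = header.split(";")
    media_type = parts[0] or "text/plain"  # default type for the data: scheme
    if "base64" in parts[1:]:
        return media_type, base64.b64decode(data)
    # Without the base64 flag, the payload is percent-encoded.
    return media_type, unquote_to_bytes(data)


if __name__ == "__main__":
    # Build a tiny stand-in URI from the 8-byte PNG signature.
    signature = b"\x89PNG\r\n\x1a\n"
    uri = "data:image/png;base64," + base64.b64encode(signature).decode("ascii")
    media_type, raw = decode_data_uri(uri)
    print(media_type, len(raw))
```

The decoded bytes can then be written to a .png or .jpg file according to the returned media type.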

The available documentation does not state performance benchmarks, known limitations, or intended user types. It does present the model as a commercial solution for generating images that match the intent and detail of the supplied text, and it gives clear guidance on customizing output and on the safety controls available in deployment.

Create with the most advanced image model

A woman kneeling in darkness, illuminated by a warm, radiant beam of light emerging from her raised hand.

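Step 1

Write your scenario

Enter a prompt that describes the image you want, including style, lighting, and composition details

Step 2

The AI generates

The model understands the physics, lighting, and emotional intent of the scene

Step 3

Start sharing

Click to generate the final output and download production-grade images

Beyond the prompt: a new level of control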

CINEMATIC PRESENTATION VISUAL

Exploits Hunyuan's strength in wide, detailed environments, making it apt for presentation backdrops or science fiction worldbuilding.

IMMERSIVE STORY ILLUSTRATION

Showcases the model's proficiency in rendering dynamic, story-driven action scenes with intricate environmental detail for books, games, or cinematic pre-visualization.

PREMIUM PRODUCT SHOWCASE

Highlights technical fidelity and precise lighting control—ideal for advertising and brand visuals needing immaculate realism and composition.

Compare with similar models

High-end studio product photography of premium wireless over-ear headphones in matte black finish. Dramatic three-point lighting with soft key light from upper left, rim light highlighting the ear cup contours, and subtle fill. Clean white seamless backdrop with soft gradient. Sharp focus on texture details of the leather headband and brushed metal accents. Professional advertising quality, 8K resolution, photorealistic rendering.

The wait is finally over

Experience perfection with Hunyuan Image

Switch to reasoning-based synthesis today

Frequently asked questions

Hunyuan Image accepts textual prompts as input, which describe the desired visual content to be generated.