OpenAI Brings ‘Images 2.0’ to ChatGPT With Improved Realism and Reasoning

NEW DELHI — OpenAI has rolled out “Images 2.0,” a next-generation image generation model now integrated into ChatGPT, promising more realistic visuals and improved ability to follow complex instructions.

The company said the updated model significantly enhances how accurately it places objects, renders detailed elements, and handles complex inputs such as dense text, user interfaces, and multilingual content. It also supports flexible aspect ratios, allowing users to create visuals suited for formats ranging from social media posts to presentations.

A key upgrade is the addition of “thinking” capabilities. When enabled, the model can use web search for real-time information, generate multiple distinct images from a single prompt, and verify outputs for accuracy and consistency.

OpenAI said the improvements are designed to help users move from initial ideas to finished visual assets with less manual effort.

The model also delivers stronger performance across languages, with improved rendering of non-Latin scripts including Hindi, Japanese, Chinese, Korean, and Bengali, making it more accessible for global users.

In terms of visual quality, Images 2.0 offers enhanced realism and stylistic accuracy across formats such as photography, cinematic stills, manga, and pixel art, with better handling of lighting, textures, and fine details.

The company highlighted a wide range of use cases, including UI mockups, magazine layouts, infographics, handwritten notes, comics, advertisements, and cinematic visuals. The model also supports design workflows across platforms such as Canva, Figma, and Adobe.

Developers can access the model through the “gpt-image-2” API, enabling integration into applications for design, marketing, education, and content creation. The tool is also available within ChatGPT and Codex.
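For developers, a request to the model might look something like the sketch below. It is only an illustration: the model name "gpt-image-2" comes from the announcement, but the endpoint and the request fields (prompt, size, n) are assumed to mirror OpenAI's existing Images API and have not been confirmed for the new model. The helper name build_image_request is hypothetical, and the code only builds the request body without sending anything.

```python
import json

# Assumption: the new model is served from the same endpoint as OpenAI's
# existing Images API. The announcement does not confirm this.
ASSUMED_ENDPOINT = "https://api.openai.com/v1/images/generations"

# The only value taken from the announcement itself.
MODEL_NAME = "gpt-image-2"


def build_image_request(prompt, size="1024x1024", n=1):
    """Build a JSON-serializable request body (hypothetical helper).

    The field names (prompt, size, n) are assumed to match the existing
    Images API; the accepted sizes for the new model are not known here.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if n < 1:
        raise ValueError("n must be at least 1")
    return {"model": MODEL_NAME, "prompt": prompt, "size": size, "n": n}


if __name__ == "__main__":
    # A portrait-oriented request, reflecting the flexible aspect ratios
    # mentioned in the announcement (the exact size string is an assumption).
    body = build_image_request(
        "A magazine cover with a Hindi headline", size="1024x1536"
    )
    print("POST", ASSUMED_ENDPOINT)
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

In practice the body would be sent with an API key, and the supported sizes and quality settings should be checked against OpenAI's own documentation, since pricing depends on both.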

OpenAI noted that while the model represents a significant step forward, it still has limitations, particularly in rendering highly complex spatial arrangements or extremely detailed repetitive patterns. Outputs such as diagrams may still require human review.

The company said it has built in multiple safety layers, including prompt- and image-level checks, along with provenance tools such as metadata tagging and watermarking to help prevent harmful or misleading content.

The updated image model is now available, with advanced features reserved for paying users. Pricing for API access varies depending on image quality and resolution. (Source: IANS)