GPT Image 2 API Guide
This guide describes how to call gpt-image-2 through sub2api or any OpenAI-compatible gateway.
Default examples use:
BASE_URL=https://claude.omniclaw.store/v1
API_KEY=<sub2api API key generated from the /keys page>
Do not use ChatGPT OAuth tokens from .codex/auth.json as API keys.
Quick Summary
- Direct image generation: call `POST /v1/images/generations` with `model: "gpt-image-2"`.
- Image editing: call `POST /v1/images/edits` with multipart `image[]` files and an optional `mask`.
- Agent/Codex workflows: keep the main model as a text/agent model such as `gpt-5.5`, then call image generation through the Responses API `image_generation` tool.
- Do not use `gpt-image-2` as the Codex main model.
- `gpt-image-2` normally returns base64 image data at `data[0].b64_json`.
- `3840x2160` 4K output works but is high-latency and high-cost; use 180-300 second timeouts for production.
Official Capability Summary
gpt-image-2 is an image generation and editing model with text input, image input, and image output support.
Model aliases:
gpt-image-2
gpt-image-2-2026-04-21
Supported API surfaces:
/v1/images/generations
/v1/images/edits
/v1/responses # via image_generation tool
Official references:
- https://developers.openai.com/api/docs/models/gpt-image-2
- https://developers.openai.com/api/docs/guides/image-generation
- https://developers.openai.com/api/reference/resources/images
Authentication
export BASE_URL="https://claude.omniclaw.store/v1"
export API_KEY="sk-..."
JSON requests require:
Authorization: Bearer $API_KEY
Content-Type: application/json
For multipart image edits, let curl -F or the SDK set Content-Type.
Image Generation
Minimal Request
curl -sS "$BASE_URL/images/generations" \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-image-2",
"prompt": "A compact Apple-style dashboard UI, clean white background",
"size": "1024x1024",
"quality": "medium",
"output_format": "png",
"n": 1
}' > image.json
Decode the response:
jq -r '.data[0].b64_json' image.json | base64 --decode > image.png
4K Request
curl -sS "$BASE_URL/images/generations" \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
--max-time 300 \
-d '{
"model": "gpt-image-2",
"prompt": "A modern product poster, cinematic lighting, premium realistic photography",
"size": "3840x2160",
"quality": "medium",
"output_format": "png",
"n": 1
}' > image-4k.json
Production recommendation: first validate prompts at 1024x1024 or 1536x1024, then rerun the final prompt at 3840x2160. Combining 4K size with high quality can be slow and expensive.
Generation Parameters
| Parameter | Type | Recommended value | Notes |
|---|---|---|---|
| `model` | string | `gpt-image-2` | Required. The snapshot `gpt-image-2-2026-04-21` is also valid. |
| `prompt` | string | detailed natural language | Required. Include subject, environment, camera, style, lighting, and constraints. |
| `n` | number | `1` | Number of images. Prefer single-image requests for retry and billing attribution. |
| `size` | string | `1024x1024`, `1536x1024`, `3840x2160` | Flexible sizes are supported when they satisfy the model constraints. |
| `quality` | string | `low`, `medium`, `high`, `auto` | Use `low` for drafts, `medium` for normal output, `high` for final assets. |
| `output_format` | string | `png`, `jpeg`, `webp` | Default is usually `png`; use `jpeg` for latency-sensitive outputs. |
| `output_compression` | number | `0-100` | Only applies to `jpeg` and `webp`. |
| `background` | string | `auto`, `opaque` | `gpt-image-2` currently does not support `transparent`. |
| `moderation` | string | `auto`, `low` | Adjusts filtering level but does not bypass safety policy. |
| `stream` | boolean | `false` | Enables SSE image streaming. |
| `partial_images` | number | `0-3` | Streaming only; partial images increase output token cost. |
| `user` | string | end-user ID | Useful for audit and abuse monitoring. |
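The parameter rules in this table can be checked on the client before a request is sent. The following is a minimal sketch, not part of sub2api or the OpenAI SDK: the function name `build_generation_payload` and its defaults are hypothetical, and it only enforces the constraints stated in the table (allowed values, `output_compression` limited to `jpeg`/`webp`, no `transparent` background, `partial_images` only with streaming).

```python
from typing import Any, Dict, Optional

ALLOWED_QUALITY = {"low", "medium", "high", "auto"}
ALLOWED_FORMATS = {"png", "jpeg", "webp"}
ALLOWED_BACKGROUND = {"auto", "opaque"}  # "transparent" is listed as unsupported


def build_generation_payload(
    prompt: str,
    size: str = "1024x1024",
    quality: str = "medium",
    output_format: str = "png",
    output_compression: Optional[int] = None,
    background: str = "auto",
    stream: bool = False,
    partial_images: int = 0,
) -> Dict[str, Any]:
    """Return a JSON-ready body for POST /v1/images/generations or raise ValueError."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if quality not in ALLOWED_QUALITY:
        raise ValueError(f"unsupported quality: {quality}")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")
    if background not in ALLOWED_BACKGROUND:
        raise ValueError(f"unsupported background: {background}")
    if not 0 <= partial_images <= 3:
        raise ValueError("partial_images must be between 0 and 3")
    if partial_images and not stream:
        raise ValueError("partial_images requires stream=True")

    payload: Dict[str, Any] = {
        "model": "gpt-image-2",
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "output_format": output_format,
        "background": background,
        "n": 1,  # single-image requests keep retry and billing attribution simple
    }
    if output_compression is not None:
        if output_format == "png":
            raise ValueError("output_compression only applies to jpeg and webp")
        if not 0 <= output_compression <= 100:
            raise ValueError("output_compression must be between 0 and 100")
        payload["output_compression"] = output_compression
    if stream:
        payload["stream"] = True
        payload["partial_images"] = partial_images
    return payload
```

The returned dictionary can be passed as the JSON body of the request; rejecting bad combinations locally avoids paying a round trip for a `400 invalid_request_error`.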
Size Constraints
size can be auto or a valid widthxheight value:
- Maximum edge length is `3840px`.
- Width and height must both be multiples of `16px`.
- Long edge to short edge ratio must be at most `3:1`.
- Total pixels must be between `655,360` and `8,294,400`.
Common values:
1024x1024
1536x1024
1024x1536
2048x2048
2048x1152
3840x2160
2160x3840
auto
Treat outputs larger than 2560x1440 as experimental high-pixel workloads with higher latency, higher cost, and higher failure probability.
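The four size constraints can be expressed as a small validator. This is an illustrative sketch; the function name `size_problems` is hypothetical, and the limits are copied from the list in this section, with both pixel bounds treated as inclusive (an assumption, although `3840x2160` equals the upper bound exactly).

```python
from typing import List

MAX_EDGE = 3840          # maximum edge length in pixels
EDGE_MULTIPLE = 16       # width and height must be multiples of this
MAX_RATIO = 3            # long edge : short edge must be at most 3:1
MIN_PIXELS = 655_360
MAX_PIXELS = 8_294_400


def size_problems(size: str) -> List[str]:
    """Return the violated constraints for a size string; empty means it looks valid."""
    if size == "auto":
        return []
    try:
        width_text, height_text = size.lower().split("x")
        width, height = int(width_text), int(height_text)
    except ValueError:
        return ["size must be 'auto' or '<width>x<height>'"]
    if width <= 0 or height <= 0:
        return ["width and height must be positive"]

    problems = []
    if max(width, height) > MAX_EDGE:
        problems.append("edge longer than 3840px")
    if width % EDGE_MULTIPLE or height % EDGE_MULTIPLE:
        problems.append("width and height must be multiples of 16")
    if max(width, height) > MAX_RATIO * min(width, height):
        problems.append("aspect ratio exceeds 3:1")
    if not MIN_PIXELS <= width * height <= MAX_PIXELS:
        problems.append("total pixels outside 655,360 to 8,294,400")
    return problems
```

For example, `size_problems("3840x2160")` returns an empty list, while `size_problems("1000x1000")` reports the multiple-of-16 violation.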
Response Shape
Typical response:
{
"created": 1770000000,
"background": "opaque",
"data": [
{
"b64_json": "...",
"revised_prompt": "..."
}
],
"model": "gpt-image-2",
"output_format": "png",
"quality": "medium",
"size": "1024x1024",
"usage": {
"input_tokens": 43,
"input_tokens_details": {
"image_tokens": 0,
"text_tokens": 43
},
"output_tokens": 196,
"output_tokens_details": {
"image_tokens": 196,
"text_tokens": 0
},
"total_tokens": 239
}
}
Production systems should store:
- `model`
- `size`
- `quality`
- `output_format`
- `usage.total_tokens`
- `usage.input_tokens`
- `usage.output_tokens`
- latency
- upstream account, group, user, and key identifiers
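One way to capture these fields is to flatten each response into a single record before writing it to a log or database. The sketch below assumes the response has already been parsed into a dictionary with the shape shown above; the function name `usage_record` and the `identifiers` argument (account, group, user, and key IDs supplied by the caller) are hypothetical.

```python
from typing import Any, Dict


def usage_record(
    response: Dict[str, Any],
    latency_seconds: float,
    identifiers: Dict[str, str],
) -> Dict[str, Any]:
    """Flatten an Images API response into one audit/billing record."""
    usage = response.get("usage") or {}  # tolerate a missing usage block
    record: Dict[str, Any] = {
        "model": response.get("model"),
        "size": response.get("size"),
        "quality": response.get("quality"),
        "output_format": response.get("output_format"),
        "input_tokens": usage.get("input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
        "latency_seconds": round(latency_seconds, 3),
    }
    # Caller-supplied identifiers, e.g. {"account": "...", "key": "..."}.
    record.update(identifiers)
    return record
```

The base64 image data is deliberately left out of the record so that logs stay small.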
Image Editing
Single-image Edit
curl -sS "$BASE_URL/images/edits" \
-H "Authorization: Bearer $API_KEY" \
-F "model=gpt-image-2" \
-F "image[]=@input.png" \
-F "prompt=Replace the sofa with a minimalist white lounge chair" \
-F "size=1024x1024" \
-F "quality=medium" \
-F "output_format=png" \
> edit.json
Masked Local Edit
curl -sS "$BASE_URL/images/edits" \
-H "Authorization: Bearer $API_KEY" \
-F "model=gpt-image-2" \
-F "image[]=@input.png" \
-F "mask=@mask.png" \
-F "prompt=Change only the transparent masked region into a glass button" \
-F "size=1024x1024" \
-F "quality=medium" \
> edit-mask.json
Mask requirements:
- `image` and `mask` must have the same format and dimensions.
- Files must be under 50MB.
- `mask` must include an alpha channel.
- Do not pass `input_fidelity` for `gpt-image-2`; the model processes image inputs at high fidelity by default.
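The mask requirements can be checked locally before uploading. The sketch below is limited to PNG files and uses only the standard library: it reads the width, height, and color type from the PNG `IHDR` chunk. The function names are hypothetical, 50MB is interpreted as 50 * 1024 * 1024 bytes (an assumption), and palette PNGs that carry transparency in a `tRNS` chunk are not detected.

```python
import struct
from typing import List, Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_FILE_BYTES = 50 * 1024 * 1024   # assumed meaning of "50MB"
ALPHA_COLOR_TYPES = {4, 6}          # grayscale+alpha and RGBA


def read_png_header(data: bytes) -> Tuple[int, int, int]:
    """Return (width, height, color_type) from the IHDR chunk of a PNG."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    color_type = data[25]  # byte 24 is the bit depth, byte 25 the color type
    return width, height, color_type


def mask_problems(image: bytes, mask: bytes) -> List[str]:
    """Return the violated mask requirements; empty means the pair looks usable."""
    problems = []
    if len(image) >= MAX_FILE_BYTES or len(mask) >= MAX_FILE_BYTES:
        problems.append("files must be under 50MB")
    try:
        image_w, image_h, _ = read_png_header(image)
        mask_w, mask_h, mask_color = read_png_header(mask)
    except ValueError:
        return problems + ["image and mask must both be PNG for this check"]
    if (image_w, image_h) != (mask_w, mask_h):
        problems.append("image and mask dimensions differ")
    if mask_color not in ALPHA_COLOR_TYPES:
        problems.append("mask has no alpha channel")
    return problems
```

A typical call reads both files in binary mode and passes their contents, for example `mask_problems(open("input.png", "rb").read(), open("mask.png", "rb").read())`.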
Responses API With image_generation
Use this when an agent should reason about the task before generating an image. The main model should be a text/agent model, such as gpt-5.5.
curl -sS "$BASE_URL/responses" \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.5",
"input": "Generate a clean product poster for an AI proxy service.",
"tools": [
{
"type": "image_generation",
"quality": "medium",
"size": "1536x1024",
"output_format": "png"
}
]
}' > response-image.json
Important:
- `model` is the main reasoning model, not `gpt-image-2`.
- The `image_generation` tool performs the image work.
- sub2api may inject the image tool for official Codex clients, but application calls should pass it explicitly.
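Unlike the Images API, a Responses API result does not carry the image at `data[0].b64_json`. The sketch below assumes the upstream Responses format, in which generated images appear as `output` items of type `image_generation_call` with the base64 payload in `result`; this guide does not document that shape, so verify it against a real `response-image.json` from your gateway. The function name is hypothetical.

```python
import base64
from typing import Any, Dict, List


def images_from_response(response: Dict[str, Any]) -> List[bytes]:
    """Collect decoded image bytes from a parsed Responses API result."""
    images = []
    for item in response.get("output", []):
        # Assumed shape: {"type": "image_generation_call", "result": "<base64>"}
        if item.get("type") == "image_generation_call" and item.get("result"):
            images.append(base64.b64decode(item["result"]))
    return images
```

Text output items from the main model are skipped, so the function returns an empty list when the agent answered without calling the tool.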
Streaming Images
The Images API supports SSE streaming:
curl -N "$BASE_URL/images/generations" \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-image-2",
"prompt": "A futuristic city skyline at sunrise",
"stream": true,
"partial_images": 2,
"size": "1536x1024",
"quality": "medium"
}'
Events:
image_generation.partial_image
image_generation.completed
partial_images can be 0-3. Each partial image adds output token cost.
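A client has to split the SSE stream into events and decode each image as it arrives. The following sketch works on an iterable of text lines, such as the lines of a streamed HTTP response body. It assumes that each `data:` line holds one complete JSON object with a `type` field matching the event names above and the image in a `b64_json` field; those field names are not specified in this guide and should be confirmed against real output.

```python
import base64
import json
from typing import Iterable, Iterator, Tuple

IMAGE_EVENTS = ("image_generation.partial_image", "image_generation.completed")


def iter_image_events(lines: Iterable[str]) -> Iterator[Tuple[str, bytes]]:
    """Yield (event_type, image_bytes) for each image event in an SSE line stream."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines, comments, and blank separators
        body = line[len("data:"):].strip()
        if not body or body == "[DONE]":
            continue
        event = json.loads(body)
        event_type = event.get("type", "")
        # Assumed payload fields: "type" and "b64_json".
        if event_type in IMAGE_EVENTS and event.get("b64_json"):
            yield event_type, base64.b64decode(event["b64_json"])
```

Partial images can be shown as previews and then replaced once the `image_generation.completed` event arrives with the final image.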
SDK Examples
Node.js
import fs from "node:fs";
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.API_KEY,
baseURL: process.env.BASE_URL ?? "https://claude.omniclaw.store/v1",
});
const result = await client.images.generate({
model: "gpt-image-2",
prompt: "A premium product poster for an AI service",
size: "1536x1024",
quality: "medium",
output_format: "png",
n: 1,
});
const b64 = result.data?.[0]?.b64_json;
if (!b64) throw new Error("No image returned");
fs.writeFileSync("image.png", Buffer.from(b64, "base64"));
Python
import base64
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ["API_KEY"],
base_url=os.environ.get("BASE_URL", "https://claude.omniclaw.store/v1"),
)
result = client.images.generate(
model="gpt-image-2",
prompt="A premium product poster for an AI service",
size="1536x1024",
quality="medium",
output_format="png",
n=1,
)
b64 = result.data[0].b64_json if result.data else None
if not b64:
    raise RuntimeError("No image returned")
with open("image.png", "wb") as f:
f.write(base64.b64decode(b64))
Production Dispatch
- Routing: prefer plus/team/pro OpenAI OAuth accounts for image workloads.
- Timeout: use 120 seconds for normal images and 300 seconds for 4K.
- Retry: only retry transient network failures and 502/503/504 with low retry counts.
- Concurrency: 4K output produces many image tokens; use low per-account concurrency. Standard 1024 images can use higher concurrency.
- Billing: record `usage` and charge based on input and output tokens. 4K can produce far more output tokens than 1024 images.
- Latency: use `jpeg` and `quality: low` for drafts or latency-sensitive previews.
- Fallback: if `4K/high` fails, retry `4K/medium`; if that still fails, generate `1536x1024/medium` and upscale separately.
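The timeout, retry, and fallback rules can be kept together as a few pure functions that a dispatcher consults before each attempt. This is a sketch under stated assumptions: the function names are hypothetical, the default of two attempts is one reading of "low retry counts", and using `1024x1536` as the fallback for portrait 4K is an addition not stated in the list. Transient network failures have no HTTP status and would need to be handled separately by the caller.

```python
from typing import List, Tuple

RETRYABLE_STATUSES = {502, 503, 504}
FOUR_K_SIZES = {"3840x2160", "2160x3840"}


def timeout_seconds(size: str) -> int:
    """120 seconds for normal images, 300 seconds for 4K."""
    return 300 if size in FOUR_K_SIZES else 120


def should_retry(status: int, attempt: int, max_attempts: int = 2) -> bool:
    """Retry only 502/503/504; attempt is the number of tries already made."""
    return status in RETRYABLE_STATUSES and attempt < max_attempts


def fallback_chain(size: str, quality: str) -> List[Tuple[str, str]]:
    """Return the (size, quality) combinations to try, in order."""
    chain = [(size, quality)]
    if size in FOUR_K_SIZES:
        lower_quality = "medium" if quality == "high" else quality
        if lower_quality != quality:
            chain.append((size, lower_quality))  # 4K/high -> 4K/medium
        # Assumption: keep the orientation when dropping to a smaller size.
        smaller = "1024x1536" if size == "2160x3840" else "1536x1024"
        chain.append((smaller, lower_quality))   # then upscale separately
    return chain
```

For a `3840x2160` request at `high` quality, `fallback_chain` yields the 4K/high, 4K/medium, and 1536x1024/medium attempts in that order; non-4K requests get a single attempt.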
Common Errors
| Symptom | Likely cause | Action |
|---|---|---|
| `401 INVALID_API_KEY` | Key is not a sub2api key or is disabled/deleted | Generate a new key from `/keys` |
| `400 invalid_request_error` | Incompatible params such as transparent background or invalid size | Check `size`, `background`, and `quality` |
| `429 usage_limit_reached` | Upstream account usage window hit | Switch plus/team/pro account or wait for reset |
| `502 Upstream request failed` | Upstream did not return image data, network failed, or content was refused | Inspect server logs, simplify prompt, lower quality or size |
| Request takes over 2 minutes | High pixels or complex prompt | Increase timeout, use streaming, or test lower resolution first |
| `/v1/models` does not show `gpt-image-2` | Codex/text model list is not the Images API capability list | Call `/v1/images/generations` directly |
Safety Boundary
Filter clearly disallowed content before sending requests, especially:
- Sexualized minors or young-looking subjects
- Non-consensual sexual content, coercion, or sexual violence
- Explicit nudity or graphic sexual activity
- Illegal, hateful, or extreme violent content
For safe romantic scenes, explicitly constrain prompts with terms such as adult, non-explicit, no nudity, and fully clothed.