# AIport API — Developer Documentation

> Generated from the live API surface. Base URL: `https://aiport.id`. Exchange rate snapshot: Rp 18,500 / USD.

## Table of contents

- [Introduction](#introduction)
- [Authentication](#authentication)
- [Quickstart](#quickstart)
- [Text completions](#text)
- [Tool calling](#tool-calling)
- [Image generation](#image)
- [Image-to-image](#img2img)
- [Video generation](#video)
- [Polling jobs](#jobs)
- [Errors](#errors)
- [Available models](#models)

## Introduction

AIport is a unified AI API gateway for Indonesian developers — one API key, one IDR balance, every modality.

**Why AIport.**

- **Top up in IDR.** No FX markup, no monthly fee, no expiry. Each call deducts exactly what it costs from your balance.
- **One key, every modality.** Text, image, image-to-image, and video generation behind a single `Authorization: Bearer` header.
- **Provider routing & auto-refund.** When a provider errors mid-job, credits are refunded automatically.
- **OpenAI-compatible text endpoint.** `POST /v1/chat/completions` is a drop-in for the OpenAI SDK, including `stream: true`.

**Base URL**

```
https://aiport.id
```

All endpoints are JSON unless noted. Async modalities (image, img2img, video) return a `job_id` you poll via `GET /v1/jobs/{job_id}`.

## Authentication

Every `/v1/*` call requires a Bearer token. Create one in your dashboard at `/dashboard/api-keys`.

Send your API key in the `Authorization` header:

```http
Authorization: Bearer aip_live_...
```

Keys are scoped to your account and bill against your IDR balance. Rotate compromised keys immediately by deleting and re-creating from the dashboard. **Never embed an API key in client-side code** — proxy through your own backend.
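
A minimal Python sketch of attaching the key on the server side, assuming it is stored in an `AIPORT_API_KEY` environment variable (the name the curl examples in this document use). The helper `build_request` is hypothetical and only constructs the request; nothing is sent until you pass it to `urllib.request.urlopen`.

```python
import json
import os
import urllib.request

BASE_URL = "https://aiport.id"


def build_request(path, payload=None, api_key=None):
    """Return a urllib Request carrying the Bearer header (nothing is sent)."""
    # Assumption: the key lives in the AIPORT_API_KEY environment variable.
    key = api_key or os.environ.get("AIPORT_API_KEY", "")
    if not key:
        raise ValueError("no API key: set AIPORT_API_KEY or pass api_key")
    headers = {"Authorization": f"Bearer {key}"}
    data = None
    if payload is not None:
        # A request with a body is sent as POST; without one urllib uses GET.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)
```

Keeping this helper in backend code only is what keeps the key out of browsers and mobile apps.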

## Quickstart

Every endpoint is plain HTTP + JSON. The only thing that varies is whether the response is synchronous or asynchronous.

**Sync vs async — read this first.**

Each call uses the same ingredients: `POST` (or `GET` for jobs), an `Authorization: Bearer` header, and a JSON body (the upload endpoint takes `multipart/form-data` instead). Only the way the result comes back differs:

- **Synchronous** — `/v1/chat/completions` (alias `/v1/text/completions`). The response body *is* the result, unless you opt into `stream: true`. Typical latency: 1–10 s.
- **Asynchronous** — `/v1/image/generate`, `/v1/image/img2img`, all `/v1/video/*`. The POST returns immediately with `{ "job_id": "...", "status": "pending" }`. You then poll `GET /v1/jobs/{job_id}` every few seconds until `status` becomes `done` (read `result_url`) or `failed`. Typical latency: 30–120 s for video; 10–60 s for a 1k image, rising to as much as 180 s at 4k.

Async exists because images and videos can take a minute or more — too long to hold open a single HTTP request reliably.

**1. Synchronous: text completion**

The text endpoint is OpenAI-compatible — point the OpenAI SDK at `https://aiport.id/v1`, or call it directly:

```bash
curl -X POST https://aiport.id/v1/chat/completions \
  -H "Authorization: Bearer $AIPORT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"deepseek-v4-flash","messages":[{"role":"user","content":"Hello"}]}'
```

Response (200) — final answer is already inside `choices[0].message.content`:

```json
{ "choices": [ { "message": { "role": "assistant", "content": "Hi!" } } ], "usage": { ... } }
```

Add `"stream": true` to receive the answer incrementally as Server-Sent Events (OpenAI `chat.completion.chunk` objects, ending with `data: [DONE]`). See the Text completions section for the chunk shape.

**2. Asynchronous: image generation — step by step**

Step 2a — `POST` the request. Same shape as text, just a different URL:

```bash
curl -X POST https://aiport.id/v1/image/generate \
  -H "Authorization: Bearer $AIPORT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-image-2","prompt":"Sunset over Mount Bromo","resolution":"2k"}'
```

Response (202) — note: no image yet, only a job id to poll:

```json
{ "job_id": "job_01HX...", "status": "pending" }
```

Step 2b — `GET` the job until it's done:

```bash
curl https://aiport.id/v1/jobs/job_01HX... \
  -H "Authorization: Bearer $AIPORT_API_KEY"
```

Response (200) — when finished, `status` is `done` and `result_url` points at the image:

```json
{ "job_id": "job_01HX...", "status": "done", "result_url": "https://uploads.aiport.id/results/..." }
```

**3. Same thing, scripted with a poll loop (optional)**

The block below is just the two-step flow wrapped in a shell loop — useful for trying things in a terminal:

```bash
# Submit
JOB=$(curl -s -X POST https://aiport.id/v1/image/generate \
  -H "Authorization: Bearer $AIPORT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-image-2","prompt":"Sunset over Mount Bromo","resolution":"2k"}' | jq -r .job_id)

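# Poll every 3s; the loop is skipped when the submit returned no job_id
# (jq -r prints "null" for a missing field)
while [ -n "$JOB" ] && [ "$JOB" != "null" ]; do
  STATUS=$(curl -s "https://aiport.id/v1/jobs/$JOB" \
    -H "Authorization: Bearer $AIPORT_API_KEY")
  echo "$STATUS"
  case "$(echo "$STATUS" | jq -r .status)" in
    done|failed) break ;;
  esac
  sleep 3
done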
```

In production, replace the shell loop with a job-polling routine in your language of choice (or a webhook on your side if you don't want to poll).
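
One possible shape for such a polling routine in Python, as a sketch rather than an official client. `wait_for_job` and its `fetch_job` hook are hypothetical names: `fetch_job` stands for any callable that performs `GET /v1/jobs/{job_id}` and returns the parsed JSON. The 600-second default timeout is an assumption chosen to match the 10-minute auto-fail window described for image jobs.

```python
import time


def wait_for_job(fetch_job, job_id, interval=3.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job is 'done' or 'failed' and return the final job dict.

    fetch_job: callable(job_id) -> dict, the parsed body of GET /v1/jobs/{job_id}.
    sleep is injectable so the loop can be exercised without real waiting.
    """
    waited = 0.0
    while True:
        job = fetch_job(job_id)
        if job.get("status") in ("done", "failed"):
            return job
        if waited >= timeout:
            raise TimeoutError(
                f"job {job_id} still {job.get('status')!r} after {waited:.0f}s"
            )
        sleep(interval)
        waited += interval
```

Because the transport is passed in, the same loop works with `urllib`, `requests`, or an async wrapper, and a caller can check `job["status"]` on return to decide between reading `result_url` and reporting the failure.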

## Text completions

Chat-style generation with an OpenAI-compatible request/response shape. `POST /v1/chat/completions` (alias: `/v1/text/completions`) is synchronous by default; set `stream: true` for Server-Sent Events. `GET /v1/models` lists the available text models for OpenAI-compatible clients.

### POST `/v1/chat/completions`

Chat-style text generation with OpenAI-compatible request and response shapes, so it works as a drop-in for the OpenAI SDK (set `base_url` to `https://aiport.id/v1`). The legacy path `/v1/text/completions` is a backward-compatible alias.

**Auth:** `Authorization: Bearer <api_key>`

**Request** — `Content-Type: application/json`

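- `messages` *(ChatMessage[])* (**required**) — Conversation history. Each message: `{ role: 'system' | 'user' | 'assistant', content: string }`; tool-calling turns additionally use `role: 'tool'` messages and assistant messages that carry `tool_calls` (see the Tool calling section). Total content length is capped at 20000 characters.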
- `model` *(string)* (default `deepseek-v4-flash`) — Text model id. See the Models section for the full list of available text models.
- `max_tokens` *(integer)* — Upper bound on generated tokens. Capped by the model's own max_output_tokens.
- `temperature` *(number)* — Sampling temperature, range 0–2.
- `reasoning_effort` *(string)* — Only honored by models that advertise reasoning support.
  - Allowed: `low`, `medium`, `high`
- `stream` *(boolean)* (default `false`) — When true, the response is streamed as Server-Sent Events of OpenAI `chat.completion.chunk` objects, terminated by a `data: [DONE]` line. See the Streaming example below.
- `tools` *(Tool[])* — Functions the model may call, in the OpenAI shape: `{ type: 'function', function: { name, description, parameters } }`, where `parameters` is a JSON Schema object. When the model decides to call one, the response message has `tool_calls` and `finish_reason: 'tool_calls'`. See the Tool calling section.
- `tool_choice` *(string | object)* — Controls whether the model calls a tool. `'auto'` (default when tools are present) lets the model decide, `'none'` forces a text reply, `'required'` forces some tool call, and `{ type: 'function', function: { name } }` forces a specific one.
  - Allowed: `auto`, `none`, `required`
- `parallel_tool_calls` *(boolean)* (default `true`) — Whether the model may emit multiple tool calls in a single turn. Forwarded to upstreams that support it; ignored otherwise.
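
The documented limits can be checked before a request is sent, which saves a round-trip that would end in `400`. The sketch below is a hypothetical client-side helper (`check_chat_request` is not part of the API). It assumes the 20000-character cap is the sum of all string `content` values, which this document does not spell out, and it cannot tell whether a model supports `reasoning_effort`.

```python
ALLOWED_EFFORT = {"low", "medium", "high"}
MAX_CONTENT_CHARS = 20000  # assumption: counted as the sum over all messages


def check_chat_request(body):
    """Return a list of problems found in a chat request body (empty if none)."""
    problems = []
    messages = body.get("messages") or []
    if not messages:
        problems.append("messages is required")
    # Count only string contents; tool-call turns may carry content: null.
    total = sum(
        len(m["content"]) for m in messages if isinstance(m.get("content"), str)
    )
    if total > MAX_CONTENT_CHARS:
        problems.append(f"content is {total} chars, cap is {MAX_CONTENT_CHARS}")
    temperature = body.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must be within 0-2")
    effort = body.get("reasoning_effort")
    if effort is not None and effort not in ALLOWED_EFFORT:
        problems.append("reasoning_effort must be low, medium, or high")
    return problems
```

The server remains the authority on validity; a check like this only catches the obvious cases early.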

**Example**

```bash
curl -X POST https://aiport.id/v1/chat/completions \
  -H "Authorization: Bearer $AIPORT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "user", "content": "Write a haiku about Jakarta traffic."}
    ]
  }'
```

**Response** — `200`

```json
{
  "id": "chatcmpl_...",
  "object": "chat.completion",
  "created": 1716480000,
  "model": "deepseek-v4-flash",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 38,
    "total_tokens": 50,
    "prompt_tokens_details": {"cached_tokens": 0},
    "completion_tokens_details": {"reasoning_tokens": 0}
  }
}
```

> Synchronous unless `stream` is true. `prompt_tokens_details.cached_tokens` reflects prompt-cache hits (billed at a reduced input rate when the upstream reports them, currently DeepSeek); `completion_tokens_details.reasoning_tokens` counts reasoning output. Standard OpenAI params (`top_p`, `stop`, `presence_penalty`, `frequency_penalty`, `seed`, …) are forwarded verbatim to OpenAI-compatible upstreams; unknown params are dropped.

**Streaming** — `stream: true`

With `stream: true` the response is `text/event-stream`, and each event is a `chat.completion.chunk`. The first chunk carries `delta.role`, the following chunks carry `delta.content`, then comes a chunk with `finish_reason: "stop"`, a final usage-only chunk (`choices: []`), and the `data: [DONE]` terminator. If the upstream fails mid-stream, the connection ends with an in-band error event. There is no provider switch once the first byte is sent, and a stream that already emitted output is not refunded.

```
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1716480000,"model":"deepseek-v4-flash","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1716480000,"model":"deepseek-v4-flash","choices":[{"index":0,"delta":{"content":"Hi"},"finish_reason":null}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1716480000,"model":"deepseek-v4-flash","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1716480000,"model":"deepseek-v4-flash","choices":[],"usage":{"prompt_tokens":12,"completion_tokens":38,"total_tokens":50,"prompt_tokens_details":{"cached_tokens":0},"completion_tokens_details":{"reasoning_tokens":0}}}

data: [DONE]
```
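
To show how the chunk sequence above folds back into a single answer, here is a small Python sketch. `collect_stream` is a hypothetical helper that takes an iterable of already-decoded text lines (for example from a streaming HTTP response) and returns the concatenated text plus the usage object. It does not handle the in-band error event, whose shape this document does not specify.

```python
import json


def collect_stream(lines):
    """Fold SSE lines from a stream: true response into (text, usage)."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip the blank separator lines between events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # terminator: no more chunks follow
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]  # the usage-only chunk has choices: []
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                text_parts.append(piece)
    return "".join(text_parts), usage
```

In an interactive UI you would typically emit each `piece` as it arrives instead of joining at the end; the parsing logic stays the same.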

**Errors**

- `400` invalid_request_error — messages missing, content too long, invalid temperature, or reasoning_effort on a non-reasoning model.
- `401` unauthorized — Missing or invalid API key.
- `404` MODEL_NOT_FOUND — Requested model id does not exist or is disabled.
- `429` insufficient_quota — IDR balance cannot cover the request — top up to continue.
- `502` provider_error — Upstream AI provider failed. Non-streamed calls are refunded; a stream that already emitted output is not.

### GET `/v1/models`

List the text/chat models available to your key, in the OpenAI models shape. OpenAI-compatible clients call this to verify the base URL and populate model pickers. GET /v1/models/{model} retrieves a single model.

**Auth:** `Authorization: Bearer <api_key>`

**Request** — no body (`GET`)

**Example**

```bash
curl https://aiport.id/v1/models \
  -H "Authorization: Bearer $AIPORT_API_KEY"
```

**Response** — `200`

```json
{
  "object": "list",
  "data": [
    {
      "id": "deepseek-v4-flash",
      "object": "model",
      "created": 1716480000,
      "owned_by": "aiport"
    }
  ]
}
```

> Only enabled text/chat models are returned: the same ids you can pass as `model` to `/v1/chat/completions`. Image and video models are not listed here (those modalities are async and not OpenAI-compatible). `created` is a Unix timestamp in seconds. `GET /v1/models/{model}` returns a single model object, or `404 model_not_found` if the id is unknown, disabled, or not a text model.

**Errors**

- `401` unauthorized — Missing or invalid API key.
- `404` model_not_found — GET /v1/models/{model}: id does not exist, is disabled, or is not a text model.

## Tool calling

Let a model call your functions. Works on `/v1/chat/completions` with the standard OpenAI `tools` / `tool_choice` fields: pass tool definitions, run the call yourself, feed the result back. Supported across OpenAI-compatible, DeepSeek, and Gemini models.

Tool calling is a round-trip you drive. You describe the functions the model is allowed to call; the model replies with the function name and JSON arguments; **you** execute it and send the result back; the model uses that result to answer.

**1. Send a request with `tools`.**

```bash
curl -X POST https://aiport.id/v1/chat/completions \
  -H "Authorization: Bearer $AIPORT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "user", "content": "What is the weather in Jakarta?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather for a city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"]
          }
        }
      }
    ]
  }'
```

**2. The model asks to call the tool.** Instead of `content`, the assistant message carries `tool_calls` and `finish_reason` is `tool_calls`. `arguments` is a JSON **string** you must parse:

```json
{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\":\"Jakarta\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
```

**3. Run the function and reply with a `tool` message.** Append the assistant's `tool_calls` message verbatim, then one `role: "tool"` message per call, echoing its `tool_call_id`. **Keep sending `tools` on this follow-up request too.** Resubmit the same definitions on every turn of the conversation, as you would when calling the OpenAI SDK; otherwise some models reject the replayed tool history:

```bash
curl -X POST https://aiport.id/v1/chat/completions \
  -H "Authorization: Bearer $AIPORT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "user", "content": "What is the weather in Jakarta?"},
      {"role": "assistant", "content": null, "tool_calls": [
        {"id": "call_abc123", "type": "function",
         "function": {"name": "get_weather", "arguments": "{\"city\":\"Jakarta\"}"}}
      ]},
      {"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temp_c\":31,\"sky\":\"partly cloudy\"}"}
    ],
    "tools": [
      {"type": "function", "function": {"name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {"type": "object",
          "properties": {"city": {"type": "string", "description": "City name"}},
          "required": ["city"]}}}
    ]
  }'
```

The model now replies with normal `content` (`finish_reason: "stop"`):

```json
{ "choices": [ { "message": { "role": "assistant", "content": "It's 31°C and partly cloudy in Jakarta." }, "finish_reason": "stop" } ] }
```
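
Steps 2 and 3 can be sketched in Python as a small dispatcher. `run_tool_calls` and the `registry` mapping (function name to a local Python callable) are hypothetical names for illustration; the message shapes are the ones shown above. The sketch assumes every requested name exists in the registry and that each function returns something JSON-serializable.

```python
import json


def run_tool_calls(assistant_message, registry):
    """Execute each requested call and return the messages to append."""
    # The assistant message goes back verbatim, followed by one tool
    # message per call that echoes the call's id as tool_call_id.
    follow_up = [assistant_message]
    for call in assistant_message.get("tool_calls") or []:
        fn = registry[call["function"]["name"]]
        # arguments arrives as a JSON string and must be parsed first.
        args = json.loads(call["function"]["arguments"])
        result = fn(**args)
        follow_up.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return follow_up
```

Extending the running `messages` list with the returned items, and sending it again together with the same `tools` array, produces the follow-up request from step 3.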

**Forcing or disabling calls.** Use `tool_choice`: `"auto"` (default when `tools` is set) lets the model decide, `"required"` forces some tool call, `"none"` forces a plain-text reply, and `{ "type": "function", "function": { "name": "get_weather" } }` forces one specific function. Set `parallel_tool_calls: false` to limit the model to one call per turn.

**Streaming.** With `stream: true`, tool calls arrive incrementally on `delta.tool_calls` (each fragment carries an `index`; `function.arguments` is concatenated across chunks). Accumulate by `index` until the terminating chunk with `finish_reason: "tool_calls"`.
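
A sketch of that accumulation in Python, under the assumption that fragments follow the OpenAI streaming convention: `id` and `function.name` appear on the first fragment for an `index`, and later fragments only extend `function.arguments`. `merge_tool_call_deltas` is a hypothetical helper that takes the `delta.tool_calls` entries in arrival order.

```python
def merge_tool_call_deltas(fragments):
    """Merge delta.tool_calls fragments into complete calls, ordered by index."""
    calls = {}
    for fragment in fragments:
        slot = calls.setdefault(fragment["index"], {
            "id": None,
            "type": "function",
            "function": {"name": "", "arguments": ""},
        })
        if fragment.get("id"):
            slot["id"] = fragment["id"]  # keep the id exactly as received
        fn = fragment.get("function") or {}
        if fn.get("name"):
            slot["function"]["name"] = fn["name"]
        # arguments is a JSON string split across chunks; concatenate as-is.
        slot["function"]["arguments"] += fn.get("arguments") or ""
    return [calls[i] for i in sorted(calls)]
```

The merged `arguments` string should only be parsed as JSON after the chunk with `finish_reason: "tool_calls"` has arrived, since earlier prefixes are usually not valid JSON.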

**Round-trip integrity.** Send each assistant `tool_calls` message back exactly as you received it — including the `id` on every call — and pair it with a matching `tool_call_id` in your `tool` reply. Some models (e.g. Gemini reasoning models) attach opaque state to the call `id` that must survive the round-trip, so don't rewrite or drop it.

## Image generation

Text-to-image. Async — returns a job_id; poll /v1/jobs/{job_id} for the result. Pricing varies by resolution tier on supporting models.

### POST `/v1/image/generate`

Text-to-image. Returns an async job_id — poll GET /v1/jobs/{job_id} until status is done.

**Auth:** `Authorization: Bearer <api_key>`

**Request** — `Content-Type: application/json`

- `prompt` *(string)* (**required**) — Description of the image to generate.
- `model` *(string)* (default `gpt-image-2`) — Image model id. Capability differs per model — some support resolution tiers, some accept aspect_ratio, some accept neither.
- `aspect_ratio` *(string)* — Output aspect ratio. Ignored by models that don't support it.
  - Allowed: `3:2`, `1:1`, `2:3`, `5:4`, `4:5`, `16:9`, `9:16`, `21:9`, `3:4`, `4:3`, `9:21`
- `resolution` *(string)* — Output resolution tier. Determines pricing for models that support it; ignored otherwise.
  - Allowed: `1k`, `2k`, `4k`

**Example**

```bash
curl -X POST https://aiport.id/v1/image/generate \
  -H "Authorization: Bearer $AIPORT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "A traditional Balinese temple at golden hour, cinematic",
    "aspect_ratio": "16:9",
    "resolution": "2k"
  }'
```

**Response** — `202`

```json
{
  "job_id": "job_01HX...",
  "status": "pending"
}
```

> Poll `GET /v1/jobs/{job_id}` every 2–5 seconds until `status` is `done` (then read `result_url`) or `failed`. Image generation typically takes 10–60 seconds at 1k, 30–120 seconds at 2k, and up to 180 seconds at 4k. Jobs that exceed 10 minutes are auto-failed and credits are refunded.

**Errors**

- `400` invalid_request — Missing prompt, invalid aspect_ratio, or invalid resolution.
- `401` unauthorized — Missing or invalid API key.
- `402` insufficient_balance — IDR balance cannot cover the request.
- `404` MODEL_NOT_FOUND — Requested model id does not exist or is disabled.

## Image-to-image

Edit or transform an existing image. Upload reference images first (or pass any HTTPS URL), then call /v1/image/img2img.

### POST `/v1/image/upload`

Upload a reference image to be used by `/v1/image/img2img`. Returns an `upload_id` and a hosted `url`; pass the `url` in the `images` array of the img2img call.

**Auth:** `Authorization: Bearer <api_key>`

**Request** — `Content-Type: multipart/form-data`

- `file` *(file)* (**required**) — Image file. Max 30 MB. JPEG, PNG, or WEBP.

**Example**

```bash
curl -X POST https://aiport.id/v1/image/upload \
  -H "Authorization: Bearer $AIPORT_API_KEY" \
  -F "file=@./reference.jpg"
```

**Response** — `200`

```json
{
  "upload_id": "upl_01HX...",
  "url": "https://uploads.aiport.id/uploads/..."
}
```
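
For clients without a multipart-capable HTTP library, the request body that `curl -F` produces can be built by hand. The sketch below uses only the Python standard library; `encode_multipart_file` is a hypothetical helper, the 30 MB limit is read as 30 × 1024 × 1024 bytes (an assumption), and the filename is assumed to contain no double quotes.

```python
import uuid

MAX_UPLOAD_BYTES = 30 * 1024 * 1024  # assumption: "30 MB" interpreted as MiB
ALLOWED_MIME = {"image/jpeg", "image/png", "image/webp"}


def encode_multipart_file(filename, content, mime_type):
    """Return (body_bytes, content_type_header) for a single 'file' field."""
    if mime_type not in ALLOWED_MIME:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    if len(content) > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 30 MB upload limit")
    boundary = uuid.uuid4().hex  # random delimiter between form parts
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {mime_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"
```

The returned header value goes into `Content-Type` and the bytes into the POST body, alongside the usual `Authorization` header.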

**Errors**

- `400` — Missing file or unsupported MIME type.
- `401` unauthorized — Missing or invalid API key.
- `413` — File exceeds the per-modality size limit (30 MB for images).

### POST `/v1/image/img2img`

Image-to-image editing. Provide reference image URL(s) (from /v1/image/upload or any HTTPS URL) plus a prompt. Returns an async job_id.

**Auth:** `Authorization: Bearer <api_key>`

**Request** — `Content-Type: application/json`

- `prompt` *(string)* (**required**) — Edit/transform instruction.
- `images` *(string[])* (**required**) — One or more HTTPS image URLs. Use /v1/image/upload to get hosted URLs for local files.
- `model` *(string)* — Image-to-image model id.
- `aspect_ratio` *(string)* — Output aspect ratio. Ignored by models that don't support it.
  - Allowed: `3:2`, `1:1`, `2:3`, `5:4`, `4:5`, `16:9`, `9:16`, `21:9`, `3:4`, `4:3`, `9:21`
- `resolution` *(string)* — Output resolution tier.
  - Allowed: `1k`, `2k`, `4k`

**Example**

```bash
curl -X POST https://aiport.id/v1/image/img2img \
  -H "Authorization: Bearer $AIPORT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Make it look like a Studio Ghibli scene",
    "images": ["https://uploads.aiport.id/uploads/.../ref.jpg"],
    "resolution": "2k"
  }'
```

**Response** — `202`

```json
{ "job_id": "job_01HX...", "status": "pending" }
```

> Poll GET /v1/jobs/{job_id} every 2–5 seconds. Image-to-image typically takes 10–60 seconds at 1k and 30–180 seconds at higher resolutions. Jobs that exceed 10 minutes are auto-failed and credits are refunded.
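
The hand-off from upload to edit can be sketched as follows: the `url` field of each `/v1/image/upload` response becomes one entry of `images`. `img2img_body` is a hypothetical helper that only builds the JSON body; the HTTPS and resolution checks mirror the parameter list above and do not replace server-side validation.

```python
ALLOWED_RESOLUTIONS = {"1k", "2k", "4k"}


def img2img_body(prompt, uploads, resolution=None, model=None):
    """Build the JSON body for POST /v1/image/img2img from upload responses."""
    images = [upload["url"] for upload in uploads]
    if not images:
        raise ValueError("images needs at least one URL")
    for url in images:
        if not url.startswith("https://"):
            raise ValueError(f"not an HTTPS URL: {url}")
    if resolution is not None and resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
    body = {"prompt": prompt, "images": images}
    # Optional fields are omitted entirely so the server applies its defaults.
    if resolution is not None:
        body["resolution"] = resolution
    if model is not None:
        body["model"] = model
    return body
```

Any other HTTPS image URL can be passed the same way by wrapping it as `{"url": "https://..."}`.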

**Errors**

- `400` — Missing prompt, invalid image URL, or unsupported model parameter.
- `401` unauthorized — Missing or invalid API key.
- `402` insufficient_balance — IDR balance cannot cover the request.

## Video generation

Cinematic video generation, billed per generated second by output resolution. Three flavors: text-to-video, image-to-video, and multimodal (multiple references).

### POST `/v1/video/generate`

Text-to-video. Async — returns job_id, then poll /v1/jobs/{job_id} until done. Billing is per generated second by output resolution.

**Auth:** `Authorization: Bearer <api_key>`

**Request** — `Content-Type: application/json`

- `prompt` *(string)* (**required**) — Description of the video to generate.
- `model` *(string)* (default `seedance-2.0-fast`) — Video model id.
- `resolution` *(string)* — Output resolution. Determines per-second price.
  - Allowed: `480p`, `720p`, `1080p`, `2k`, `4k`
- `duration` *(integer)* — Length in seconds (4–15).
  - Allowed: `4`, `5`, `6`, `7`, `8`, `9`, `10`, `11`, `12`, `13`, `14`, `15`
- `ratio` *(string)* — Aspect ratio. 'adaptive' lets the model pick.
  - Allowed: `adaptive`, `16:9`, `4:3`, `1:1`, `3:4`, `9:16`, `21:9`
- `generate_audio` *(boolean)* — Synthesize audio alongside video. Model-dependent.
- `web_search` *(boolean)* — Allow the model to ground on web search results. Model-dependent.
- `return_last_frame` *(boolean)* — Also return the final frame as an image URL.
- `seed` *(integer)* — Deterministic seed. -1 means random.

**Example**

```bash
curl -X POST https://aiport.id/v1/video/generate \
  -H "Authorization: Bearer $AIPORT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance-2.0-fast",
    "prompt": "A drone shot flying over rice terraces at sunrise",
    "resolution": "1080p",
    "duration": 6,
    "ratio": "16:9"
  }'
```

**Response** — `202`

```json
{ "job_id": "job_01HX...", "status": "pending" }
```

> Video jobs typically take 30–120 seconds. Poll every 3–5 seconds.

**Errors**

- `400` — Invalid resolution, duration, ratio, or missing prompt.
- `401` unauthorized — Missing or invalid API key.
- `402` insufficient_balance — IDR balance cannot cover the request.

### POST `/v1/video/img2video`

Image-to-video. Provide a first frame (and optional last frame) URL; the model interpolates motion. Async job.

**Auth:** `Authorization: Bearer <api_key>`

**Request** — `Content-Type: application/json`

- `prompt` *(string)* (**required**) — Motion / scene description.
- `first_frame_url` *(string)* (**required**) — HTTPS URL of the starting frame.
- `last_frame_url` *(string)* — Optional HTTPS URL of the ending frame.
- `model` *(string)* (default `seedance-2.0-fast-img2video`) — Image-to-video model id.
- `resolution` *(string)* — Output resolution.
  - Allowed: `480p`, `720p`, `1080p`, `2k`, `4k`
- `duration` *(integer)* — Length in seconds (4–15).
  - Allowed: `4`, `5`, `6`, `7`, `8`, `9`, `10`, `11`, `12`, `13`, `14`, `15`
- `real_person_mode` *(boolean)* — Hint to the model that the reference depicts a real person — improves identity preservation.
- `conversion_slots` *(string[])* — Which slots may be transcoded/normalized by the platform. Default: ['all'].
  - Allowed: `all`, `firstFrameUrl`, `lastFrameUrl`

**Example**

```bash
curl -X POST https://aiport.id/v1/video/img2video \
  -H "Authorization: Bearer $AIPORT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance-2.0-fast-img2video",
    "prompt": "Camera slowly pushes in, leaves drift across frame",
    "first_frame_url": "https://uploads.aiport.id/uploads/.../start.jpg",
    "resolution": "720p",
    "duration": 5
  }'
```

**Response** — `202`

```json
{ "job_id": "job_01HX...", "status": "pending" }
```

**Errors**

- `400` — Missing first_frame_url, invalid duration/resolution.
- `401` unauthorized — Missing or invalid API key.
- `402` insufficient_balance — IDR balance cannot cover the request.

### POST `/v1/video/multimodal`

Multimodal video generation: combine up to 9 reference images, 3 reference videos, and 3 reference audio tracks with a text prompt. Billing is per generated second at the rate for the output resolution, with a minimum number of billable seconds (see `/pricing`).

**Auth:** `Authorization: Bearer <api_key>`

**Request** — `Content-Type: application/json`

- `prompt` *(string)* (**required**) — Scene / motion description.
- `model` *(string)* (default `seedance-2.0-fast-multimodal`) — Multimodal video model id.
- `images` *(string[])* — Up to 9 reference image URLs (max 30 MB each).
- `videos` *(string[])* — Up to 3 reference video URLs (max 50 MB each).
- `audios` *(string[])* — Up to 3 reference audio URLs (max 50 MB each).
- `resolution` *(string)* — Output resolution.
  - Allowed: `480p`, `720p`, `1080p`, `2k`, `4k`
- `duration` *(integer)* — Requested length in seconds.
  - Allowed: `4`, `5`, `6`, `7`, `8`, `9`, `10`, `11`, `12`, `13`, `14`, `15`
- `conversion_slots` *(string[])* — Per-asset transcoding hints, e.g. ['image1', 'video2']. Default: ['all'].
  - Allowed: `all`, `image1`, `image2`, `image3`, `image4`, `image5`, `image6`, `image7`, `image8`, `image9`, `video1`, `video2`, `video3`

**Example**

```bash
curl -X POST https://aiport.id/v1/video/multimodal \
  -H "Authorization: Bearer $AIPORT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance-2.0-fast-multimodal",
    "prompt": "Subject from image1 walking through scene from video1",
    "images": ["https://uploads.aiport.id/uploads/.../subject.jpg"],
    "videos": ["https://uploads.aiport.id/uploads/.../scene.mp4"],
    "resolution": "720p",
    "duration": 6
  }'
```

**Response** — `202`

```json
{ "job_id": "job_01HX...", "status": "pending" }
```

> When any reference asset is supplied, billing uses the 'with reference' tier and a per-duration min-billable floor — see /pricing for the exact table.
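
A short sketch of checking the reference caps before submitting, so a request with too many assets fails locally instead of with `400`. `check_references` is a hypothetical helper; its return value indicates whether any reference was supplied, which per the note above is what selects the 'with reference' billing tier. It does not compute a price, since rates live on `/pricing`.

```python
LIMITS = {"images": 9, "videos": 3, "audios": 3}


def check_references(images=(), videos=(), audios=()):
    """Raise ValueError if a reference list exceeds its documented cap.

    Returns True when at least one reference asset is present.
    """
    given = {"images": list(images), "videos": list(videos), "audios": list(audios)}
    for kind, urls in given.items():
        if len(urls) > LIMITS[kind]:
            raise ValueError(f"{kind}: {len(urls)} given, limit is {LIMITS[kind]}")
    return any(given.values())
```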

**Errors**

- `400` — Too many reference assets, invalid URL, or invalid resolution/duration.
- `401` unauthorized — Missing or invalid API key.
- `402` insufficient_balance — IDR balance cannot cover the request.

## Polling jobs

Async modalities (image, img2img, video) return a job_id. Poll for status here.

### GET `/v1/jobs/{job_id}`

Poll an async job (image, img2img, or video). Returns the current status and — when done — the result URL.

**Auth:** `Authorization: Bearer <api_key>`

**Request** — no body; the only input is a path parameter

- `job_id` *(string)* (**required**) — Path parameter. The job_id returned from a generate call.

**Example**

```bash
curl https://aiport.id/v1/jobs/job_01HX... \
  -H "Authorization: Bearer $AIPORT_API_KEY"
```

**Response** — `200`

```json
{
  "job_id": "job_01HX...",
  "status": "done",
  "result_url": "https://uploads.aiport.id/results/...",
  "error_message": null,
  "created_at": "2026-05-23T08:00:00Z"
}
```

> status is one of: 'pending', 'processing', 'done', 'failed'. On 'failed', check error_message for details (e.g. 'Provider failed. Please try again.' or 'Job timed out after N minutes'). Credits are auto-refunded on failure.
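
For typed client code, the response above can be mapped onto a small record. This is a sketch: the `Job` dataclass and `parse_job` are hypothetical names, and fields other than `job_id` and `status` are treated as optional because the initial `202` responses in this document carry only those two.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Job:
    job_id: str
    status: str
    result_url: Optional[str]
    error_message: Optional[str]
    created_at: Optional[datetime]

    @property
    def finished(self):
        # 'done' and 'failed' are the two terminal statuses.
        return self.status in ("done", "failed")


def parse_job(payload):
    """Convert a parsed /v1/jobs/{job_id} body into a Job."""
    created = payload.get("created_at")
    return Job(
        job_id=payload["job_id"],
        status=payload["status"],
        result_url=payload.get("result_url"),
        error_message=payload.get("error_message"),
        # Older Python versions reject a trailing "Z", so normalize it first.
        created_at=datetime.fromisoformat(created.replace("Z", "+00:00"))
        if created else None,
    )
```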

**Errors**

- `401` unauthorized — Missing or invalid API key.
- `404` — Job does not exist or does not belong to this API key's owner.

## Errors

All errors return an OpenAI-style JSON body of the form `{ "error": { "message": "...", "type": "...", "param": null, "code": "..." } }`, together with one of the HTTP statuses below.

| Status | Meaning |
| --- | --- |
| 400 | Invalid request: missing or malformed parameters. |
| 401 | Missing or invalid `Authorization` header. |
| 402 | Insufficient IDR balance on async modalities (image, img2img, video) — top up at `/dashboard/topup`. |
| 404 | Resource not found (unknown model id, unknown job_id, or job belongs to another user). |
| 413 | Upload exceeded the per-modality size limit. |
| 429 | `insufficient_quota`: IDR balance cannot cover an OpenAI-compatible text request — top up. Also returned when rate-limited — back off and retry. |
| 5xx | Upstream provider failure. Credits for failed async jobs and non-streamed text calls are auto-refunded. |
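
One way a client might act on this table, sketched in Python. `classify_error` and its action labels (`top_up`, `retry`, `check_key`, `fix_request`) are hypothetical. Because `429` covers both an empty balance and rate limiting, the sketch looks for `insufficient_quota` in the error body; checking both `type` and `code` is an assumption, since this document does not say which field carries it.

```python
import json


def classify_error(status, body_text):
    """Map an HTTP status plus error body to (action, message)."""
    try:
        err = json.loads(body_text).get("error") or {}
    except (ValueError, AttributeError):
        err = {}  # body was not a JSON object
    if not isinstance(err, dict):
        err = {}
    message = err.get("message") or body_text
    # Assumption: insufficient_quota may appear in either field.
    out_of_balance = "insufficient_quota" in (err.get("type"), err.get("code"))
    if status == 402 or out_of_balance:
        action = "top_up"
    elif status == 429 or status >= 500:
        action = "retry"  # back off before trying again
    elif status == 401:
        action = "check_key"
    else:
        action = "fix_request"
    return action, message
```

Retrying only on the `retry` action avoids hammering the API with requests that can never succeed, such as a malformed body or an empty balance.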

## Available models

The list below reflects the current production catalog. See `/pricing` for per-tier IDR rates.

### Text

| Model id | Display name |
| --- | --- |
| `deepseek-v4-flash` | DeepSeek V4 Flash |
| `gpt-5.4-mini` | GPT 5.4 mini |
| `gpt-5.4` | GPT 5.4 |
| `gpt-5.5` | GPT 5.5 |
| `gemini-3.1-flash-lite` | Gemini 3.1 Flash Lite |
| `claude-opus-4-8` | Claude Opus 4.8 |
| `deepseek-v4-pro` | DeepSeek V4 Pro |
| `claude-sonnet-4-6` | Claude Sonnet 4.6 |
| `claude-haiku-4-5` | Claude Haiku 4.5 |
| `gemini-3.5-flash` | Gemini 3.5 Flash |
| `minimax-m3` | MiniMax M3 |
| `glm-5.2` | GLM 5.2 |
| `glm-5.2-fast` | GLM 5.2 Fast |
| `kimi-k2.6` | Kimi K2.6 |
| `kimi-k2.6-fast` | Kimi K2.6 Fast |
| `kimi-k2.7-code` | Kimi K2.7 Code |

### Image

| Model id | Display name |
| --- | --- |
| `gpt-image-2` | GPT Image 2 |
| `nano-banana2-flash` | Nano Banana 2 Flash |
| `grok-image` | Grok Image |
| `z-image-turbo` | Z-Image Turbo |
| `wan-2.7-text2img` | WAN 2.7 |

### Image-to-image

| Model id | Display name |
| --- | --- |
| `gpt-image-2-img2img` | GPT Image 2 |
| `nano-banana2-img2img` | Nano Banana 2 |
| `grok-image-img2img` | Grok Image |
| `z-image-turbo-img2img` | Z-Image Turbo |
| `wan-2.7-img2img` | WAN 2.7 |

### Video

| Model id | Display name |
| --- | --- |
| `seedance-2.0-fast` | Seedance 2.0 Fast |
| `seedance-2.0-fast-img2video` | Seedance 2.0 Fast (img2video) |
| `seedance-2.0-fast-multimodal` | Seedance 2.0 Fast (multimodal) |
