End-to-end guide for files, PDFs, and multimodal context on chat requests.

Typical flow (OpenAI-compatible)

1. POST /files → file_id
2. POST /chat/completions with input_file + file_id in messages
3. (optional) GET /files/:id/content
4. DELETE /files/:id

Reference: Files, Create chat completion.
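
The four steps above can be sketched in Python as follows. This is a minimal illustration, not the gateway's client: `send` is a hypothetical transport callable (method, path, payload) that you would back with your own HTTP library, and the response fields `id` and `choices[0].message.content` are assumed from the usual OpenAI-compatible shapes rather than confirmed by this guide.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: send(method, path, payload) -> parsed JSON response.
Send = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def summarize_upload(send: Send, filename: str, data: bytes, model: str) -> str:
    """Upload a file, ask the model about it, then delete the file."""
    # Step 1: upload. Assumes the response exposes the new file's id as "id".
    uploaded = send("POST", "/files", {"file": (filename, data), "purpose": "assistants"})
    file_id = uploaded["id"]
    try:
        # Step 2: reference the uploaded file from an input_file content part.
        completion = send("POST", "/chat/completions", {
            "model": model,
            "messages": [{
                "role": "user",
                "content": [
                    {"type": "text", "text": "Summarize the attachment"},
                    {"type": "input_file", "file_id": file_id},
                ],
            }],
        })
        # Assumes the usual OpenAI-style response shape.
        return completion["choices"][0]["message"]["content"]
    finally:
        # Step 4: delete the file even if the chat request failed.
        send("DELETE", f"/files/{file_id}", {})
```

Putting the delete in a `finally` block is a design choice of this sketch: it keeps a failed chat request from leaving an orphaned upload behind.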

Referencing files in messages

When message content is an array, reference a file with an input_file part. Each part uses exactly one of these modes:

Mode      Fields
Uploaded  file_id
URL       file_url (http/https)
Inline    file_data + filename

{
  "role": "user",
  "content": [
    { "type": "text", "text": "Summarize the attachment" },
    { "type": "input_file", "file_id": "file-xxxxxxxx" }
  ]
}

Mixing modes in a single part returns HTTP 400.
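
To avoid that 400 before the request leaves the client, the one-mode rule can be checked locally. The helper below is a hypothetical sketch (`input_file_part` is not part of any SDK). It treats `file_data` as an opaque string because this guide does not state the expected encoding.

```python
from typing import Dict, Optional


def input_file_part(
    file_id: Optional[str] = None,
    file_url: Optional[str] = None,
    file_data: Optional[str] = None,
    filename: Optional[str] = None,
) -> Dict[str, str]:
    """Build an input_file content part that uses exactly one mode."""
    modes = {
        "uploaded": file_id is not None,
        "url": file_url is not None,
        "inline": file_data is not None or filename is not None,
    }
    chosen = [name for name, used in modes.items() if used]
    if len(chosen) != 1:
        raise ValueError(f"expected exactly one mode, got: {chosen or 'none'}")

    part = {"type": "input_file"}
    if chosen[0] == "uploaded":
        part["file_id"] = file_id
    elif chosen[0] == "url":
        # The URL mode accepts http/https only.
        if not file_url.startswith(("http://", "https://")):
            raise ValueError("file_url must be an http or https URL")
        part["file_url"] = file_url
    else:
        # Inline mode needs both fields together.
        if file_data is None or filename is None:
            raise ValueError("inline mode requires both file_data and filename")
        part["file_data"] = file_data
        part["filename"] = filename
    return part
```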

Upload limits

  • The default maximum size is about 32 MB per file; the actual limit is deployment-specific.
  • The multipart field name is file; purpose defaults to assistants.
  • Files are scoped to the API key that uploaded them.
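
A client-side size check can catch oversized uploads early. The sketch below is illustrative only: `upload_request_parts` is a hypothetical helper, and reading "~32 MB" as 32 MiB is an assumption, so substitute your deployment's real limit.

```python
from typing import Any, Dict

# Assumption: "~32 MB" is read as 32 MiB; check your deployment's real limit.
DEFAULT_MAX_UPLOAD_BYTES = 32 * 1024 * 1024


def upload_request_parts(
    filename: str,
    data: bytes,
    purpose: str = "assistants",
    max_bytes: int = DEFAULT_MAX_UPLOAD_BYTES,
) -> Dict[str, Any]:
    """Validate size and return the multipart pieces for POST /files."""
    if len(data) > max_bytes:
        raise ValueError(f"{filename} is {len(data)} bytes; limit is {max_bytes}")
    return {
        # The multipart field carrying the file must be named "file".
        "files": {"file": (filename, data)},
        # Sent explicitly here, although purpose defaults to assistants.
        "data": {"purpose": purpose},
    }
```

If you use the `requests` library, the two keys line up with the `files=` and `data=` keyword arguments of `requests.post`.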

PDF preprocess (gateway)

Set pdf_preprocess on chat requests:

{
  "pdf_preprocess": {
    "engine": "ocr",
    "max_pages": 20
  }
}

engine    Behavior
native    Forward unchanged
ocr       OCR for upstream
markdown  Extract Markdown

pdf_preprocess is also accepted on Anthropic Messages requests (see Create message).
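
As a rough sketch, the option can be attached to a request body with a small helper that rejects unknown engines locally. `with_pdf_preprocess` is a hypothetical name, and the requirement that `max_pages` be at least 1 is an assumption of this example, not a documented rule.

```python
from typing import Any, Dict, Optional

# The three engine values listed in this guide.
PDF_ENGINES = {"native", "ocr", "markdown"}


def with_pdf_preprocess(
    body: Dict[str, Any], engine: str, max_pages: Optional[int] = None
) -> Dict[str, Any]:
    """Return a copy of the request body with pdf_preprocess set."""
    if engine not in PDF_ENGINES:
        raise ValueError(f"unknown engine {engine!r}; expected one of {sorted(PDF_ENGINES)}")
    # Assumption: a page cap below 1 is never meaningful.
    if max_pages is not None and max_pages < 1:
        raise ValueError("max_pages must be at least 1")

    options: Dict[str, Any] = {"engine": engine}
    if max_pages is not None:
        options["max_pages"] = max_pages
    # Build a new dict so the caller's body is left untouched.
    return {**body, "pdf_preprocess": options}
```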

Images

Use image_url with http/https URLs. When a request includes images, choose a model whose catalog entry lists vision in input_modalities.
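
A minimal sketch of both checks follows. It rests on two assumptions that this guide does not spell out: the image part uses the OpenAI-style nested shape `{"image_url": {"url": ...}}`, and a catalog entry is a dict whose `input_modalities` list contains the literal string `vision`. Both helper names are hypothetical.

```python
from typing import Any, Dict


def supports_vision(model_entry: Dict[str, Any]) -> bool:
    """True if the catalog entry lists vision among its input modalities."""
    return "vision" in model_entry.get("input_modalities", [])


def image_part(url: str) -> Dict[str, Any]:
    """Build an image_url content part (assumed OpenAI-style shape)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("image URL must use http or https")
    return {"type": "image_url", "image_url": {"url": url}}
```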

Cleanup

  • Delete stale files on a schedule (DELETE /files/:id).
  • Avoid sensitive data in uploads; do not log file bodies.
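
A scheduled cleanup job could pick its targets with a helper like the one below. This is a sketch under assumptions: each file record is taken to carry `id` and a Unix-timestamp `created_at` (as in OpenAI-style file objects), which this guide does not confirm, and `stale_file_ids` is a hypothetical name.

```python
from typing import Any, Dict, Iterable, List


def stale_file_ids(
    files: Iterable[Dict[str, Any]], now: float, max_age_seconds: float
) -> List[str]:
    """Return ids of files created more than max_age_seconds before now."""
    cutoff = now - max_age_seconds
    # Assumes created_at is a Unix timestamp in seconds.
    return [f["id"] for f in files if f["created_at"] < cutoff]
```

Each returned id would then be passed to DELETE /files/:id. Logging only the ids keeps the job in line with the advice not to log file bodies.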

Related