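Reading usage and cost

Every successful response from Zeabur AI Hub carries two pieces of data you will usually care about: how many tokens the request used and how much it cost. This page explains where to read them, so you can feed them into your own dashboard, enforce a per-user budget cap, or quickly check prices during development.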

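Two ways to get the data

Source	Contents	Best suited for
Response header	Cost of this request (USD), cumulative key spend, call ID	Quick checks; no need to parse the body
`usage` object in the response body	Token counts, prompt cache details, reasoning tokens	Code built on the OpenAI SDK, which already reads this object

Both are present on every successful response, so you do not have to choose between them; use whichever fits the structure of your code.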

Reading from headers

Every response carries a set of headers prefixed with x-litellm-:

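Header	Description
`x-litellm-response-cost`	Cost of this request in USD (floating-point number)
`x-litellm-key-spend`	Cumulative spend of this API key so far, in USD
`x-litellm-call-id`	LiteLLM's internal trace identifier for this request; not the Request ID used for spend log lookups
`x-litellm-model-id`	ID of the internal model deployment
`x-litellm-model-group`	The public model name you sent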

curl

curl -i https://hnd1.aihub.zeabur.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_ZEABUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "你好"}],
    "max_tokens": 32
  }'

-i prints the response headers along with the body; the x-litellm-response-cost line is the cost.

Python

By default, the official OpenAI SDK does not expose headers to the caller. To read the cost from a header, make a plain HTTP request with httpx:

import httpx
 
r = httpx.post(
    "https://hnd1.aihub.zeabur.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_ZEABUR_API_KEY"},
    json={
        "model": "claude-sonnet-4-5",
        "messages": [{"role": "user", "content": "你好"}],
        "max_tokens": 32,
    },
    timeout=60.0,
)
r.raise_for_status()  # raise on HTTP errors before reading cost and usage
print("Cost (USD):", r.headers.get("x-litellm-response-cost"))
print("usage:", r.json()["usage"])
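
For a per-user budget cap, the header values need to become numbers before they can be compared with a limit. The following is a minimal sketch, not part of Zeabur's or LiteLLM's API: `CostInfo`, `parse_cost_headers`, and `within_budget` are hypothetical names, and it assumes the two cost headers hold plain decimal strings and may be absent.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class CostInfo:
    request_cost_usd: Optional[float]  # from x-litellm-response-cost
    key_spend_usd: Optional[float]     # from x-litellm-key-spend


def _to_float(value: Optional[str]) -> Optional[float]:
    """Return the header value as a float, or None if missing or malformed."""
    if value is None:
        return None
    try:
        return float(value)
    except ValueError:
        return None


def parse_cost_headers(headers: Mapping[str, str]) -> CostInfo:
    """Extract the two cost headers from a response's header mapping."""
    return CostInfo(
        request_cost_usd=_to_float(headers.get("x-litellm-response-cost")),
        key_spend_usd=_to_float(headers.get("x-litellm-key-spend")),
    )


def within_budget(spent_so_far_usd: float, info: CostInfo, limit_usd: float) -> bool:
    """Hypothetical per-user check: True if this request keeps the user within the limit."""
    cost = info.request_cost_usd or 0.0  # treat a missing cost as zero
    return spent_so_far_usd + cost <= limit_usd
```

With httpx, `parse_cost_headers(r.headers)` should work directly because `r.headers` is a case-insensitive mapping; a plain dict needs lowercase keys. Where the running per-user total is stored is up to your application.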

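Reading from the `usage` object in the body

The usage object in the response body is compatible with OpenAI's format. For Anthropic and OpenAI model families, it also carries the corresponding provider-specific fields: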

{
  "usage": {
    "prompt_tokens": 1024,
    "completion_tokens": 256,
    "total_tokens": 1280,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0
    },
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0
  }
}

Field	Applies to	Description
`prompt_tokens` / `completion_tokens` / `total_tokens`	All models	Standard OpenAI fields
`prompt_tokens_details.cached_tokens`	OpenAI family	Portion of `prompt_tokens` served from the prompt cache (already included in `prompt_tokens`)
`cache_creation_input_tokens`	Anthropic family	Tokens written to the Anthropic prompt cache by this request
`cache_read_input_tokens`	Anthropic family	Tokens read from the Anthropic prompt cache by this request
`completion_tokens_details.reasoning_tokens`	Reasoning models (e.g. `gpt-5`, `gpt-5-mini`)	Tokens consumed by internal reasoning

💡

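Anthropic and OpenAI count cache-hit tokens differently: Anthropic's cache_creation_input_tokens and cache_read_input_tokens are counted separately, outside prompt_tokens, while OpenAI's cached_tokens is a subset of prompt_tokens. The body returns each provider's numbers unchanged, so read the fields that correspond to the model you are calling.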
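
If a dashboard needs one comparable input-token figure across providers, the two conventions can be folded together. The sketch below is only an illustration of the accounting described in this note: `summarize_input_tokens` is a hypothetical helper, and it assumes that only one provider family's cache fields are non-zero in any given response.

```python
from typing import Any, Dict


def summarize_input_tokens(usage: Dict[str, Any]) -> Dict[str, int]:
    """Split input tokens into uncached, cache-read, and cache-write buckets."""
    details = usage.get("prompt_tokens_details") or {}
    prompt = usage.get("prompt_tokens") or 0

    # OpenAI family: cached_tokens is a subset of prompt_tokens.
    openai_cached = details.get("cached_tokens") or 0

    # Anthropic family: cache fields are counted outside prompt_tokens.
    anthropic_read = usage.get("cache_read_input_tokens") or 0
    anthropic_write = usage.get("cache_creation_input_tokens") or 0

    return {
        "uncached": prompt - openai_cached,
        "cache_read": openai_cached + anthropic_read,
        "cache_write": anthropic_write,
        "total_input": prompt + anthropic_read + anthropic_write,
    }
```

If a response ever populated both families' cache fields for the same tokens, this helper would count them twice, so check real responses from your models before relying on it.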

Streaming responses

When streaming, the usage object appears in the final chunk only if you explicitly ask for it. Add stream_options.include_usage: true to the request:

import json

import httpx

with httpx.stream(
    "POST",
    "https://hnd1.aihub.zeabur.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_ZEABUR_API_KEY"},
    json={
        "model": "claude-sonnet-4-5",
        "messages": [{"role": "user", "content": "你好"}],
        "max_tokens": 32,
        "stream": True,
        "stream_options": {"include_usage": True},
    },
    timeout=60.0,
) as r:
    final_usage = None
    for line in r.iter_lines():
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        if chunk.get("usage"):
            final_usage = chunk["usage"]
        # ...your existing per-chunk handling...
    print("Final usage:", final_usage)

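Without stream_options.include_usage, no usage object appears anywhere in the streaming response.

Tracing a specific call

To look up a specific request after the fact (for example, to diagnose an abnormal response, reconcile costs, or correlate with your application logs), use the id (completion ID) from the response body: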

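from openai import OpenAI
# Assumes the SDK is pointed at the same /v1 endpoint used in the examples above.
client = OpenAI(base_url="https://hnd1.aihub.zeabur.ai/v1", api_key="YOUR_ZEABUR_API_KEY")
resp = client.chat.completions.create(...)
print("request id:", resp.id)
# chatcmpl-715764bf-675d-4d95-bbdc-80c037c8ce3f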

⚠️

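Always use response.id, not the x-litellm-call-id header. The header value is a trace identifier used by LiteLLM's internal routing; it is not written to the spend log, so a lookup by that value returns no results.

On the AI Hub page of the Zeabur Dashboard, expand the relevant daily record and click the info icon on the right of any row. The detail dialog shows the Request ID with a copy button; this value is identical to response.id.

Cache-hit responses carry a derived suffix

When LiteLLM serves a complete response directly from its response cache (the Dashboard labels the record as served entirely from cache, with cost = 0), the recorded ID is the original ID with a _cache_hit<timestamp> suffix appended: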

chatcmpl-2ec370dd-83a6-41b4-817c-1646c57034bc_cache_hit1779350469.4126954

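The original request and the cache-served response are recorded as two separate entries in the spend log and share the same UUID prefix. You can aggregate on this prefix to count how many times a given prompt was served from cache.

History

To view daily totals or browse the details of individual requests, the AI Hub page of the Zeabur Dashboard already provides a monthly overview and per-request spend log lookup. The headers and body fields listed on this page are for reading in real time from your code; for after-the-fact queries, go to the Dashboard.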
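
The prefix aggregation described above can be sketched in a few lines. This is an illustration only: `split_request_id` and `count_cache_hits` are hypothetical helpers, the input is assumed to be a list of Request IDs you have exported or logged yourself, and the timestamp part is kept as text because its exact format is not specified here.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

CACHE_HIT_MARKER = "_cache_hit"


def split_request_id(request_id: str) -> Tuple[str, Optional[str]]:
    """Return (base_id, timestamp_text); timestamp_text is None for non-cache entries."""
    base, marker, suffix = request_id.partition(CACHE_HIT_MARKER)
    if not marker:
        return request_id, None
    return base, suffix


def count_cache_hits(request_ids: Iterable[str]) -> Counter:
    """Count cache-served entries per original request ID."""
    hits: Counter = Counter()
    for request_id in request_ids:
        base, timestamp_text = split_request_id(request_id)
        if timestamp_text is not None:
            hits[base] += 1
    return hits
```

For example, passing the ID shown above to `split_request_id` separates `chatcmpl-2ec370dd-83a6-41b4-817c-1646c57034bc` from the `1779350469.4126954` suffix.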