POST /inference/batch
Batch Inference
curl --request POST \
  --url https://inference-service-433968519479.us-central1.run.app/inference/batch \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "hf_token": "<string>",
  "model_source": "<string>",
  "model_type": "adapter",
  "base_model_id": "<string>",
  "messages": [
    [
      {}
    ]
  ],
  "use_vllm": true
}'
{
  "results": [
    "<string>"
  ]
}

Authorizations

Authorization
string · header · required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

hf_token
string · required

model_source
string · required

model_type
enum<string> · required
Available options: adapter, merged, base

base_model_id
string · required

messages
object[][] · required

use_vllm
boolean | null · default: false
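The body fields above can be assembled into a request payload with a short Python sketch. Note the schema only declares `messages` as `object[][]` (a list of conversations, each a list of message objects); the `role`/`content` message shape used in the example below is an assumption borrowed from common chat APIs, not something this schema specifies:

```python
import json

def build_batch_payload(hf_token, model_source, base_model_id,
                        conversations, model_type="adapter", use_vllm=False):
    """Assemble the JSON body for POST /inference/batch.

    `conversations` is a list of chat histories; each history is a list of
    message objects (shape assumed, since the schema only says object[][]).
    """
    allowed = {"adapter", "merged", "base"}
    if model_type not in allowed:
        raise ValueError(f"model_type must be one of {allowed}")
    return {
        "hf_token": hf_token,
        "model_source": model_source,
        "model_type": model_type,
        "base_model_id": base_model_id,
        "messages": conversations,
        "use_vllm": use_vllm,
    }

# Placeholder values throughout -- substitute your own token and model IDs.
payload = build_batch_payload(
    "<hf-token>",
    "<model-source>",
    "<base-model-id>",
    [[{"role": "user", "content": "Hello"}]],  # one conversation, one message
)
body = json.dumps(payload)  # this string becomes the curl --data argument
```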

Response

Successful Response

results
string[] · required
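A successful response is a JSON object whose `results` array holds one generated string per submitted conversation (index alignment with the input is implied by the batch semantics, though the schema does not state it). A minimal, defensive parsing helper might look like this sketch:

```python
def parse_batch_response(body):
    """Extract the generated strings from a batch-inference response dict.

    Raises ValueError if the body does not match the documented schema,
    i.e. {"results": ["<string>", ...]}.
    """
    results = body.get("results") if isinstance(body, dict) else None
    if not isinstance(results, list):
        raise ValueError("response body missing required 'results' array")
    if not all(isinstance(item, str) for item in results):
        raise ValueError("'results' must contain only strings")
    return results
```

Pairing this with whatever HTTP client you use (the curl call above, `requests`, `urllib`) keeps schema validation in one place.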