POST /inference/batch
Batch Inference
curl --request POST \
  --url https://inference-service-433968519479.us-central1.run.app/inference/batch \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "hf_token": "<string>",
  "model_source": "<string>",
  "model_type": "adapter",
  "base_model_id": "<string>",
  "messages": [
    [
      {}
    ]
  ],
  "use_vllm": true
}'
{
  "results": [
    "<string>"
  ]
}

Authorizations

Authorization
string · header · required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

hf_token
string · required

model_source
string · required

model_type
enum<string> · required
Available options: adapter, merged, base

base_model_id
string · required

messages
object[][] · required

use_vllm
boolean | null · default: false
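The body fields above can be assembled into a request payload with a short Python sketch. Note the schema only declares `messages` as `object[][]` (a list of conversations, each a list of message objects); the `role`/`content` message shape used in the example below is an assumption borrowed from common chat APIs, not something this schema specifies:

```python
import json

def build_batch_payload(hf_token, model_source, base_model_id,
                        conversations, model_type="adapter", use_vllm=False):
    """Assemble the JSON body for POST /inference/batch.

    `conversations` is a list of chat histories; each history is a list of
    message objects (shape assumed, since the schema only says object[][]).
    """
    allowed = {"adapter", "merged", "base"}
    if model_type not in allowed:
        raise ValueError(f"model_type must be one of {allowed}")
    return {
        "hf_token": hf_token,
        "model_source": model_source,
        "model_type": model_type,
        "base_model_id": base_model_id,
        "messages": conversations,
        "use_vllm": use_vllm,
    }

# Placeholder values throughout -- substitute your own token and model IDs.
payload = build_batch_payload(
    "<hf-token>",
    "<model-source>",
    "<base-model-id>",
    [[{"role": "user", "content": "Hello"}]],  # one conversation, one message
)
body = json.dumps(payload)  # this string becomes the curl --data argument
```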

Response

Successful Response

results
string[] · required
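A successful response is a JSON object whose `results` array holds one generated string per submitted conversation (index alignment with the input is implied by the batch semantics, though the schema does not state it). A minimal, defensive parsing helper might look like this sketch:

```python
def parse_batch_response(body):
    """Extract the generated strings from a batch-inference response dict.

    Raises ValueError if the body does not match the documented schema,
    i.e. {"results": ["<string>", ...]}.
    """
    results = body.get("results") if isinstance(body, dict) else None
    if not isinstance(results, list):
        raise ValueError("response body missing required 'results' array")
    if not all(isinstance(item, str) for item in results):
        raise ValueError("'results' must contain only strings")
    return results
```

Pairing this with whatever HTTP client you use (the curl call above, `requests`, `urllib`) keeps schema validation in one place.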