POST /v1/transcribe-file
Transcribe from file upload
curl --request POST \
  --url https://api.sky-scribe.com/v1/transcribe-file \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form file='@example-file' \
  --form 'language=<string>' \
  --form format=json \
  --form 'callback=<string>' \
  --form diarize=true \
  --form 'team_id=<string>' \
  --form 'folder_id=<string>' \
  --form 'output_config={
  "include_timestamps": true,
  "include_speakers": true
}'

{
  "status": "completed",
  "id": "<string>",
  "data": {
    "json": {
      "language": "<string>",
      "text": "<string>",
      "segments": [
        {
          "id": "<string>",
          "start": 123,
          "end": 123,
          "text": "<string>",
          "speaker": "<string>"
        }
      ]
    },
    "doc": {
      "url": "<string>"
    },
    "pdf": {
      "url": "<string>"
    },
    "txt": {
      "url": "<string>"
    },
    "markdown": "<string>",
    "srt": {
      "url": "<string>"
    },
    "vtt": {
      "url": "<string>"
    },
    "csv": {
      "url": "<string>"
    }
  }
}
Upload an audio or video file directly for transcription. Supports common formats including MP3, MP4, WAV, and more.
For URL-based transcription instead, see Transcribe from URL.

Supported Formats

  • Audio: MP3, WAV, M4A, FLAC, OGG
  • Video: MP4, MOV, AVI, MKV, WebM
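
The list above can be turned into a quick client-side check before uploading. The following is a minimal sketch: the helper name `media_kind` is hypothetical, and the mapping from each listed format to a single file extension is an assumption. Because the endpoint description says "and more", an unlisted extension is better treated as a warning than as a hard rejection.

```python
from pathlib import Path

# Extensions derived from the Supported Formats list. The exact extensions
# the API accepts are an assumption; the docs only name the formats.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def media_kind(filename):
    """Return "audio", "video", or None when the extension is not listed."""
    suffix = Path(filename).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None
```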

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
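
As an illustration of the header format, here is a small sketch that builds the Authorization header. The environment variable name `SKYSCRIBE_API_TOKEN` and the helper `auth_headers` are hypothetical; the documentation only specifies the `Bearer <token>` form.

```python
import os

# Hypothetical environment variable name; the docs do not define one.
TOKEN_ENV_VAR = "SKYSCRIBE_API_TOKEN"


def auth_headers(token=None):
    """Build the Authorization header in the documented Bearer <token> form."""
    token = (token or os.environ.get(TOKEN_ENV_VAR, "")).strip()
    if not token:
        raise ValueError("missing API token")
    return {"Authorization": f"Bearer {token}"}
```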

Body

multipart/form-data
file
file
required

Audio or video file to transcribe.

language
string

(Optional) Language hint (BCP-47 code). Defaults to 'auto'.

format
enum<string>[]

(Optional) Output formats to generate. Defaults to ["json"] if not provided.

Available options: doc, pdf, txt, markdown, srt, vtt, csv, json
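
A client may want to validate the requested formats before sending them. The sketch below mirrors the documented option list and the documented default of ["json"]; the function name `normalize_formats` is hypothetical, and dropping duplicates is a design choice of this example, not documented API behavior.

```python
# Option list copied from the format parameter description.
ALLOWED_FORMATS = ("doc", "pdf", "txt", "markdown", "srt", "vtt", "csv", "json")


def normalize_formats(formats=None):
    """Validate requested formats and apply the documented default."""
    if not formats:
        return ["json"]  # documented default when no format is provided
    unknown = [f for f in formats if f not in ALLOWED_FORMATS]
    if unknown:
        raise ValueError(f"unsupported format(s): {', '.join(unknown)}")
    # Drop duplicates while keeping the caller's order.
    return list(dict.fromkeys(formats))
```
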
callback
string<uri>

(Optional) Webhook URL for async processing. If provided, the job runs asynchronously and the results are sent to this URL when the job completes. Without a callback, the request is processed synchronously.

diarize
boolean

(Optional) Enable speaker diarization.

team_id
string

(Optional) Team identifier. If not provided, uses your default workspace.

folder_id
string

(Optional) Folder identifier. If not provided, the file is added to the root of the workspace.

output_config
object

(Optional) Output configuration settings, passed as a JSON object. The request example sets include_timestamps and include_speakers.
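
To show how the optional parameters might be assembled into multipart form fields, here is a hedged sketch. The helper `build_form_fields` is hypothetical. Two encodings are taken from the curl example (lowercase `true` for diarize, output_config as a JSON string in a single field); sending several formats by repeating the `format` field is an assumption, since the documentation does not say how the array is encoded.

```python
import json


def build_form_fields(language=None, formats=None, callback=None,
                      diarize=None, team_id=None, folder_id=None,
                      output_config=None):
    """Return the non-file form fields as a list of (name, value) pairs."""
    fields = []
    if language is not None:
        fields.append(("language", language))
    # Assumption: multiple formats are sent by repeating the `format` field.
    for fmt in formats or []:
        fields.append(("format", fmt))
    if callback is not None:
        fields.append(("callback", callback))
    if diarize is not None:
        # Lowercase literal, as in the curl example (diarize=true).
        fields.append(("diarize", "true" if diarize else "false"))
    if team_id is not None:
        fields.append(("team_id", team_id))
    if folder_id is not None:
        fields.append(("folder_id", folder_id))
    if output_config is not None:
        # The curl example passes output_config as one JSON-encoded field.
        fields.append(("output_config", json.dumps(output_config)))
    return fields
```

Omitted parameters are left out of the list entirely, so the server-side defaults described above (language 'auto', format ["json"], default workspace, root folder) apply.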

Response

Transcription response. Returns one of two response types:

Option 1 - Completed (status: "completed"): Returned when no callback URL is provided (synchronous processing). The response includes status, id, and data with all requested formats (json only if none were specified).

Option 2 - Pending (status: "pending"): Returned only when a callback URL is provided (asynchronous processing). The response includes status, id, and a poll_url for checking progress. Results are sent to the callback URL when the job completes.
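
Client code can branch on the status field to tell the two response types apart. The following is a minimal sketch that works on an already-parsed JSON body; the function name `interpret_response` and the shape of its return value are hypothetical, while the field names (status, id, data, poll_url) come from the descriptions above.

```python
def interpret_response(body):
    """Classify a parsed response body as completed or pending."""
    status = body.get("status")
    if status == "completed":
        # Synchronous result: the requested formats are under "data".
        return {"done": True, "id": body["id"], "data": body["data"]}
    if status == "pending":
        # Asynchronous job: results arrive at the callback URL later;
        # "poll_url" can be used to check progress in the meantime.
        return {"done": False, "id": body["id"], "poll_url": body["poll_url"]}
    raise ValueError(f"unexpected status: {status!r}")
```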

status
enum<string>
required

Available options: completed

id
string
required

data
object
required

Map of requested formats to their data. Only formats specified in the request will be present. If no formats were specified, defaults to json only.
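
In the example response, most formats (doc, pdf, txt, srt, vtt, csv) carry a url, while json and markdown are returned inline. A sketch of separating the two kinds follows. The helper name `split_outputs` is hypothetical, and relying on the presence of a "url" key is an assumption based only on that example response.

```python
def split_outputs(data):
    """Split the data map into download URLs and inline content."""
    urls, inline = {}, {}
    for name, value in data.items():
        if isinstance(value, dict) and "url" in value:
            urls[name] = value["url"]  # file-style output, fetched separately
        else:
            inline[name] = value  # e.g. the json object or markdown string
    return urls, inline
```

Only the formats named in the request appear in data, so callers should not expect any particular key in either dictionary.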