Video transcription

Transcribe and extract data from video files

Upload video files — recorded meetings, presentations, tutorials, webinars — and get structured transcriptions with speaker identification, visual context, and schema-based extraction.

How it works

1. Send your MP4 / MOV file

Upload via the API or pass a URL. The API auto-detects the format.

2. Define your schema

Describe the fields you want as a JSON schema. The API maps the video's content to your structure.

3. Get structured JSON

Receive typed data with confidence scores and citations back to the source video.

Example request

curl -X POST https://dev.thedrive.ai/api/v1/extract \
  -H "X-API-Key: your_key" \
  -F "file=@meeting.mp4" \
  -F 'schema={"transcript": "string", "speakers": ["string"], "topics": ["string"], "key_moments": [{"time": "string", "description": "string"}]}'
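For longer schemas, it may be easier to keep the JSON in a file than to inline it in the command. The sketch below writes the same schema to schema.json and wraps the request in a helper. The helper name extract_video and the environment variable THEDRIVE_API_KEY are hypothetical names chosen for illustration, and the sketch assumes the endpoint accepts the schema with line breaks and indentation.

```shell
# Write the schema from the example request to a file (same fields, reformatted).
cat > schema.json <<'EOF'
{
  "transcript": "string",
  "speakers": ["string"],
  "topics": ["string"],
  "key_moments": [{"time": "string", "description": "string"}]
}
EOF

# extract_video is a hypothetical helper, and THEDRIVE_API_KEY is an assumed
# variable name; neither comes from the API itself.
extract_video() {
  # "file=@$1" uploads the file at the path given as the first argument.
  # "schema=<schema.json" makes curl send the file's contents as the field value.
  curl -X POST https://dev.thedrive.ai/api/v1/extract \
    -H "X-API-Key: ${THEDRIVE_API_KEY}" \
    -F "file=@$1" \
    -F "schema=<schema.json"
}
```

With the key exported in the environment, a call would look like `extract_video meeting.mp4`.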

MP4 / MOV processing features

Multi-format support

Supports MP4, MOV, WebM, AVI, MKV, WMV, FLV, 3GP, and M4V.

Audio track extraction

Extracts and transcribes the audio track from any supported video format.

Speaker diarization

Identifies speakers and labels each segment of dialogue.

Schema-based extraction

Extract topics, decisions, action items, or any custom fields from the video content.

Long video support

Handles hour-long videos. Billed at 1 credit per minute of audio.
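As a rough guide to cost, the per-minute rate can be turned into a quick estimate. The sketch below takes a duration in seconds and prints the credits it would use at 1 credit per minute of audio. It assumes partial minutes are rounded up, which the pricing text does not state, and estimate_credits is a hypothetical helper name.

```shell
# Estimate credits at 1 credit per minute of audio.
# Assumption: a partial minute is billed as a full minute (rounded up).
estimate_credits() {
  # $1 is the audio duration in seconds; adding 59 before dividing rounds up.
  echo $(( ($1 + 59) / 60 ))
}

estimate_credits 3600   # a one-hour recording: prints 60
estimate_credits 95     # 1 min 35 s: prints 2
```

Under the same assumption, a one-hour recording uses 60 credits.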

Timestamped output

Each transcript segment includes precise timestamps for reference.

Start extracting from MP4 / MOV files

Free tier includes 100 credits/month. No credit card required.