Video transcription
Transcribe and extract data from video files
Upload video files — recorded meetings, presentations, tutorials, webinars — and get structured transcriptions with speaker identification, visual context, and schema-based extraction.
How it works
Send your MP4 / MOV file
Upload via the API or pass a URL. The API auto-detects the format.
Define your schema
Describe the fields you want as a JSON schema. The API maps the video's content to your structure.
Get structured JSON
Receive typed data with confidence scores and citations back to the source video.
Example request
curl -X POST https://dev.thedrive.ai/api/v1/extract \
-H "X-API-Key: your_key" \
-F "file=@meeting.mp4" \
-F 'schema={"transcript": "string", "speakers": ["string"], "topics": ["string"], "key_moments": [{"time": "string", "description": "string"}]}'
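The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the endpoint, the `X-API-Key` header, and the `file` and `schema` form fields come from the curl example above, while the helper names `encode_multipart` and `build_request` are hypothetical.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://dev.thedrive.ai/api/v1/extract"  # endpoint from the curl example

# Same schema as the curl example, written as a Python dict.
SCHEMA = {
    "transcript": "string",
    "speakers": ["string"],
    "topics": ["string"],
    "key_moments": [{"time": "string", "description": "string"}],
}


def encode_multipart(filename, file_bytes, schema):
    """Encode the video and schema as a multipart/form-data body.

    Hypothetical helper. Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    file_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    schema_part = (
        f"\r\n--{boundary}\r\n"
        'Content-Disposition: form-data; name="schema"\r\n\r\n'
        f"{json.dumps(schema)}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    return file_part + file_bytes + schema_part, f"multipart/form-data; boundary={boundary}"


def build_request(path, api_key, schema=SCHEMA):
    """Build (but do not send) the POST request for a local video file."""
    with open(path, "rb") as f:
        file_bytes = f.read()
    body, content_type = encode_multipart(os.path.basename(path), file_bytes, schema)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": content_type},
    )
```

To send it, pass the result to `urllib.request.urlopen`, for example `urllib.request.urlopen(build_request("meeting.mp4", "your_key"))`, and decode the response body as JSON. The sketch reads the whole file into memory, which is simple but may not suit very large videos.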
Video processing features
Multi-format support
MP4, MOV, WebM, AVI, MKV, WMV, FLV, 3GP, and M4V.
Audio track extraction
Extracts and transcribes the audio track from any video format.
Speaker diarization
Identifies speakers and labels each segment of dialogue.
Schema-based extraction
Extract topics, decisions, action items, or any custom fields from the video content.
Long video support
Handles hour-long videos. Billed at 1 credit per minute of audio.
Timestamped output
Each transcript segment includes precise timestamps for reference.
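Timestamps returned as strings usually need converting before they can be sorted or used to seek within a video. The sketch below assumes timestamps arrive as `SS`, `MM:SS`, or `HH:MM:SS` strings; the actual format is not specified here, so treat that as an assumption and adjust the parser to what the API returns. The `to_seconds` helper and the sample `key_moments` data are hypothetical.

```python
def to_seconds(stamp):
    """Convert "SS", "MM:SS" or "HH:MM:SS" to seconds (ASSUMED format)."""
    parts = stamp.split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported timestamp: {stamp!r}")
    seconds = 0.0
    for part in parts:
        # Each colon shifts the running total up by a factor of 60.
        seconds = seconds * 60 + float(part)
    return seconds


# Hypothetical data shaped like the key_moments field of the example schema.
key_moments = [
    {"time": "00:12:30", "description": "Budget review"},
    {"time": "02:05", "description": "Introductions"},
]

# Order the moments chronologically regardless of how each one is written.
ordered = sorted(key_moments, key=lambda moment: to_seconds(moment["time"]))
```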
Start extracting from MP4 / MOV files
Free tier includes 100 credits/month. No credit card required.
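As a rough guide to how far the free tier goes, the arithmetic can be sketched as follows. The rate (1 credit per minute of audio) and the allowance (100 credits/month) are the figures stated on this page; how partial minutes are rounded is not stated, so the sketch assumes they round up. The function names are hypothetical.

```python
import math

CREDITS_PER_MINUTE = 1   # rate stated on this page
FREE_TIER_CREDITS = 100  # monthly free allowance stated on this page


def credits_for(duration_seconds):
    """Credits for one video, ASSUMING partial minutes are rounded up."""
    return math.ceil(duration_seconds / 60) * CREDITS_PER_MINUTE


def free_tier_remaining(durations_seconds):
    """Free credits left after processing videos of the given lengths."""
    used = sum(credits_for(duration) for duration in durations_seconds)
    return FREE_TIER_CREDITS - used
```

Under these assumptions, a one-hour meeting costs 60 credits, so the free tier covers one such meeting plus roughly 40 more minutes of video each month.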