Document extraction
Extract structured data from Word documents
Upload Word documents and get structured data back. Define the fields you need, such as parties, dates, clauses, and terms, and the API extracts them with confidence scores and citations back to the source text.
How it works
Send your DOCX file
Upload via the API or pass a URL. The API auto-detects the format.
Define your schema
Describe the fields you want as a JSON schema. The API maps the document's content onto that structure.
Get structured JSON
Receive typed data with confidence scores and citations back to the source document.
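To make steps two and three concrete, here is a minimal Python sketch. The schema matches the one in the example request. The response layout is an assumption made purely for illustration: this page does not document it, so the per-field keys `value`, `confidence`, and `citation`, the helper `fields_below`, and the 0.8 threshold are all hypothetical. Check the API reference for the real shape.

```python
import json

# Same schema as the example request: each key is a field name,
# each value is the type you expect back.
schema = {"parties": ["string"], "effective_date": "string", "terms": "string"}

# The schema travels as a JSON string in the `schema` form field.
schema_form_value = json.dumps(schema)


def fields_below(result, threshold=0.8):
    """Return the names of fields whose confidence is under the threshold.

    ASSUMPTION: each field in the result is an object with a numeric
    `confidence` key. The real response layout may differ.
    """
    return [name for name, field in result.items() if field["confidence"] < threshold]


# Placeholder result in the ASSUMED shape. This is not real API output.
sample_result = {
    "effective_date": {"value": "<date>", "confidence": 0.95, "citation": "<source span>"},
    "terms": {"value": "<text>", "confidence": 0.60, "citation": "<source span>"},
}

print(fields_below(sample_result))  # ['terms']
```

Routing low-confidence fields to manual review is one common way to use the scores. The right threshold depends on your documents and on how costly an extraction error is.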
Example request
curl -X POST https://dev.thedrive.ai/api/v1/extract \
-H "X-API-Key: your_key" \
-F "file=@document.docx" \
-F 'schema={"parties": ["string"], "effective_date": "string", "terms": "string"}'
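The same request can be sketched in Python using only the standard library. The endpoint, the `X-API-Key` header, and the `file` and `schema` form fields come from the curl example above. The function name `build_extract_request` and the `application/octet-stream` part type are illustrative choices, not part of the API. The sketch builds the request without sending it, so you can inspect it first.

```python
import json
import urllib.request
import uuid

# Endpoint taken from the curl example.
API_URL = "https://dev.thedrive.ai/api/v1/extract"


def build_extract_request(api_key, filename, file_bytes, schema):
    """Build (but do not send) a multipart/form-data POST like the curl example."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode()
    parts = [
        # Part 1: the document, sent under the form field name "file".
        delimiter,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        # Part 2: the schema, sent as a JSON string under "schema".
        delimiter,
        b'Content-Disposition: form-data; name="schema"',
        b"",
        json.dumps(schema).encode(),
        # Closing delimiter, followed by a final line break.
        delimiter + b"--",
        b"",
    ]
    body = b"\r\n".join(parts)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "X-API-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Usage (placeholders, as in the curl example):
# with open("document.docx", "rb") as f:
#     req = build_extract_request("your_key", "document.docx", f.read(),
#                                 {"parties": ["string"], "effective_date": "string", "terms": "string"})
# with urllib.request.urlopen(req) as resp:
#     data = json.load(resp)  # assumes the response body is JSON
```

If you already depend on an HTTP client library, its multipart support will likely be shorter than building the body by hand. The manual version is shown here only to keep the sketch dependency-free.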
DOCX processing features
DOCX and DOC support
Handles both modern .docx and legacy .doc files through the same endpoint.
Heading-aware extraction
Recognizes document structure, including sections, headings, numbered lists, and nested content.
Table extraction
Tables in Word documents are parsed with row/column structure preserved.
Start extracting from DOCX files
Free tier includes 100 credits/month. No credit card required.