Why not just send a PDF to GPT or Claude directly?

Sending a PDF to an LLM works for demos. In production you need scanned document handling, 500-page navigation, tables that parse correctly, and math that doesn't hallucinate. Building that yourself means months of engineering: designing an agent architecture, prompt engineering across document types, benchmarking accuracy, building a code sandbox for deterministic math, implementing OCR with vision proofreading, handling 107+ file formats, and maintaining it all as models change. The Drive AI is the result of that engineering, shipped as one API call.

What is the difference between Extract and Analyze?

Extract pulls literal values (names, dates, amounts) from documents; it is the fast path when you just need fields. Analyze is the intelligence layer: it reads, computes, cross-checks, and verifies. It catches the $100 invoice discrepancy, calculates whether a contract is still active, and shows its math. Analyze can also cross-reference multiple documents, for example validating an invoice against a contract or reconciling data from different sources. Think of Extract as reading and Analyze as reasoning. Extract costs 1 credit per page; Analyze costs 2 credits per page.

How much does The Drive AI cost?

The Drive AI offers a free tier with 100 credits/month (no credit card required), a Pro tier at $0.01 per credit (pay as you go, 120 requests/min), and custom Enterprise pricing with SLA guarantees and 600 requests/min.

Can The Drive AI process websites and URLs?

Yes. Pass a URL instead of a file to the same endpoint. The API renders the page in a headless browser, runs JavaScript, then extracts data from the live DOM. Enable follow_links to automatically crawl subpages and fill gaps in your schema from multiple pages in one call.

What can I extract from websites using The Drive AI?

You can extract any structured data from websites — company info, product catalogs, pricing tiers, contact details, social links, tech stack, team pages, and more. The API renders JavaScript-heavy SPAs in a headless browser, so you get data from the live DOM, not raw HTML. Common use cases include competitor monitoring, lead enrichment, e-commerce scraping, and job board extraction.

File & URL intelligence API

Send a file. Get answers.
One API call.

Send any file or URL with a schema describing what you need. Get structured answers back — computed, verified, cited. Cross-reference multiple documents. 107+ formats.

Get API Key Read the Docs

Free tier included. No credit card required.

npm install @thedriveai/sdk
pip install thedriveai


// Send a file + schema. Get verified, computed answers back.
import { TheDriveAI } from "@thedriveai/sdk";

const client = new TheDriveAI({ apiKey: "tda_live_..." });

const result = await client.analyze({
  file: "invoice.pdf",
  schema: {
    math_checks_out: { type: "boolean", description: "Do line items add up to the stated total?" },
    discrepancy:     { type: "number", description: "Dollar difference, if any" },
    correct_total:   { type: "number", description: "Recalculated total from line items + tax" },
  },
});

// → result.data.math_checks_out    → false
// → result.reasoning.math_checks_out → "$3,200 + $1,550 + $890 = $5,640, not $5,740"
// → result.data.discrepancy         → 100
// → result.data.correct_total       → 6091.20

# Send a file + schema. Get verified, computed answers back.
from thedriveai import TheDriveAI

client = TheDriveAI(api_key="tda_live_...")

result = client.analyze(
    file="invoice.pdf",
    schema={
        "math_checks_out": {"type": "boolean", "description": "Do line items add up to the stated total?"},
        "discrepancy":     {"type": "number", "description": "Dollar difference, if any"},
        "correct_total":   {"type": "number", "description": "Recalculated total from line items + tax"},
    },
)

# → result.data["math_checks_out"]    → False
# → result.reasoning["math_checks_out"] → "$3,200 + $1,550 + $890 = $5,640, not $5,740"
# → result.data["discrepancy"]         → 100
# → result.data["correct_total"]       → 6091.20

# Send a file + schema. Get verified, computed answers back.
curl -X POST https://dev.thedrive.ai/api/v1/analyze \
  -H "X-API-Key: tda_live_..." \
  -F file=@invoice.pdf \
  -F 'schema={
    "math_checks_out": {"type": "boolean", "description": "Do line items add up to the stated total?"},
    "discrepancy":     {"type": "number", "description": "Dollar difference, if any"},
    "correct_total":   {"type": "number", "description": "Recalculated total from line items + tax"}
  }'

# → {"data": {"math_checks_out": false, "discrepancy": 100, "correct_total": 6091.20},
#    "reasoning": {"math_checks_out": "$3,200 + $1,550 + $890 = $5,640, not $5,740", ...},
#    "confidence": {"math_checks_out": 0.97, "discrepancy": 0.97, "correct_total": 0.99}}

// Pull literal values from any file or URL. Fast path extraction.
import { TheDriveAI } from "@thedriveai/sdk";

const client = new TheDriveAI({ apiKey: "tda_live_..." });

const result = await client.extract({
  file: "contract.pdf",
  schema: {
    parties:        { type: "array", description: "All parties" },
    effective_date: { type: "string", description: "Start date (ISO 8601)" },
    liability_cap:  { type: "number", description: "Maximum liability", required: true },
  },
});

// → result.data.parties        → ["Acme Corp", "Globex Inc"]
// → result.data.liability_cap  → 500000
// → result.confidence.parties  → 0.96
// → result.citations.liability_cap → "...not exceed $500,000"

# Pull literal values from any file or URL. Fast path extraction.
from thedriveai import TheDriveAI

client = TheDriveAI(api_key="tda_live_...")

result = client.extract(
    file="contract.pdf",
    schema={
        "parties":        {"type": "array", "description": "All parties"},
        "effective_date": {"type": "string", "description": "Start date (ISO 8601)"},
        "liability_cap":  {"type": "number", "description": "Maximum liability", "required": True},
    },
)

# → result.data["parties"]        → ["Acme Corp", "Globex Inc"]
# → result.data["liability_cap"]  → 500000
# → result.confidence["parties"]  → 0.96
# → result.citations["liability_cap"] → "...not exceed $500,000"

# Pull literal values from any file or URL. Fast path extraction.
curl -X POST https://dev.thedrive.ai/api/v1/extract \
  -H "X-API-Key: tda_live_..." \
  -F file=@contract.pdf \
  -F 'schema={
    "parties":        {"type": "array", "description": "All parties"},
    "effective_date": {"type": "string", "description": "Start date (ISO 8601)"},
    "liability_cap":  {"type": "number", "description": "Maximum liability", "required": true}
  }'

# → {"data": {"parties": ["Acme Corp", "Globex Inc"],
#            "effective_date": "2025-01-15",
#            "liability_cap": 500000},
#    "confidence": {"parties": 0.96, "liability_cap": 0.93},
#    "citations": {"liability_cap": "...not exceed $500,000"}}

107+

file formats

1000+

page documents

Files + URLs

same endpoint

<3s

average response

Two modes, one schema

Extract data. Or compute answers. Same schema.

Analyze reads documents, computes answers, cross-checks numbers, and shows its reasoning. It can also cross-reference multiple documents in one call. Extract is the fast path — pull literal values when you just need fields. Same schema format for both.

Try it in the playground →

Analyze

POST /api/v1/analyze

The full reasoning pipeline — document navigation, sandboxed computation, cross-referencing, vision for charts and scans — in one API call. Define what you need. Get computed, verified answers back.

Computed answers — totals, growth rates, cross-checks are deterministic

Full reasoning trace — see exactly how each answer was derived

Cross-reference multiple documents — validate invoices against contracts

Handles 1000+ page documents and multi-page tables

Source citations for every answer — audit-ready output

schema: {
  "revenue_growth": {"type": "number", "description": "YoY growth rate"},
  "auto_renews": {"type": "boolean", "description": "Does this auto-renew?"},
  "line_items_match": {"type": "boolean", "description": "Do line items add up to total?"}
}
→ revenue_growth: -0.23 (computed from p.12 + p.47)
  auto_renews: true (Section 8.2, "shall automatically renew")
  line_items_match: false ($100 gap, computed)

Try Analyze →

Extract

POST /api/v1/extract

The fast path. Pulls literal values — names, dates, amounts, clauses — directly from the source. When you just need fields, not reasoning.

Typed fields — strings, numbers, booleans, arrays, enums

Required fields with partial result support

Confidence scores per field (high / medium / low)

Source citations — the exact text that was used

schema: {
  "vendor": {"type": "string", "description": "Company name"},
  "total": {"type": "number", "description": "Total amount due"},
  "status": {"type": "string", "enum": ["paid","unpaid"]}
}
→ vendor: "AWS", total: 971.73,
  status: "paid"
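A client can re-check a returned `data` object against the schema it sent. The sketch below is a local helper, not part of the SDK: `check_result` is a hypothetical name, and it assumes that a field the API could not fill comes back as `null`, and that `required` and `enum` behave as the schema examples on this page suggest. The `required` flag on `total` is added here only for illustration.

```python
def check_result(schema, data):
    """Return a list of human-readable problems found in `data` for `schema`."""
    problems = []
    for name, spec in schema.items():
        value = data.get(name)
        if value is None:
            # Assumption: an unfilled field is returned as null (None in Python).
            if spec.get("required"):
                problems.append(f"{name}: required field is missing")
            continue
        allowed = spec.get("enum")
        if allowed is not None and value not in allowed:
            problems.append(f"{name}: {value!r} not in {allowed}")
    return problems


schema = {
    "vendor": {"type": "string", "description": "Company name"},
    # "required" is added to this field for illustration only.
    "total": {"type": "number", "description": "Total amount due", "required": True},
    "status": {"type": "string", "enum": ["paid", "unpaid"]},
}

complete = {"vendor": "AWS", "total": 971.73, "status": "paid"}
partial = {"vendor": "AWS", "total": None, "status": "overdue"}

print(check_result(schema, complete))
print(check_result(schema, partial))
```

Running a check like this before writing results to a database keeps partial results from silently passing as complete ones.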

Try Extract →

See it work

Extract data. Verify it. Cross-reference it.

Same API call regardless of industry. Extract pulls literal values. Analyze computes, cross-checks, and catches what extraction misses.

Try it yourself →

Invoice verification

POST /api/v1/analyze

Don't just read the total — verify it. The API sums line items, checks the tax calculation, and catches discrepancies that extraction alone would miss.

schema: {
  "math_checks_out": {
    "type": "boolean",
    "description": "Do line items add up to the stated total?"
  },
  "discrepancy": {
    "type": "number",
    "description": "Dollar difference between computed and stated total"
  },
  "tax_correct": {
    "type": "boolean",
    "description": "Is the tax calculated correctly at the stated rate?"
  }
}

Response

{
  "data": {
    "math_checks_out": false,
    "discrepancy": 100.00,
    "tax_correct": false
  },
  "confidence": {
    "math_checks_out": 0.99,
    "discrepancy": 1.0,
    "tax_correct": 0.98
  },
  "reasoning": {
    "math_checks_out": "$3,200 + $1,550 + $890 = $5,640. Stated subtotal: $5,740. Difference: $100.",
    "discrepancy": "Computed sum $5,640 vs stated $5,740.",
    "tax_correct": "Tax should be $451.20 (8% of $5,640), not $459.20. The $100 subtotal error cascades."
  }
}
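The arithmetic in this response can be reproduced by hand. The sketch below is not the API's implementation; it only redoes the same check locally with Python's `decimal` module, using the line items, stated subtotal, and 8% tax rate from the example. The function name `verify_invoice` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def verify_invoice(line_items, stated_subtotal, tax_rate):
    """Recompute subtotal, tax, and total from line items and compare with the stated subtotal."""
    subtotal = sum((Decimal(item) for item in line_items), Decimal("0"))
    stated = Decimal(stated_subtotal)
    # Tax is computed on the recalculated subtotal, rounded to whole cents.
    tax = (subtotal * Decimal(tax_rate)).quantize(CENT, rounding=ROUND_HALF_UP)
    return {
        "math_checks_out": subtotal == stated,
        "discrepancy": abs(stated - subtotal),
        "correct_tax": tax,
        "correct_total": (subtotal + tax).quantize(CENT),
    }


check = verify_invoice(["3200", "1550", "890"], "5740", "0.08")
print(check["math_checks_out"])  # False
print(check["discrepancy"])      # 100
print(check["correct_tax"])      # 451.20
print(check["correct_total"])    # 6091.20
```

Decimal arithmetic avoids the binary floating-point rounding that would otherwise creep into currency sums.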

Invoice vs. contract validation

POST /api/v1/analyze/cross

Upload an invoice and its source contract together. The API reads both documents, cross-references rates, vendor details, and payment terms — and catches mismatches that no single-document analysis would find.

files: [invoice.pdf, contract.pdf]

schema: {
  "rates_match": {
    "type": "boolean",
    "description": "Do hourly rates on the invoice match the contract?"
  },
  "correct_vendor": {
    "type": "boolean",
    "description": "Is the invoice vendor the contracting party?"
  },
  "within_budget": {
    "type": "boolean",
    "description": "Is the invoice total within the contract's budget cap?"
  },
  "overbilled_amount": {
    "type": "number",
    "description": "Amount overbilled based on contract rates, if any"
  }
}

Response

{
  "data": {
    "rates_match": false,
    "correct_vendor": true,
    "within_budget": false,
    "overbilled_amount": 4500.00
  },
  "confidence": {
    "rates_match": 0.98, "correct_vendor": 0.99,
    "within_budget": 0.97, "overbilled_amount": 0.96
  },
  "reasoning": {
    "rates_match": "Invoice bills Senior Engineer at $175/hr. Contract (Section 4.1) specifies $150/hr. $25/hr overcharge.",
    "correct_vendor": "Invoice from 'Apex Consulting LLC' matches contracting party in contract header.",
    "within_budget": "Invoice total: $47,250. Contract budget cap (Section 6.2): $40,000. Exceeds by $7,250.",
    "overbilled_amount": "180 hrs × $25/hr rate difference = $4,500 overbilled based on contracted rates."
  },
  "sources": {
    "rates_match": ["[invoice] Line 3",
      "[contract] Section 4.1"]
  }
}
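The two computed figures in this response follow from simple arithmetic on values taken from both documents. As a sketch only (the helper names `overbilling` and `budget_overrun` are hypothetical, and this is not how the API computes them internally), the same numbers can be checked locally:

```python
from decimal import Decimal


def overbilling(hours, invoiced_rate, contract_rate):
    """Amount billed above the contracted hourly rate; zero if the invoice rate is not higher."""
    rate_difference = Decimal(invoiced_rate) - Decimal(contract_rate)
    return max(rate_difference, Decimal("0")) * Decimal(hours)


def budget_overrun(invoice_total, budget_cap):
    """Amount by which the invoice exceeds the contract's budget cap; zero if within budget."""
    return max(Decimal(invoice_total) - Decimal(budget_cap), Decimal("0"))


overbilled = overbilling(hours="180", invoiced_rate="175", contract_rate="150")
overrun = budget_overrun(invoice_total="47250", budget_cap="40000")
print(overbilled)  # 4500
print(overrun)     # 7250
```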

Contract reasoning

POST /api/v1/analyze

Go beyond pulling clause text. The API reads the full contract, cross-references sections, computes deadlines from today's date, and identifies what matters.

schema: {
  "still_active": {
    "type": "boolean",
    "description": "Is this contract currently in effect?"
  },
  "days_until_expiry": {
    "type": "integer",
    "description": "Days remaining until contract expires"
  },
  "total_liability_exposure": {
    "type": "number",
    "description": "Maximum combined liability across all clauses"
  },
  "auto_renews": {
    "type": "boolean",
    "description": "Will this contract auto-renew?"
  }
}

Response

{
  "data": {
    "still_active": true,
    "days_until_expiry": 216,
    "total_liability_exposure": 750000,
    "auto_renews": true
  },
  "confidence": {
    "still_active": 0.97, "days_until_expiry": 0.95,
    "total_liability_exposure": 0.92, "auto_renews": 0.98
  },
  "reasoning": {
    "still_active": "Effective date Jan 15, 2025 with 24-month term. Expiry: Jan 15, 2027. Currently active.",
    "days_until_expiry": "Computed: Jan 15, 2027 minus today.",
    "total_liability_exposure": "Section 8.1 caps general liability at $500,000. Section 8.3 adds $250,000 for IP indemnification. Combined: $750,000.",
    "auto_renews": "Section 2.2: 'shall automatically renew for successive 12-month periods'"
  }
}
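The date reasoning above can be sketched with the standard library. This is an illustration, not the API's code: `add_months` and `contract_status` are hypothetical helpers, the reference date of June 13, 2026 is an assumption chosen because it reproduces the 216 days in the sample response, and treating the expiry date itself as no longer active is also an assumption.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`, clamping the day to the month's length."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def contract_status(effective, term_months, today):
    """Compute expiry, active status, and remaining days for a fixed-term contract."""
    expiry = add_months(effective, term_months)
    return {
        "expiry": expiry,
        # Assumption: the contract is active from the effective date up to, not including, expiry.
        "still_active": effective <= today < expiry,
        "days_until_expiry": (expiry - today).days,
    }


# Assumed reference date; the page does not state which "today" its sample used.
status = contract_status(date(2025, 1, 15), 24, today=date(2026, 6, 13))
print(status["expiry"])             # 2027-01-15
print(status["still_active"])       # True
print(status["days_until_expiry"])  # 216
```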

Claims, lab results, patient forms

POST /api/v1/extract

Pull structured data from scanned medical forms. Vision mode handles handwriting, stamps, and low-quality scans with OCR + AI proofreading.

schema: {
  "patient_name": {"type": "string", "description": "Full name"},
  "diagnosis_code": {"type": "string", "description": "ICD-10 code"},
  "procedure": {"type": "string", "description": "Description"},
  "claim_amount": {"type": "number", "description": "Total billed"}
}

Response

{
  "data": {
    "patient_name": "Robert Chen",
    "diagnosis_code": "J06.9",
    "procedure": "Upper respiratory infection, acute",
    "claim_amount": 1250.00
  },
  "confidence": {
    "patient_name": 0.72,
    "diagnosis_code": 0.95,
    "procedure": 0.93,
    "claim_amount": 0.91
  },
  "extraction_method": "ocr+vision"
}
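Per-field confidence makes it possible to route uncertain values to a human reviewer. The sketch below works on the response shown above; the `fields_needing_review` helper and the 0.8 threshold are assumptions for illustration, not part of the API.

```python
def fields_needing_review(response, threshold=0.8):
    """Return the names of extracted fields whose confidence is below `threshold`."""
    confidence = response.get("confidence", {})
    return sorted(
        name
        for name in response.get("data", {})
        # A field with no confidence score is treated as needing review.
        if confidence.get(name, 0.0) < threshold
    )


response = {
    "data": {
        "patient_name": "Robert Chen",
        "diagnosis_code": "J06.9",
        "procedure": "Upper respiratory infection, acute",
        "claim_amount": 1250.00,
    },
    "confidence": {
        "patient_name": 0.72,
        "diagnosis_code": 0.95,
        "procedure": 0.93,
        "claim_amount": 0.91,
    },
    "extraction_method": "ocr+vision",
}

flagged = fields_needing_review(response)
print(flagged)  # ['patient_name']
```

The right threshold depends on the cost of an error in each field, so it is worth tuning per schema rather than using one global value.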

Listings, appraisals, inspections

POST /api/v1/extract

Extract property details from listing PDFs, appraisal reports, or live property URLs. Works on documents and websites with the same schema.

schema: {
  "address": {"type": "string", "description": "Property address"},
  "price": {"type": "number", "description": "Listing or assessed price"},
  "sqft": {"type": "integer", "description": "Square footage"},
  "year_built": {"type": "integer", "description": "Construction year"},
  "zoning": {"type": "string", "description": "Zoning classification"}
}

Response

{
  "data": {
    "address": "742 Evergreen Terrace, Springfield",
    "price": 425000,
    "sqft": 2150,
    "year_built": 1987,
    "zoning": null
  },
  "confidence": {
    "address": 0.97, "price": 0.96,
    "sqft": 0.94, "year_built": 0.71,
    "zoning": 0.93
  },
  "citations": {
    "price": "List Price: $425,000",
    "sqft": "Living Area: 2,150 sq ft"
  }
}

What you'd otherwise build

Months of engineering. Or one API call.

Reliable document intelligence isn't a weekend project. It's agent design, prompt engineering, parser selection across hundreds of formats, accuracy benchmarking, sandboxed computation, vision fallbacks, retry logic, and ongoing maintenance as models change. We've done the research and engineering. You get the result.

Agent architecture, handled

Multi-step reasoning, tool selection, progressive document reading, retry strategies — we designed and benchmarked the agent so you don't have to.

Math that's computed, not guessed

Totals, growth rates, cross-checks — every number is calculated deterministically in a sandboxed environment. Not guessed by an LLM.

107+ formats, one call

PDF, DOCX, XLSX, images, audio, video, websites — each format has its own parsing path, benchmarked for accuracy. You just send the file.

OCR + vision proofreading

Scanned contracts, phone photos, faxed forms — OCR extracts text, vision models proofread it. Not one or the other. Both, verified against each other.

1000+ pages, multi-page tables

Progressive reading navigates long documents without losing context. Tables that span pages are reassembled correctly. Returns in seconds.

Every answer is auditable

Confidence scores, source citations, and full reasoning traces for every field. Know exactly how each answer was derived and where it came from.

Website intelligence

Works on URLs too. Same API call.

Pass a URL instead of a file. The API launches a headless browser, executes JavaScript, waits for dynamic content to load, then extracts structured data from the live DOM. React SPAs, server-rendered pages, dynamic dashboards — same result.

Enable follow_links to automatically discover and crawl relevant subpages — /about, /pricing, /contact — and fill gaps in your schema from multiple pages in one call. The API decides which links are worth following based on your schema.

Try with a URL →


const result = await client.extract({
  url: "https://linear.app",
  followLinks: true,
  schema: {
    name:     { type: "string", description: "Company name" },
    pricing:  { type: "array",  description: "Plan names and prices" },
    logo:     { type: "string", description: "Logo URL" },
    colors:   { type: "array",  description: "Brand colors" },
    founders: { type: "array",  description: "Founder names" },
  },
});

result = client.extract(
    url="https://linear.app",
    follow_links=True,
    schema={
        "name":     {"type": "string", "description": "Company name"},
        "pricing":  {"type": "array",  "description": "Plan names and prices"},
        "logo":     {"type": "string", "description": "Logo URL"},
        "colors":   {"type": "array",  "description": "Brand colors"},
        "founders": {"type": "array",  "description": "Founder names"},
    },
)

curl -X POST https://dev.thedrive.ai/api/v1/extract \
  -H "X-API-Key: tda_live_..." \
  -F "url=https://linear.app" \
  -F "follow_links=true" \
  -F 'schema={
    "name":     {"type": "string", "description": "Company name"},
    "pricing":  {"type": "array",  "description": "Plan names and prices"},
    "logo":     {"type": "string", "description": "Logo URL"},
    "colors":   {"type": "array",  "description": "Brand colors"},
    "founders": {"type": "array",  "description": "Founder names"}
  }'

Brand extraction

Logo URL, brand colors, fonts, social profiles — extracted from the rendered DOM, not guessed from raw HTML. Handles JS-heavy SPAs, lazy-loaded content, and dynamically injected elements.

Multi-page research

With follow_links, the API intelligently discovers subpages relevant to your schema — /pricing, /about, /team — and aggregates data across them in one call.

Lead enrichment

Turn a prospect URL into CRM-ready structured data. The API renders the full site, follows relevant pages, and returns exactly the fields your schema defines.

Competitive monitoring

Run the same schema against competitor URLs on a schedule. Track pricing changes, new features, positioning shifts — get structured diffs, not page screenshots.
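One way to turn two scheduled runs into a structured diff is to compare their `data` objects field by field. The sketch below is a client-side illustration: `diff_results` is a hypothetical helper, and the two snapshots are made-up sample data rather than real API output.

```python
def diff_results(previous, current):
    """Return {field: {"before": ..., "after": ...}} for every field whose value changed."""
    changes = {}
    for field in sorted(set(previous) | set(current)):
        before, after = previous.get(field), current.get(field)
        if before != after:
            changes[field] = {"before": before, "after": after}
    return changes


# Hypothetical snapshots of result.data from two runs of the same schema.
last_week = {"name": "Example Co", "pricing": ["Free: $0", "Pro: $10"]}
this_week = {"name": "Example Co", "pricing": ["Free: $0", "Pro: $12"]}

changes = diff_results(last_week, this_week)
print(changes)
```

Because both runs use the same schema, unchanged fields drop out and only the pricing change remains in the diff.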

How it works

One endpoint. Three capabilities.

Define a schema describing what you need. The API handles parsing, reasoning, computation, and verification. Call it directly from your code, or wire it into an agent framework as a tool.

Extract from any file

One schema, any format. The API selects the right parsing pipeline — text extraction, table reconstruction, OCR, vision — and returns the same typed JSON regardless of input.

invoice.pdf → {"vendor": "...", "total": 971.73}

receipt.jpg → {"vendor": "...", "total": 42.50}

scan.heic → {"vendor": "...", "total": 188.00}

Compute and verify

The API runs multi-step reasoning — navigating documents, executing math in a sandboxed environment, cross-checking values — and returns the full reasoning trace for every answer.

"Do line items match the total?" → false ($100 gap)

"Is contract still active?" → true (216 days left)

"Debt-to-equity ratio?" → 1.34 (computed)

Cross-reference documents

Send multiple files in one call. Validate an invoice against a contract. Check that a report's numbers match a source spreadsheet. Reconcile data from different sources.

invoice.pdf + contract.pdf

→ "rates_match": false ($175/hr vs $150/hr)

→ "correct_vendor": true

API

Analyze

POST /api/v1/analyze

Multi-step reasoning with sandboxed computation. Navigates long documents, cross-checks numbers, catches discrepancies, and returns every answer with sources, reasoning traces, and confidence scores.

2 credits/page · 10 credits/site

Playground →

Extract

POST /api/v1/extract

Schema in, structured data out. Any file or URL. Confidence scores, source citations. The fast path when you just need fields.

1 credit/page · 5 credits/site

Playground →

Cross-Analyze

POST /api/v1/analyze/cross

Send 2-5 documents. The API reasons across all of them — validating invoices against contracts, reconciling reports against source data. Each document gets its own index.

5 credits/doc + 3 credits/page

Playground →

Markdown

GET /md/{url}

Convert any URL, document, or audio/video to clean markdown. Audio transcribed via Whisper. JavaScript rendered, boilerplate stripped.

Playground →

Screenshot

GET /{url}

JPEG, GIF, or MP4 of any URL. Dark mode, full page, custom viewports.

Playground →

Thumbnails

POST /api/v1/thumbnails

Preview images from 107+ file types. PDFs, spreadsheets, presentations, code files.

Formats

107+ file types

Each format has its own parsing pipeline — content detection, format-specific extraction, table reconstruction, OCR with vision proofreading for scans. You just send the file.

Documents

PDF DOCX DOC ODT RTF PAGES EPUB TXT

Spreadsheets

XLSX XLS ODS CSV TSV NUMBERS

Presentations

PPTX PPT ODP KEY

Images

JPG PNG GIF WebP SVG TIFF BMP HEIC

Video & Audio

MP4 MOV WebM AVI MP3 WAV M4A FLAC OGG

Code & Data

JSON XML YAML HTML PY JS TS GO + 30 more

Use cases

What people build with this

From invoice processing to competitor monitoring — one API for files, URLs, and cross-document validation.

Invoice extraction

Vendor, line items, totals, tax — from PDF, image, or email invoices.

Learn more →

Contract analysis

Parties, dates, clauses, obligations — from agreements and NDAs.

Learn more →

Receipt scanning

Merchant, items, tax, total — from photos and digital receipts.

Learn more →

Medical documents

Patient info, diagnoses, procedure codes — from claims and lab reports.

Learn more →

Real estate

Property details, pricing, terms — from listings and appraisals.

Learn more →

Website extraction

Company info, products, pricing — from any rendered webpage.

Learn more →

Competitor monitoring

Track pricing, features, and hiring signals from competitor sites.

Learn more →

Lead enrichment

Company details, tech stack, contacts — from prospect websites.

Learn more →

Pricing

Pay per call

Usage-based. Free tier to start. No minimum commitment.

extract

1 credit/page

analyze

2 credits/page

cross-analyze

5/doc + 3/page

audio/video transcription

1 credit/min

markdown, screenshots

1 credit each
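The per-operation rates above translate directly into a cost estimate. The sketch below uses only the rates listed on this page and the Pro price of $0.01 per credit; the function names are hypothetical, and applying the cross-analyze per-page rate to the total pages across all documents is an assumption.

```python
from decimal import Decimal

PRO_PRICE_PER_CREDIT = Decimal("0.01")  # Pro tier, pay as you go


def extract_credits(pages):
    """Extract: 1 credit per page."""
    return 1 * pages


def analyze_credits(pages):
    """Analyze: 2 credits per page."""
    return 2 * pages


def cross_analyze_credits(documents, total_pages):
    """Cross-analyze: 5 credits per document plus 3 credits per page (2 to 5 documents)."""
    if not 2 <= documents <= 5:
        raise ValueError("cross-analyze accepts 2 to 5 documents")
    # Assumption: the per-page rate applies to the combined page count.
    return 5 * documents + 3 * total_pages


def pro_cost(credits):
    """Dollar cost of a number of credits on the Pro tier."""
    return credits * PRO_PRICE_PER_CREDIT


print(extract_credits(40), pro_cost(extract_credits(40)))   # 40 0.40
print(analyze_credits(40), pro_cost(analyze_credits(40)))   # 80 0.80
# A 2-page invoice checked against a 12-page contract.
print(cross_analyze_credits(2, 14), pro_cost(cross_analyze_credits(2, 14)))  # 52 0.52
```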

Free

forever

100 credits/month

30 requests/min

All endpoints

Get started

Pro

$0.01

per credit

Pay as you go

120 requests/min

Priority support

Start building

Enterprise

Custom

volume pricing

600 requests/min

SLA guarantee

Dedicated support

Send a file. Get answers. Start now.

Get an API key and extract structured data from your first document in under a minute.

Get API Key Playground
