File & URL intelligence API

Send a file. Get answers.
One API call.

Send any file or URL with a schema describing what you need. Get structured answers back — computed, verified, cited. Cross-reference multiple documents. 107+ formats.

Free tier included. No credit card required.

npm install @thedriveai/sdk
pip install thedriveai

// Node: send a file + schema. Get verified, computed answers back.
import { TheDriveAI } from "@thedriveai/sdk";

const client = new TheDriveAI({ apiKey: "tda_live_..." });

const result = await client.analyze({
  file: "invoice.pdf",
  schema: {
    math_checks_out: { type: "boolean", description: "Do line items add up to the stated total?" },
    discrepancy:     { type: "number", description: "Dollar difference, if any" },
    correct_total:   { type: "number", description: "Recalculated total from line items + tax" },
  },
});

// → result.data.math_checks_out    → false
// → result.reasoning.math_checks_out → "$3,200 + $1,550 + $890 = $5,640, not $5,740"
// → result.data.discrepancy         → 100
// → result.data.correct_total       → 6091.20

# Python: send a file + schema. Get verified, computed answers back.
from thedriveai import TheDriveAI

client = TheDriveAI(api_key="tda_live_...")

result = client.analyze(
    file="invoice.pdf",
    schema={
        "math_checks_out": {"type": "boolean", "description": "Do line items add up to the stated total?"},
        "discrepancy":     {"type": "number", "description": "Dollar difference, if any"},
        "correct_total":   {"type": "number", "description": "Recalculated total from line items + tax"},
    },
)

# → result.data["math_checks_out"]    → False
# → result.reasoning["math_checks_out"] → "$3,200 + $1,550 + $890 = $5,640, not $5,740"
# → result.data["discrepancy"]         → 100
# → result.data["correct_total"]       → 6091.20

# curl: send a file + schema. Get verified, computed answers back.
curl -X POST https://dev.thedrive.ai/api/v1/analyze \
  -H "X-API-Key: tda_live_..." \
  -F file=@invoice.pdf \
  -F 'schema={
    "math_checks_out": {"type": "boolean", "description": "Do line items add up to the stated total?"},
    "discrepancy":     {"type": "number", "description": "Dollar difference, if any"},
    "correct_total":   {"type": "number", "description": "Recalculated total from line items + tax"}
  }'

# → {"data": {"math_checks_out": false, "discrepancy": 100, "correct_total": 6091.20},
#    "reasoning": {"math_checks_out": "$3,200 + $1,550 + $890 = $5,640, not $5,740", ...},
#    "confidence": {"math_checks_out": 0.97, "discrepancy": 0.97, "correct_total": 0.99}}

// Node: pull literal values from any file or URL. Fast path extraction.
import { TheDriveAI } from "@thedriveai/sdk";

const client = new TheDriveAI({ apiKey: "tda_live_..." });

const result = await client.extract({
  file: "contract.pdf",
  schema: {
    parties:        { type: "array", description: "All parties" },
    effective_date: { type: "string", description: "Start date (ISO 8601)" },
    liability_cap:  { type: "number", description: "Maximum liability", required: true },
  },
});

// → result.data.parties        → ["Acme Corp", "Globex Inc"]
// → result.data.liability_cap  → 500000
// → result.confidence.parties  → 0.96
// → result.citations.liability_cap → "...not exceed $500,000"

# Python: pull literal values from any file or URL. Fast path extraction.
from thedriveai import TheDriveAI

client = TheDriveAI(api_key="tda_live_...")

result = client.extract(
    file="contract.pdf",
    schema={
        "parties":        {"type": "array", "description": "All parties"},
        "effective_date": {"type": "string", "description": "Start date (ISO 8601)"},
        "liability_cap":  {"type": "number", "description": "Maximum liability", "required": True},
    },
)

# → result.data["parties"]        → ["Acme Corp", "Globex Inc"]
# → result.data["liability_cap"]  → 500000
# → result.confidence["parties"]  → 0.96
# → result.citations["liability_cap"] → "...not exceed $500,000"

# curl: pull literal values from any file or URL. Fast path extraction.
curl -X POST https://dev.thedrive.ai/api/v1/extract \
  -H "X-API-Key: tda_live_..." \
  -F file=@contract.pdf \
  -F 'schema={
    "parties":        {"type": "array", "description": "All parties"},
    "effective_date": {"type": "string", "description": "Start date (ISO 8601)"},
    "liability_cap":  {"type": "number", "description": "Maximum liability", "required": true}
  }'

# → {"data": {"parties": ["Acme Corp", "Globex Inc"],
#            "effective_date": "2025-01-15",
#            "liability_cap": 500000},
#    "confidence": {"parties": 0.96, "liability_cap": 0.93},
#    "citations": {"liability_cap": "...not exceed $500,000"}}

107+ file formats
1000+ page documents
Files + URLs, same endpoint
<3s average response

Two modes, one schema

Extract data. Or compute answers. Same schema.

Analyze reads documents, computes answers, cross-checks numbers, and shows its reasoning. It can also cross-reference multiple documents in one call. Extract is the fast path — pull literal values when you just need fields. Same schema format for both.

Try it in the playground →

Analyze

POST /api/v1/analyze

The full reasoning pipeline — document navigation, sandboxed computation, cross-referencing, vision for charts and scans — in one API call. Define what you need. Get computed, verified answers back.

Computed answers: totals, growth rates, and cross-checks are calculated deterministically
Full reasoning trace — see exactly how each answer was derived
Cross-reference multiple documents — validate invoices against contracts
Handles 1000+ page documents and multi-page tables
Source citations for every answer — audit-ready output
schema: {
  "revenue_growth": {"type": "number", "description": "YoY growth rate"},
  "auto_renews": {"type": "boolean", "description": "Does this auto-renew?"},
  "line_items_match": {"type": "boolean", "description": "Do line items add up to total?"}
}
→ revenue_growth: -0.23 (computed from p.12 + p.47)
  auto_renews: true (Section 8.2, "shall automatically renew")
  line_items_match: false ($100 gap, computed)
Try Analyze →

Extract

POST /api/v1/extract

The fast path. Pulls literal values — names, dates, amounts, clauses — directly from the source. When you just need fields, not reasoning.

Typed fields — strings, numbers, booleans, arrays, enums
Required fields with partial result support
Confidence scores per field (high / medium / low)
Source citations — the exact text that was used
schema: {
  "vendor": {"type": "string", "description": "Company name"},
  "total": {"type": "number", "description": "Total amount due"},
  "status": {"type": "string", "enum": ["paid","unpaid"]}
}
→ vendor: "AWS", total: 971.73,
  status: "paid"
Try Extract →
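
A response can also be checked against the same schema on the client before it enters downstream systems. The sketch below is a hypothetical helper, not part of the SDK. It assumes the field types shown above (string, number, boolean, array), the optional "enum" and "required" keys, and that a missing value shows up as an absent key or None in the returned data.

```python
# Hypothetical client-side check; not part of the SDK.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "array": lambda v: isinstance(v, list),
}


def validate_data(data, schema):
    """Return a list of human-readable problems; an empty list means the data fits."""
    problems = []
    for name, spec in schema.items():
        value = data.get(name)
        if value is None:
            if spec.get("required"):
                problems.append(f"{name}: required field is missing")
            continue
        check = TYPE_CHECKS.get(spec["type"])
        if check is not None and not check(value):
            problems.append(
                f"{name}: expected {spec['type']}, got {type(value).__name__}"
            )
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name}: {value!r} not in {spec['enum']}")
    return problems


schema = {
    "vendor": {"type": "string", "description": "Company name"},
    "total": {"type": "number", "description": "Total amount due"},
    "status": {"type": "string", "enum": ["paid", "unpaid"]},
}

good = {"vendor": "AWS", "total": 971.73, "status": "paid"}
bad = {"vendor": "AWS", "total": "971.73", "status": "late"}

print(validate_data(good, schema))  # []
for problem in validate_data(bad, schema):
    print(problem)
```

The second call reports two problems: the total arrived as a string, and the status is outside the allowed enum values.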

See it work

Extract data. Verify it. Cross-reference it.

Same API call regardless of industry. Extract pulls literal values. Analyze computes, cross-checks, and catches what extraction misses.

Try it yourself →

Invoice verification

POST /api/v1/analyze

Don't just read the total — verify it. The API sums line items, checks the tax calculation, and catches discrepancies that extraction alone would miss.

schema: {
  "math_checks_out": {
    "type": "boolean",
    "description": "Do line items add up to the stated total?"
  },
  "discrepancy": {
    "type": "number",
    "description": "Dollar difference between computed and stated total"
  },
  "tax_correct": {
    "type": "boolean",
    "description": "Is the tax calculated correctly at the stated rate?"
  }
}

Response

{
  "data": {
    "math_checks_out": false,
    "discrepancy": 100.00,
    "tax_correct": false
  },
  "confidence": {
    "math_checks_out": 0.99,
    "discrepancy": 1.0,
    "tax_correct": 0.98
  },
  "reasoning": {
    "math_checks_out": "$3,200 + $1,550 + $890 = $5,640. Stated subtotal: $5,740. Difference: $100.",
    "discrepancy": "Computed sum $5,640 vs stated $5,740.",
    "tax_correct": "Tax should be $451.20 (8% of $5,640), not $459.20. The $100 subtotal error cascades."
  }
}
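
The figures in this response can be reproduced by hand. The sketch below redoes the arithmetic with the line items, stated subtotal, and 8% tax rate quoted in the reasoning text. It illustrates the checks only and makes no claim about how the API computes them internally.

```python
from decimal import Decimal

# Values taken from the reasoning text in the example response.
line_items = [Decimal("3200"), Decimal("1550"), Decimal("890")]
stated_subtotal = Decimal("5740")
tax_rate = Decimal("0.08")

computed_subtotal = sum(line_items, Decimal("0"))
discrepancy = stated_subtotal - computed_subtotal

cents = Decimal("0.01")
correct_tax = (computed_subtotal * tax_rate).quantize(cents)
stated_tax = (stated_subtotal * tax_rate).quantize(cents)
correct_total = computed_subtotal + correct_tax

print(computed_subtotal)  # 5640
print(discrepancy)        # 100
print(correct_tax)        # 451.20
print(stated_tax)         # 459.20
print(correct_total)      # 6091.20
```

Decimal is used instead of float so that currency amounts stay exact to the cent.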

What you'd otherwise build

Months of engineering. Or one API call.

Reliable document intelligence isn't a weekend project. It's agent design, prompt engineering, parser selection across hundreds of formats, accuracy benchmarking, sandboxed computation, vision fallbacks, retry logic, and ongoing maintenance as models change. We've done the research and engineering. You get the result.

Agent architecture, handled

Multi-step reasoning, tool selection, progressive document reading, retry strategies — we designed and benchmarked the agent so you don't have to.

Math that's computed, not guessed

Totals, growth rates, cross-checks — every number is calculated deterministically in a sandboxed environment. Not guessed by an LLM.

107+ formats, one call

PDF, DOCX, XLSX, images, audio, video, websites — each format has its own parsing path, benchmarked for accuracy. You just send the file.

OCR + vision proofreading

Scanned contracts, phone photos, faxed forms — OCR extracts text, vision models proofread it. Not one or the other. Both, verified against each other.

1000+ pages, multi-page tables

Progressive reading navigates long documents without losing context. Tables that span pages are reassembled correctly. Returns in seconds.

Every answer is auditable

Confidence scores, source citations, and full reasoning traces for every field. Know exactly how each answer was derived and where it came from.
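
One way to use these signals is to route uncertain fields to a human reviewer. The sketch below assumes the response shape shown in the curl examples on this page (a "data" object plus a per-field numeric "confidence" object). The function name and the 0.95 threshold are illustrative choices, not API defaults.

```python
def fields_needing_review(response, threshold=0.95):
    """Return names of fields whose confidence is missing or below threshold."""
    confidence = response.get("confidence", {})
    return sorted(
        name
        for name in response.get("data", {})
        # A field with no confidence score is treated as unverified.
        if confidence.get(name, 0.0) < threshold
    )


# Response copied from the extract example on this page.
response = {
    "data": {
        "parties": ["Acme Corp", "Globex Inc"],
        "effective_date": "2025-01-15",
        "liability_cap": 500000,
    },
    "confidence": {"parties": 0.96, "liability_cap": 0.93},
    "citations": {"liability_cap": "...not exceed $500,000"},
}

print(fields_needing_review(response))  # ['effective_date', 'liability_cap']
```

Treating a missing score as zero is a deliberately conservative choice: a field is only trusted when the response says so explicitly.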

Website intelligence

Works on URLs too. Same API call.

Pass a URL instead of a file. The API launches a headless browser, executes JavaScript, waits for dynamic content to load, then extracts structured data from the live DOM. React SPAs, server-rendered pages, dynamic dashboards — same result.

Enable follow_links (followLinks in the Node SDK) to automatically discover and crawl relevant subpages, such as /about, /pricing, and /contact, and fill gaps in your schema from multiple pages in one call. The API decides which links are worth following based on your schema.

Try with a URL →
// Node
const result = await client.extract({
  url: "https://linear.app",
  followLinks: true,
  schema: {
    name:     { type: "string", description: "Company name" },
    pricing:  { type: "array",  description: "Plan names and prices" },
    logo:     { type: "string", description: "Logo URL" },
    colors:   { type: "array",  description: "Brand colors" },
    founders: { type: "array",  description: "Founder names" },
  },
});

# Python
result = client.extract(
    url="https://linear.app",
    follow_links=True,
    schema={
        "name":     {"type": "string", "description": "Company name"},
        "pricing":  {"type": "array",  "description": "Plan names and prices"},
        "logo":     {"type": "string", "description": "Logo URL"},
        "colors":   {"type": "array",  "description": "Brand colors"},
        "founders": {"type": "array",  "description": "Founder names"},
    },
)

# curl
curl -X POST https://dev.thedrive.ai/api/v1/extract \
  -H "X-API-Key: tda_live_..." \
  -F "url=https://linear.app" \
  -F "follow_links=true" \
  -F 'schema={
    "name":     {"type": "string", "description": "Company name"},
    "pricing":  {"type": "array",  "description": "Plan names and prices"},
    "logo":     {"type": "string", "description": "Logo URL"},
    "colors":   {"type": "array",  "description": "Brand colors"},
    "founders": {"type": "array",  "description": "Founder names"}
  }'

Brand extraction

Logo URL, brand colors, fonts, social profiles — extracted from the rendered DOM, not guessed from raw HTML. Handles JS-heavy SPAs, lazy-loaded content, and dynamically injected elements.

Multi-page research

With follow_links, the API intelligently discovers subpages relevant to your schema — /pricing, /about, /team — and aggregates data across them in one call.

Lead enrichment

Turn a prospect URL into CRM-ready structured data. The API renders the full site, follows relevant pages, and returns exactly the fields your schema defines.

Competitive monitoring

Run the same schema against competitor URLs on a schedule. Track pricing changes, new features, positioning shifts — get structured diffs, not page screenshots.
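
This page does not say whether the diff is produced by the API or by the caller. A minimal client-side sketch, assuming two "data" objects from scheduled runs of the same schema, could look like the following. The function name and the sample values are hypothetical placeholders, not real extraction results.

```python
def diff_results(previous, current):
    """Compare two `data` dicts and return {field: (old, new)} for changed fields."""
    changes = {}
    for name in sorted(set(previous) | set(current)):
        old, new = previous.get(name), current.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


# Placeholder data standing in for two scheduled runs.
last_week = {"name": "Example Co", "pricing": ["Free: $0", "Pro: $10"]}
today = {
    "name": "Example Co",
    "pricing": ["Free: $0", "Pro: $12"],
    "founders": ["A. Person"],
}

for field, (old, new) in diff_results(last_week, today).items():
    print(f"{field}: {old!r} -> {new!r}")
```

Fields absent from one run appear with None on that side, so newly filled and newly missing fields are both reported.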

How it works

One schema. Three capabilities.

Define a schema describing what you need. The API handles parsing, reasoning, computation, and verification. Call it directly from your code, or wire it into an agent framework as a tool.

Extract from any file

One schema, any format. The API selects the right parsing pipeline — text extraction, table reconstruction, OCR, vision — and returns the same typed JSON regardless of input.

invoice.pdf → {"vendor": "...", "total": 971.73}

receipt.jpg → {"vendor": "...", "total": 42.50}

scan.heic → {"vendor": "...", "total": 188.00}

Compute and verify

The API runs multi-step reasoning — navigating documents, executing math in a sandboxed environment, cross-checking values — and returns the full reasoning trace for every answer.

"Do line items match the total?" → false ($100 gap)

"Is contract still active?" → true (216 days left)

"Debt-to-equity ratio?" → 1.34 (computed)

Cross-reference documents

Send multiple files in one call. Validate an invoice against a contract. Check that a report's numbers match a source spreadsheet. Reconcile data from different sources.

invoice.pdf + contract.pdf

→ "rates_match": false ($150/hr vs $125/hr)

→ "correct_vendor": true

Formats

107+ file types

Each format has its own parsing pipeline — content detection, format-specific extraction, table reconstruction, OCR with vision proofreading for scans. You just send the file.

Pricing

Pay per call

Usage-based. Free tier to start. No minimum commitment.

Operation                   Cost
extract                     1 credit/page
analyze                     2 credits/page
cross-analyze               5/doc + 3/page
audio/video transcription   1 credit/min
markdown, screenshots       1 credit each
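
For budgeting, the per-operation rates can be turned into a quick estimate. The sketch below is a hypothetical helper, not part of the SDK. It assumes that "5/doc + 3/page" means 5 credits per document plus 3 credits per page across all documents, and it prices credits at the Pro rate of $0.01 listed on this page.

```python
from decimal import Decimal

PRO_PRICE_PER_CREDIT = Decimal("0.01")  # Pro plan price listed on this page


def estimate_credits(operation, pages=0, docs=0, minutes=0):
    """Estimate the credits for one call from the listed per-operation rates."""
    if operation == "extract":
        return 1 * pages
    if operation == "analyze":
        return 2 * pages
    if operation == "cross-analyze":
        # Assumption: 5 credits per document plus 3 credits per page,
        # with pages counted across all documents in the call.
        return 5 * docs + 3 * pages
    if operation == "transcription":
        return 1 * minutes
    raise ValueError(f"unknown operation: {operation}")


credits = estimate_credits("cross-analyze", docs=2, pages=30)
print(credits)                         # 100
print(credits * PRO_PRICE_PER_CREDIT)  # 1.00
```

Under these assumptions, cross-analyzing two documents totalling 30 pages uses the full monthly allowance of the Free plan in one call.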

Plan         Price                   Rate limit         Includes
Free         $0 forever              30 requests/min    100 credits/month, all endpoints
Pro          $0.01 per credit        120 requests/min   Pay as you go, priority support
Enterprise   Custom volume pricing   600 requests/min   SLA guarantee, dedicated support
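
Batch jobs need to stay under the per-plan request limits. The sketch below is one simple client-side approach that spaces calls evenly. The Throttle class is hypothetical and not part of the SDK, and this page does not describe how the server counts requests (fixed or sliding window), so treat the spacing as a conservative guess.

```python
import time


class Throttle:
    """Space calls evenly so that roughly `per_minute` calls start each minute."""

    def __init__(self, per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # no call made yet

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval


throttle = Throttle(per_minute=30)  # Free plan limit listed above
print(throttle.interval)  # 2.0
```

Calling throttle.wait() before each request keeps a loop over many files at one request every two seconds on the Free plan. The clock and sleep parameters exist only so the timing can be tested without real delays.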

Send a file. Get answers. Start now.

Get an API key and extract structured data from your first document in under a minute.