Skip to main content
We’re introducing the new Extract endpoint (/extract), a web data agent that turns a page into structured rows. Given a seed url and a natural-language query describing the rows you want, the agent extracts the matching records and returns them as a downloadable file. Asynchronous lifecycle. Optional row schema and URL verification. Extract is in closed beta and access is granted by request — request access to try it.

How to use

  1. Send a POST /extract request with a seed url and a q describing which rows to extract. The endpoint immediately returns a task identifier and status of "pending".
  2. Poll GET /extract/{id} to retrieve a specific task, or GET /extract to list all your extract tasks.
  3. While running, status is "pending" or "processing". Once the task is "completed", read output.resultUrl and download the rows. If "failed", inspect error.
Example Request
curl
curl --request POST \
  --url https://api.linkup.so/v1/extract \
  --header 'Authorization: Bearer {{LINKUP_API_KEY}}' \
  --header 'Content-Type: application/json' \
  --data '{
  "q": "All engineering team members with their name, role, and profile page",
  "url": "https://example.com/team",
  "schema": {
    "type": "object",
    "properties": {
      "name": { "type": "string" },
      "role": { "type": "string" },
      "profileUrl": { "type": "string", "format": "uri" }
    },
    "required": ["name", "profileUrl"]
  },
  "verifyUrls": true
}'

Parameters

ParameterDescription
qNatural-language query describing which rows to extract and what each row should contain. Required.
urlThe seed URL the extract task starts from. Required.
schemaOptional JSON schema describing a single extracted row. When provided, every returned row must match it.
verifyUrlsWhether URLs found in extracted rows are checked for reachability after extraction. Defaults to false.

Output

The extracted rows are not returned inline. Once "completed", output contains a resultUrl — a download link to a newline-delimited JSON (NDJSON) file, one row per line — valid for 24 hours. It also reports rowsReturned and creditsUsed, the credits used by the task. Read the full reference: POST /extract · GET /extract · GET /extract/:id. For operational guidance, see extract best practices.