Fetch best practices

This page covers when to render JavaScript, when to extract raw HTML or image URLs, and how to pair Fetch with Search.

When to render JavaScript

Many modern sites load content via JavaScript. In an agentic pipeline, setting renderJs to true is the safer choice: it ensures the full content of the page is extracted and provided to the agent. Setting renderJs to false is appropriate when targeting a known set of static pages, once the specific site has been confirmed to return full content without JavaScript rendering. The API default is false, and the no-JS rate is cheaper ($0.001 vs $0.005 per call). Indicators that renderJs should be true:

The returned markdown is substantially shorter than the live page.
The output contains repeated boilerplate such as “Loading…” or “JavaScript is required”.
Sections visible in a browser are missing entirely from the returned markdown.
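
The indicators above can be turned into a rough automated check. The sketch below is one possible approach, assuming a hypothetical `fetch_fn(url, render_js)` callable that wraps your Fetch client and returns the page markdown as a string. The marker phrases and the 200-character threshold are illustrative choices, not documented values.

```python
from typing import Callable

# Illustrative markers based on the indicators above; extend them for your sites.
UNRENDERED_MARKERS = ("loading…", "loading...", "javascript is required")

# Assumed threshold: markdown shorter than this is treated as suspiciously thin.
MIN_MARKDOWN_CHARS = 200


def looks_unrendered(markdown: str, min_chars: int = MIN_MARKDOWN_CHARS) -> bool:
    """Return True when the markdown looks like an unrendered JavaScript shell."""
    text = markdown.strip().lower()
    if len(text) < min_chars:
        return True
    return any(marker in text for marker in UNRENDERED_MARKERS)


def fetch_with_fallback(
    fetch_fn: Callable[[str, bool], str], url: str
) -> tuple[str, bool]:
    """Try the cheaper no-JS fetch first and retry with rendering if it looks thin.

    Returns the markdown and a flag saying whether JavaScript rendering was used.
    """
    markdown = fetch_fn(url, False)
    if looks_unrendered(markdown):
        return fetch_fn(url, True), True
    return markdown, False
```

At the rates quoted above, a page that needs the retry costs $0.001 + $0.005 = $0.006, so this pattern pays off mainly when most target pages turn out to be static. The marker check can produce false positives on pages that legitimately contain those phrases.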

Pairing with Search

A common pattern uses Search to find candidate URLs and Fetch to retrieve them in full when the agent needs the entire page rather than the snippets returned by Search.

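# `client` is assumed to be an already-initialized SDK client.
# Step 1: use Search to find candidate URLs.
search_results = client.search(
    query="Datadog pricing tiers and per-host costs",
    depth="standard",
    output_type="searchResults",
)

# Step 2: fetch the top three results in full, with JavaScript rendering enabled.
for result in search_results.results[:3]:
    page = client.fetch(url=result.url, render_js=True)
    # feed page.markdown into your LLM, or extract specific fields directly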

Pages to fetch can be selected in two ways:

Agentic: ask the agent to fetch the most relevant pages based on the snippets returned by Search.
Programmatic: set maxResults in Search and fetch every returned URL.
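
For the programmatic route, a small helper can deduplicate and cap the URLs before they are fetched. The sketch below assumes only that each search result exposes a `url` attribute, as in the example above. The `SearchResult` dataclass and the `select_urls` function are hypothetical stand-ins for illustration, not part of the SDK.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SearchResult:
    """Hypothetical stand-in for a Search result; only `url` is assumed."""
    url: str


def select_urls(results: Iterable[SearchResult], max_pages: int = 3) -> list[str]:
    """Return up to `max_pages` unique URLs, preserving the Search ranking order."""
    seen: set[str] = set()
    selected: list[str] = []
    for result in results:
        if len(selected) >= max_pages:
            break
        if result.url in seen:
            continue  # skip duplicates so the same page is not fetched (and billed) twice
        seen.add(result.url)
        selected.append(result.url)
    return selected
```

Capping the list on the client side keeps the number of Fetch calls, and therefore the cost, predictable even if Search returns more results than expected.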

Working with raw HTML and images

Fetch returns clean markdown by default. Two flags add adjacent representations of the same page when the markdown alone is insufficient.

`extractImages`

Set extractImages to true to additionally return a list of image URLs found on the page (for example product photos, charts on a financial page, or recipe images). This adds latency, so enable it only for workflows that consume image URLs.

`includeRawHtml`

Set includeRawHtml to true for:

workflows that need to operate on the full page HTML;
pages whose structure (complex tables, embedded widgets) is erased during markdown conversion.

Both flags default to false and can be combined with renderJs set to true.
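
As an illustration of combining the flags, the helper below assembles a request body using the parameter names from this page (`url`, `renderJs`, `extractImages`, `includeRawHtml`). `build_fetch_request` is a hypothetical helper, and leaving out flags that keep their default value of false is an assumption made here for brevity.

```python
import json


def build_fetch_request(
    url: str,
    render_js: bool = False,
    extract_images: bool = False,
    include_raw_html: bool = False,
) -> dict:
    """Build a Fetch request body, including only the flags that are switched on."""
    body: dict = {"url": url}
    if render_js:
        body["renderJs"] = True
    if extract_images:
        body["extractImages"] = True
    if include_raw_html:
        body["includeRawHtml"] = True
    return body


# Example: a JavaScript-rendered page where the raw HTML is also needed.
request = build_fetch_request(
    "https://example.com/pricing", render_js=True, include_raw_html=True
)
print(json.dumps(request))
# {"url": "https://example.com/pricing", "renderJs": true, "includeRawHtml": true}
```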

Common pitfalls

The Bad → Fix pairs below are grounded in the documented constraints of Fetch: HTML and PDF only, a 20 MB cap, anonymous access, and optional JavaScript rendering for HTML pages.

SPA fetched without renderJs. The markdown comes back near-empty because the content is rendered client-side.

Bad

{ "url": "https://app.example.com/dashboard", "renderJs": false }

Fix

{ "url": "https://app.example.com/dashboard", "renderJs": true }

Passing an unsupported binary URL. Fetch supports HTML and PDF. ZIPs, images, videos, and other binary content return a 400 error.

Bad

{ "url": "https://example.com/archive.zip" }

Fix

{ "url": "https://example.com/whitepaper.pdf" }
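
A cheap pre-flight check can filter out obviously unsupported URLs before a call is made. The sketch below uses only the Python standard library. The extension list is illustrative and not exhaustive, and `has_unsupported_extension` is a hypothetical helper. A URL extension is only a hint, since the actual content type is decided by the server, so a 400 response should still be handled.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Illustrative, non-exhaustive set of extensions that suggest unsupported binary content.
UNSUPPORTED_EXTENSIONS = {".zip", ".png", ".jpg", ".jpeg", ".gif", ".mp4", ".mov", ".exe"}


def has_unsupported_extension(url: str) -> bool:
    """Return True when the URL path ends in an extension Fetch is unlikely to accept."""
    # urlparse strips the query string and fragment, leaving only the path.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return suffix in UNSUPPORTED_EXTENSIONS
```

With the examples above, `has_unsupported_extension("https://example.com/archive.zip")` is true, while the PDF URL passes the check.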

Expecting Fetch to retrieve content behind a login wall. The endpoint is anonymous and returns what a logged-out visitor would see.

Bad

{ "url": "https://app.example.com/private/report" }

Fix

{ "url": "https://example.com/public/press-release" }
