Generic REST API

The Generic REST API source lets you pull data from any basic HTTP API by defining one or more streams with:

  • request configuration (URL, method, headers/query/body)
  • response parsing paths
  • extract mode (full refresh, incremental, or date partition)
  • pagination strategy
  • output schema

Use this source when:

  • A native Extract connector does not exist yet for the desired API endpoint
  • You need to ingest a small number of custom endpoints quickly
  • You can describe the response records with a stable JSON schema

If you would prefer a fully managed integration, submit an integration request via "Contact Us".

Prerequisites

Before setup, make sure you have:

  1. API access credentials (if required)
  2. Endpoint(s) you want to pull
  3. A known response structure (where records are located)
  4. A JSON schema properties object for each stream

Source Setup Guide

1) Base settings

  • Base URL: absolute URL with http or https (for example https://api.example.com/v1)

  • Auth Location:

    • None
    • Header
    • Query parameter

If Auth Location is Header or Query parameter, you must also set API Token.

2) Optional auth fields

  • Auth Header Name (default: Authorization) when using header auth
  • Auth Query Param Name (default: api_key) when using query auth
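
As a rough illustration of the auth settings above, the following Python sketch shows where the token could end up for each auth location. The function name `auth_parts`, the lowercase location values, and the choice to send the token verbatim (no automatic `Bearer ` prefix) are assumptions made for this example, not documented connector behavior.

```python
def auth_parts(location, token=None,
               header_name="Authorization", query_name="api_key"):
    """Return (headers, query_params) that auth would add to a request.

    Hypothetical helper: location is "none", "header", or "query".
    """
    if location == "none":
        return {}, {}
    if not token:
        # Mirrors the rule that Header/Query auth requires an API Token.
        raise ValueError("API Token is required for header or query auth")
    if location == "header":
        # Assumption: the token is sent exactly as configured.
        return {header_name: token}, {}
    if location == "query":
        return {}, {query_name: token}
    raise ValueError(f"unknown auth location: {location}")


header_auth = auth_parts("header", "Bearer abc123")
query_auth = auth_parts("query", "abc123")
```

With the defaults, header auth contributes an `Authorization` header and query auth contributes an `api_key` query parameter.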

3) Optional default headers

Set Default Headers (JSON) to send headers on every request.

Example:

{
  "Accept": "application/json",
  "X-Client": "extract"
}

4) Optional template parameters

You can define source-level parameters and reference them in stream configuration using templating.

  • Template Params: key/value map available as {{param.<name>}} in templated fields.

Example usage:

  • {{param.account_id}}
  • {{param.region}}

5) Define Streams

Each stream is one API extraction flow and becomes one destination table.

Streams are optional at source creation time. You can create the source with no streams configured and add streams later.

Stream fields

Required fields

  • name: stream identifier / destination table name
  • endpoint_path: endpoint path (for example /orders) or full URL
  • records_path: dot-path to the array of records in the response (for example data.items)
  • schema_json: JSON schema properties object
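
To make the records_path idea concrete, here is a minimal Python sketch of dot-path resolution against a sample response. The helper names and the sample payload are hypothetical; the connector's own resolution logic may differ, for example in how it reports a missing segment.

```python
def resolve_path(payload, path):
    """Walk a dot-path such as "data.items" through nested dicts."""
    current = payload
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"path segment {key!r} not found")
        current = current[key]
    return current


def extract_records(payload, records_path):
    """Return the record array, rejecting anything that is not a list."""
    records = resolve_path(payload, records_path)
    if not isinstance(records, list):
        raise TypeError("records_path must point to an array")
    return records


# Hypothetical API response used only for illustration.
response = {
    "data": {"items": [{"id": 1}, {"id": 2}]},
    "pagination": {"next_cursor": None},
}
records = extract_records(response, "data.items")
```

Trying a records_path against a saved sample response in this way can help catch a wrong path before configuring the stream.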

Optional metadata fields

  • description: free-text description for the stream
  • primary_key: array of field paths used as the stream primary key (for example ["id"] or ["order.id"])

Request fields

  • method: GET or POST (default: GET)
  • query_params_json: query parameters, as a JSON object
  • headers_json: request headers, as a JSON object
  • body_json: request body, as a JSON object (mainly for POST)

query_params_json, headers_json, and body_json must be JSON objects (not arrays). If omitted, they default to {}.

Extract Mode

Supported values:

  • FullRefresh (default)
  • IncrementalChanges
  • Partition

IncrementalChanges fields

  • timestamp_field (required): dot-path in each record (for example updated_at)
  • incremental_param_name (required): name of the request parameter that carries the cursor value
  • incremental_param_location: query or body (default query)
  • timestamp_format: rfc3339 (default), unix_seconds, unix_millis, or a custom chrono format string (for example %Y-%m-%d %H:%M:%S)
  • timestamp_timezone: timezone for parsing/formatting custom timestamps (default UTC)
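
The sketch below illustrates, in Python, how a next cursor could be derived from a page of records by taking the maximum parsed timestamp_field value. It is an approximation under stated assumptions: the function names are hypothetical, custom formats are handled with Python's `strptime` (whose specifiers overlap with, but are not identical to, chrono's), and custom timestamps are assumed to be UTC.

```python
from datetime import datetime, timezone


def parse_timestamp(value, fmt="rfc3339"):
    """Parse one timestamp according to a timestamp_format-like setting."""
    if fmt == "unix_seconds":
        return datetime.fromtimestamp(float(value), tz=timezone.utc)
    if fmt == "unix_millis":
        return datetime.fromtimestamp(float(value) / 1000, tz=timezone.utc)
    if fmt == "rfc3339":
        # Older Python versions reject a trailing "Z", so normalize it.
        return datetime.fromisoformat(str(value).replace("Z", "+00:00"))
    # Assumption: any other value is a strftime-style pattern in UTC.
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)


def next_cursor(records, timestamp_field, fmt="rfc3339", previous=None):
    """Return the latest timestamp seen, or `previous` if none is found."""
    latest = previous
    for record in records:
        value = record
        for key in timestamp_field.split("."):
            value = value.get(key) if isinstance(value, dict) else None
        if value is None:
            continue  # record has no timestamp; skip it in this sketch
        parsed = parse_timestamp(value, fmt)
        if latest is None or parsed > latest:
            latest = parsed
    return latest


page = [
    {"id": 1, "updated_at": "2024-05-01T10:00:00Z"},
    {"id": 2, "updated_at": "2024-05-03T08:30:00Z"},
    {"id": 3},
]
cursor = next_cursor(page, "updated_at")
```

The resulting value would then be formatted and sent in the parameter named by incremental_param_name on the next run.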

Partition fields

  • start_date_param_name (required)
  • end_date_param_name (required)
  • date_param_location: query or body (default query)
  • date_format: chrono date format (default %Y-%m-%d)
  • date_timezone: timezone (default UTC)
  • partition_backfill_days: backfill window in days (default 30)
  • partition_key: partition key in catalog (default date)
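
As one possible reading of the partition fields above, this Python sketch generates the date parameters for each day in a backfill window. It assumes one partition per calendar day, with the start and end parameters both set to that day, and a window that ends the day before `today`; the connector's actual windowing is not specified here, and the function name is hypothetical.

```python
from datetime import date, timedelta


def partition_requests(today, start_name, end_name,
                       backfill_days=30, date_format="%Y-%m-%d"):
    """Yield one dict of date parameters per day, oldest day first."""
    for days_back in range(backfill_days, 0, -1):
        day = today - timedelta(days=days_back)
        stamp = day.strftime(date_format)
        # Assumption: a single-day window uses the same date for both bounds.
        yield {start_name: stamp, end_name: stamp}


windows = list(
    partition_requests(date(2024, 5, 10), "date_from", "date_to", backfill_days=3)
)
```

Each yielded dictionary would be merged into the query or body, depending on date_param_location.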

Pagination

pagination_mode values:

  • none (default)
  • token (alias: cursor)
  • page
  • offset

Token pagination

Use one of:

  • next_cursor_path: dot-path to the next cursor value in the response (requires cursor_param_name)
  • next_link_path

Optional:

  • has_next_page_path
  • cursor_param_location (query/body, default query)

Page pagination

  • page_param_name (required)
  • limit_param_name (optional): name of the page-size parameter (for example per_page)
  • page_size (optional)
  • page_start (default 1)

Offset pagination

  • offset_param_name (required)
  • limit_param_name (optional)
  • page_size (optional)
  • offset_start (default 0)

Common pagination guard

  • max_pages (optional): stop after N pages
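
To show how the page and offset settings translate into request parameters, here is a small Python sketch. It assumes that offsets advance by page_size on every request and that the caller stops when a page comes back empty; both are assumptions about typical behavior, and the function name is hypothetical.

```python
from itertools import count, islice


def pagination_params(mode, param_name, limit_param_name=None,
                      page_size=None, start=None, max_pages=None):
    """Yield the pagination parameters for each successive request."""
    if mode == "page":
        first = 1 if start is None else start   # page_start default
        step = 1
    elif mode == "offset":
        first = 0 if start is None else start   # offset_start default
        if page_size is None:
            raise ValueError("this sketch needs page_size to advance the offset")
        step = page_size
    else:
        raise ValueError(f"unsupported mode: {mode}")
    for index in count():
        if max_pages is not None and index >= max_pages:
            return  # max_pages guard: stop after N pages
        params = {param_name: first + index * step}
        if limit_param_name and page_size is not None:
            params[limit_param_name] = page_size
        yield params


page_mode = list(pagination_params("page", "page", "per_page", 500, max_pages=2))
offset_mode = list(islice(pagination_params("offset", "offset", "limit", 100), 3))
```

Without max_pages the generator is unbounded, so a real loop would break once a request returns no records.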

Templating

Template placeholders are supported in:

  • base_url
  • endpoint_path
  • query_params_json
  • headers_json
  • body_json
  • auth token value

Supported placeholders include:

  • {{base_url}}
  • {{cursor}}
  • {{start_date}}
  • {{end_date}}
  • {{page_size}}
  • {{page}}
  • {{offset}}
  • any source-level parameter as {{param.<parameter_name>}}
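
Placeholder substitution of this kind can be sketched in Python as follows. The regular expression, the flat context dictionary, and the decision to raise on an unknown placeholder are assumptions for the example; how the connector treats unknown or whitespace-padded placeholders is not described here.

```python
import re

# Matches {{name}} and {{param.name}} with no inner whitespace.
PLACEHOLDER = re.compile(r"\{\{([A-Za-z0-9_.]+)\}\}")


def render(template, context):
    """Replace each {{placeholder}} with its value from `context`."""
    def substitute(match):
        name = match.group(1)
        if name not in context:
            raise KeyError(f"unknown placeholder: {name}")
        return str(context[name])
    return PLACEHOLDER.sub(substitute, template)


# Hypothetical values: two source-level params plus per-request values.
context = {
    "param.account_id": "acct_42",
    "param.region": "eu",
    "page": 3,
    "page_size": 500,
}
path = render("/accounts/{{param.account_id}}/orders", context)
query = render('{"region":"{{param.region}}","page":"{{page}}"}', context)
```

The same substitution would apply to any of the templated fields listed above, such as endpoint_path or body_json.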

Example stream (incremental)

{
  "name": "orders",
  "endpoint_path": "/orders",
  "method": "GET",
  "query_params_json": "{\"status\":\"completed\"}",
  "records_path": "data.items",
  "extract_mode": "IncrementalChanges",
  "timestamp_field": "updated_at",
  "timestamp_format": "rfc3339",
  "incremental_param_name": "updated_after",
  "incremental_param_location": "query",
  "pagination_mode": "token",
  "cursor_param_name": "cursor",
  "next_cursor_path": "pagination.next_cursor",
  "schema_json": "{\"id\":{\"type\":\"integer\"},\"updated_at\":{\"type\":\"string\",\"format\":\"date-time\"}}"
}
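
In the example above, query_params_json and schema_json are JSON objects encoded as strings, which is tedious to escape by hand. The Python sketch below generates those strings with `json.dumps`; the helper name `encode_object` is hypothetical, and it simply mirrors the documented rule that these fields must be objects and default to {}.

```python
import json


def encode_object(value, field_name):
    """Serialize a dict to a compact JSON string; None becomes "{}"."""
    if value is None:
        value = {}
    if not isinstance(value, dict):
        # Arrays and scalars are rejected, matching the JSON-object rule.
        raise TypeError(f"{field_name} must be a JSON object")
    return json.dumps(value, separators=(",", ":"))


stream = {
    "name": "orders",
    "endpoint_path": "/orders",
    "records_path": "data.items",
    "query_params_json": encode_object({"status": "completed"}, "query_params_json"),
    "schema_json": encode_object(
        {
            "id": {"type": "integer"},
            "updated_at": {"type": "string", "format": "date-time"},
        },
        "schema_json",
    ),
}

# The full stream definition, ready to paste into the stream configuration.
stream_config = json.dumps(stream, indent=2)
```

Round-tripping the result through `json.loads` is a quick way to confirm that the escaping is valid.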

Example stream (date partition)

{
  "name": "events_daily",
  "endpoint_path": "/events",
  "method": "POST",
  "body_json": "{\"filters\":{\"type\":\"install\"}}",
  "records_path": "rows",
  "extract_mode": "Partition",
  "start_date_param_name": "date_from",
  "end_date_param_name": "date_to",
  "date_param_location": "body",
  "date_format": "%Y-%m-%d",
  "partition_backfill_days": 14,
  "pagination_mode": "page",
  "page_param_name": "page",
  "limit_param_name": "per_page",
  "page_size": 500,
  "schema_json": "{\"date\":{\"type\":\"string\"},\"count\":{\"type\":\"integer\"}}"
}

Connection Setup Guide

Once you connect Generic REST API to a destination, also configure:

  • Connection Pull Schedule
  • Destination-specific settings (dataset/schema/table options)
  • Schema Migration Policy

Notes and troubleshooting

  • base_url must be an absolute URL and use http or https.
  • records_path must point to an array of objects. If the path is missing from the response, validation or the run fails.
  • Supported HTTP methods are currently GET and POST.
  • pagination_mode accepts cursor as an alias for token.
  • Token auth in query mode avoids duplicating the auth query parameter when following a next_link URL.
  • For incremental mode, the max parsed timestamp from records is used as the next cursor.
  • For partition mode, the date-partition cursor advances according to your backfill window.
  • During parameter validation, the connector sends a real request to verify stream configuration.
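
The note on query-mode auth and next_link URLs can be illustrated with a short Python sketch: the auth parameter is appended only when the URL does not already carry it. The function name and the sample URLs are hypothetical, and the connector may implement this check differently.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_auth_param(url, param_name, token):
    """Append the auth query parameter unless the URL already has it."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if any(name == param_name for name, _ in query):
        return url  # a next_link URL may already include the auth parameter
    query.append((param_name, token))
    return urlunsplit(parts._replace(query=urlencode(query)))


already_signed = with_auth_param(
    "https://api.example.com/v1/orders?cursor=abc&api_key=secret",
    "api_key", "secret",
)
newly_signed = with_auth_param(
    "https://api.example.com/v1/orders?cursor=abc",
    "api_key", "secret",
)
```

The first URL is returned unchanged, while the second gains a single `api_key` parameter.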