
A Drupal 10/11 module that extends the Feeds module with a Paginated HTTP Fetcher: a fetcher plugin that automatically walks through every page of a paginated API endpoint and delivers the combined result to the standard Feeds parse/process pipeline.
The standard Feeds HTTP fetcher retrieves a single URL and hands the raw response to the parser. This module replaces that single-request fetch with a loop that:

  • Fetches page 1 of the API.
  • Extracts the items array from the response (configurable).
  • Determines whether there is a next page (using one of four strategies — see below).
  • Repeats until there are no more pages, or a configured page limit is reached.
  • Merges all collected items into a single JSON array.
  • Returns that merged array to the Feeds pipeline as a RawFetcherResult.

The parser and processor configured on the feed type then receive the merged data exactly as if the entire dataset had come from a single page.
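
The fetch loop described above can be sketched in a few lines. This is a minimal illustration in Python, not the module's PHP code: `fetch_all`, `fetch_page`, and `FAKE_API` are hypothetical names, and the in-memory dictionary stands in for real HTTP requests.

```python
import json
from typing import Callable, Optional

# Hypothetical page shape: (items, next_url). A real fetcher would issue HTTP requests.
Page = tuple[list, Optional[str]]


def fetch_all(first_url: str, fetch_page: Callable[[str], Page], max_pages: int = 0) -> str:
    """Walk pages until there is no next URL or max_pages is reached (0 = unlimited)."""
    merged: list = []
    url: Optional[str] = first_url
    pages = 0
    while url is not None and (max_pages == 0 or pages < max_pages):
        items, url = fetch_page(url)  # items of this page plus the next URL, if any
        merged.extend(items)
        pages += 1
    # The merged items are handed on as a single JSON array.
    return json.dumps(merged)


# Stand-in for a three-page API, keyed by URL (illustrative data only).
FAKE_API = {
    "/items?page=1": ([{"id": 1}, {"id": 2}], "/items?page=2"),
    "/items?page=2": ([{"id": 3}], "/items?page=3"),
    "/items?page=3": ([{"id": 4}], None),
}

merged_json = fetch_all("/items?page=1", FAKE_API.__getitem__)
```

Passing `max_pages=2` to the same call stops after the second page, which mirrors the configured page limit.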

Features

Pagination Strategies

  • Page number — increments a query parameter (e.g. ?page=1&per_page=100); configurable parameter name and starting value (0 or 1)
  • Offset — increments an offset parameter (e.g. ?offset=0&limit=100); configurable offset and limit parameter names
  • Link header (RFC 5988) — follows rel="next" from HTTP Link response headers
  • JSON next link — reads the next-page URL from a dot-notation path inside the JSON response body (e.g. links.next, pagination.next_url)
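
To make the four strategies concrete, here is a rough Python sketch of how each one could derive the next URL. It is an illustration under stated assumptions, not the module's implementation: all function names are hypothetical, and the Link header parser is deliberately simplified (it does not handle commas inside quoted parameter values).

```python
import re
from typing import Any, Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_query_param(url: str, name: str, value: int) -> str:
    """Return url with one query parameter set, keeping the others."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query[name] = str(value)
    return urlunsplit(parts._replace(query=urlencode(query)))


def next_by_page_number(url: str, param: str, current_page: int) -> str:
    # Page number strategy: increment the page parameter.
    return set_query_param(url, param, current_page + 1)


def next_by_offset(url: str, offset_param: str, current_offset: int, limit: int) -> str:
    # Offset strategy: advance the offset by the page size.
    return set_query_param(url, offset_param, current_offset + limit)


def next_from_link_header(header: str) -> Optional[str]:
    # Link header strategy: find the target whose rel value contains "next".
    for target, params in re.findall(r'<([^>]+)>([^,]*)', header):
        rel = re.search(r'\brel\s*=\s*"?([^";]+)"?', params)
        if rel and "next" in rel.group(1).split():
            return target
    return None


def next_from_json(body: Any, path: str) -> Optional[str]:
    # JSON next link strategy: follow a dot-notation path such as "links.next".
    node = body
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node if isinstance(node, str) and node else None
```

With the page number and offset strategies the client computes the next URL itself; with the other two the server supplies it, which is why those URLs are the ones that need validating before use.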

Item Extraction

  • Items key — dot-notation path to extract the items array from a response wrapper (e.g. data, results.items)
  • Root-level JSON array support — when items key is empty, uses the response array directly
  • Single-object wrapping — a bare JSON object response is automatically wrapped in an array
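
The three extraction rules can be expressed as one small function. The Python sketch below is illustrative only; `extract_items` is a hypothetical name, and two behaviors are assumptions that the description above does not specify: a missing key yields an empty list, and a scalar value raises an error.

```python
from typing import Any


def extract_items(decoded: Any, items_key: str = "") -> list:
    """Pull the items array out of a decoded JSON response."""
    node = decoded
    if items_key:
        # Follow the dot-notation path, e.g. "results.items".
        for key in items_key.split("."):
            if not isinstance(node, dict) or key not in node:
                return []  # assumption: a missing key means "no items"
            node = node[key]
    if isinstance(node, list):
        return node    # root-level (or extracted) array is used directly
    if isinstance(node, dict):
        return [node]  # a bare object is wrapped in an array
    # assumption: anything else (string, number, null) is an error
    raise ValueError("response does not contain an array or object")
```

For example, `extract_items({"results": {"items": [1, 2]}}, "results.items")` yields the inner list, while `extract_items({"id": 7})` yields a one-element list.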

Batch Mode

  • Pages per batch — spreads fetching across multiple cron runs; persists resumption state between runs
  • Resumes correctly from the exact page, base URL, and current URL on the next cron run
  • Signals Feeds with setCompleted() when all pages are done
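
Batch mode amounts to saving a small piece of state between runs. The Python sketch below shows the idea under assumptions: `BatchState`, `run_batch`, and `FAKE_API` are hypothetical, the state lives in memory rather than in Drupal storage, and the `completed` flag only stands in for the point at which the module would call setCompleted().

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BatchState:
    base_url: str                # the feed's configured URL
    current_url: Optional[str]   # next URL to fetch; None when nothing is left
    page: int = 0                # number of pages fetched so far
    completed: bool = False


def run_batch(state: BatchState,
              fetch_page: Callable[[str], tuple[list, Optional[str]]],
              pages_per_batch: int) -> list:
    """Fetch at most pages_per_batch pages, updating state in place."""
    items: list = []
    fetched = 0
    while state.current_url is not None and fetched < pages_per_batch:
        page_items, state.current_url = fetch_page(state.current_url)
        items.extend(page_items)
        state.page += 1
        fetched += 1
    state.completed = state.current_url is None
    return items


# Illustrative three-page API; each call to run_batch plays the role of one cron run.
FAKE_API = {
    "/items?page=1": ([{"id": 1}], "/items?page=2"),
    "/items?page=2": ([{"id": 2}], "/items?page=3"),
    "/items?page=3": ([{"id": 3}], None),
}

state = BatchState(base_url="/items", current_url="/items?page=1")
first_run = run_batch(state, FAKE_API.__getitem__, pages_per_batch=2)
state_after_first = (state.page, state.current_url, state.completed)
second_run = run_batch(state, FAKE_API.__getitem__, pages_per_batch=2)
```

After the first run the state points at page 3 and is not yet complete; the second run fetches the remaining page and marks the import as finished.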

Memory & Execution Time Protection

  • Streaming temp file accumulation — writes items to a PHP tmpfile() page-by-page instead of accumulating in a PHP array, reducing peak memory usage
  • Memory threshold — stops the current batch early and saves state if PHP memory usage exceeds a configurable % of memory_limit (default 80%); skipped when memory_limit = -1
  • Execution time threshold — stops early if elapsed time exceeds a configurable % of max_execution_time (default 80%); skipped when max_execution_time = 0
  • Both thresholds log a warning and persist state so the next cron run resumes without data loss
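
The threshold check and the temp-file accumulation can be sketched as follows. This is a Python illustration, not the module's PHP code: `over_threshold` and `accumulate` are hypothetical names, Python's `tempfile` stands in for PHP's tmpfile(), and the `should_stop` callback is an assumed hook where a memory or time check would go.

```python
import json
import tempfile
from typing import Callable, Iterable


def over_threshold(used: float, limit: float, percent: float = 80) -> bool:
    """True when usage exceeds percent of the limit.

    A limit of 0 or below means "no limit" (memory_limit = -1,
    max_execution_time = 0), so the check is skipped.
    """
    if limit <= 0:
        return False
    return used > limit * percent / 100


def accumulate(pages: Iterable[list], should_stop: Callable[[], bool]) -> tuple[str, bool]:
    """Stream items page by page into a temp file holding one JSON array."""
    stopped_early = False
    with tempfile.TemporaryFile("w+", encoding="utf-8") as tmp:
        tmp.write("[")
        wrote_any = False
        for items in pages:
            for item in items:
                tmp.write(("," if wrote_any else "") + json.dumps(item))
                wrote_any = True
            # Check resource usage after each page; stop the batch if exceeded.
            if should_stop():
                stopped_early = True
                break
        tmp.write("]")
        tmp.seek(0)
        return tmp.read(), stopped_early
```

Writing each item as it arrives means the full list of decoded items is never held in memory at once; only the final serialized array is read back.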

Resilience & Retry

  • Retry on transient failures — configurable retry count (default 3) with exponential backoff
  • Retries on: connection errors (ConnectException), HTTP 5xx (ServerException), HTTP 429 rate limit (ClientException 429)
  • Retry-After header support — respects the server-supplied delay on 429 responses
  • Non-retryable errors (4xx other than 429, invalid JSON) fail immediately
  • Each retry attempt is always logged as a warning, regardless of the verbose logging setting
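
The retry behavior can be illustrated with a short Python sketch. It is not the module's code: `TransientError` is a hypothetical stand-in for the retryable Guzzle exceptions listed above, and the one-second base delay is an assumption, since the page does not state the actual backoff values.

```python
import time
from typing import Callable, Optional


class TransientError(Exception):
    """Stand-in for a retryable failure (connection error, 5xx, or 429)."""

    def __init__(self, message: str, retry_after: Optional[float] = None):
        super().__init__(message)
        self.retry_after = retry_after  # seconds from a Retry-After header, if any


def backoff_delay(attempt: int, base: float = 1.0,
                  retry_after: Optional[float] = None) -> float:
    """Exponential backoff (base, 2*base, 4*base, ...) unless the server gave a delay."""
    if retry_after is not None:
        return retry_after
    return base * (2 ** (attempt - 1))


def with_retries(request: Callable[[], object], max_retries: int = 3,
                 sleep: Callable[[float], None] = time.sleep) -> object:
    """Call request(), retrying only on TransientError, up to max_retries times."""
    attempt = 0
    while True:
        try:
            return request()
        except TransientError as exc:
            attempt += 1
            if attempt > max_retries:
                raise  # retries exhausted
            sleep(backoff_delay(attempt, retry_after=exc.retry_after))
        # Any other exception (e.g. a 4xx other than 429) propagates immediately.
```

The `sleep` parameter exists only so the delays can be observed or replaced in tests; with the defaults, three failed attempts wait 1, 2, and 4 seconds.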

Request Configuration

  • Request timeout — per-request read timeout in seconds (default 30)
  • Connection timeout — separate Guzzle connect timeout (default 10), independent of the read timeout
  • Extra query parameters — static URL-encoded params appended to every request (e.g. api_key=abc&format=json)
  • Custom request headers — one Name: Value per line (e.g. Authorization: Bearer token)
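
As a rough illustration of the two text settings, the Python sketch below parses a "Name: Value" block into headers and appends extra query parameters to a URL. The function names are hypothetical, and skipping malformed header lines is an assumption rather than documented behavior.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def parse_headers(text: str) -> dict[str, str]:
    """Turn 'Name: Value' lines into a header dictionary."""
    headers: dict[str, str] = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip():  # assumption: lines without a name are skipped
            headers[name.strip()] = value.strip()
    return headers


def append_query(url: str, extra: str) -> str:
    """Append static URL-encoded parameters to whatever query the URL already has."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs += parse_qsl(extra, keep_blank_values=True)
    return urlunsplit(parts._replace(query=urlencode(pairs)))
```

For instance, `append_query("https://api.example.com/items?page=1", "api_key=abc&format=json")` keeps `page=1` and adds both extra parameters after it.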

Safety & Security

  • Maximum pages — hard cap on total pages fetched per import run (0 = unlimited)
  • SSRF protection — server-supplied next-page URLs (Link headers, JSON next link) are validated; only http:// and https:// schemes accepted
  • HTTP header injection prevention — CR/LF characters stripped from all custom header names and values
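
The two security checks are simple to express. The Python sketch below is an illustration with hypothetical function names; requiring a host in addition to an http or https scheme is an assumption beyond what the page states, and the module's real validation may differ.

```python
from urllib.parse import urlsplit


def is_allowed_next_url(url: str) -> bool:
    """Accept only absolute http:// or https:// URLs supplied by the server."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL
    # assumption: a host is required as well as an allowed scheme
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def strip_crlf(text: str) -> str:
    """Remove CR and LF so a header name or value cannot start a new header line."""
    return text.replace("\r", "").replace("\n", "")
```

A next-page value such as `file:///etc/passwd` is rejected by the first check, and a header value containing an embedded line break is flattened to a single line by the second.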

Logging

  • Verbose import logging — per-feed toggle; logs page URLs, item counts, pagination decisions, batch state changes, and resource threshold triggers
  • Always-on error logging — HTTP failures, invalid JSON, non-array responses, and retry attempts are always written to the feeds_paginated_fetcher watchdog channel regardless of the verbose flag

UI & Configuration

  • Per-feed configuration form with conditional field visibility (strategy-specific fields shown/hidden via #states)
  • Settings placed in a Pagination settings vertical tab alongside Feeds' built-in tabs
  • Form validation for URL scheme, extra query params format, per-page minimum, next-link path requirement, and percentage field ranges
  • All settings have sensible defaults; existing feeds without new config keys automatically receive defaults

Compatibility

  • Drupal 10 and 11
  • Requires the Feeds 3.x contrib module
  • PHP 8.3+

Post-Installation

Follow README.md for all post-installation configuration steps, with examples.

Activity

Total releases: 2
First release: Mar 2026
Latest release: 16 hours ago
Release cadence: 0 days
Stability: 0% stable

Releases

Version      Type         Release date
1.0.0-rc1    Pre-release  Mar 22, 2026
1.0.x-dev    Dev          Mar 22, 2026