BOOTSTRAP YOUR DATABASE

Stream Snapshots

Stream Snapshots provide the latest event for each entity in a Companies House stream. Use them to bootstrap your own database, then keep it current from the official Streaming API.

Why it exists

The Companies House Streaming API is well suited to modern pipelines, but it only delivers events from the point you connect, plus a limited window of recent history. For a complete local database, you still need a compatible starting state.

Stream Snapshots fill that gap by collecting years of previous stream events and publishing the most recent event for each entity. The snapshot is used once for the initial load. After that, your system should follow the first-party Companies House stream.

Delivery format

Source-shaped files for direct ingestion.

Each file is JSON lines. Each line contains one latest event, untransformed from the Companies House Streaming API format.

Delivery Format: .json.zst

Compressed JSON lines, suitable for streaming import tools and standard decompression workflows.
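As a rough illustration of reading the format, the sketch below parses JSON lines from any text stream that has already been decompressed. It assumes decompression happens outside Python, for example by piping the output of `zstd -dc` into the script. The function name `iter_events` and the sample records are hypothetical and only show the expected shape.

```python
import io
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one event dict per non-empty line of decompressed JSON lines."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)


# Illustrative sample only; real events carry the full Streaming API payload.
sample = io.StringIO(
    '{"resource_uri": "/company/00000001", "event": {"timepoint": 101}}\n'
    "\n"
    '{"resource_uri": "/company/00000002", "event": {"timepoint": 102}}\n'
)
events = list(iter_events(sample))
```

Because the function accepts any iterable of strings, the same code works with `sys.stdin`, an open text file, or a decompression library's text reader.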

Schema: Unchanged

No translation layer is introduced between the snapshot and subsequent Streaming API events.

Entity Identifier: resource_uri

Use the event resource URI as the record key when loading the latest entity image.
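One way to key records on resource_uri is an upsert into a table whose primary key is the URI. The sketch below uses SQLite from the Python standard library. The table name `entities`, its columns, and the function `load_events` are assumptions made for illustration, and the upsert syntax needs SQLite 3.24 or later.

```python
import json
import sqlite3
from typing import Iterable


def load_events(conn: sqlite3.Connection, events: Iterable[dict]) -> None:
    """Store the latest event per resource_uri (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entities ("
        " resource_uri TEXT PRIMARY KEY,"
        " timepoint INTEGER NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO entities (resource_uri, timepoint, payload)"
        " VALUES (?, ?, ?)"
        " ON CONFLICT(resource_uri) DO UPDATE SET"
        " timepoint = excluded.timepoint, payload = excluded.payload"
        # Only overwrite when the incoming event is newer.
        " WHERE excluded.timepoint > entities.timepoint",
        (
            (e["resource_uri"], e["event"]["timepoint"], json.dumps(e))
            for e in events
        ),
    )
    conn.commit()


# Illustrative records only.
conn = sqlite3.connect(":memory:")
load_events(conn, [
    {"resource_uri": "/company/00000001", "event": {"timepoint": 101}},
    {"resource_uri": "/company/00000001", "event": {"timepoint": 105}},
    {"resource_uri": "/company/00000002", "event": {"timepoint": 103}},
])
rows = conn.execute(
    "SELECT resource_uri, timepoint FROM entities ORDER BY resource_uri"
).fetchall()
```

A snapshot should contain one event per entity, so conflicts are not expected during the initial load. The upsert mainly makes the load safe to re-run and lets the same table absorb later stream events.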

Continuation Token: event.timepoint

Use the maximum timepoint in your loaded data to resume from the official stream.
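Finding the continuation token amounts to a single pass over the loaded events. A minimal sketch, assuming each event is a dict with the timepoint nested under `event` as the field name above suggests; `max_timepoint` is a hypothetical helper name.

```python
from typing import Iterable, Optional


def max_timepoint(events: Iterable[dict]) -> Optional[int]:
    """Return the highest event.timepoint seen, or None if there are no events."""
    return max((e["event"]["timepoint"] for e in events), default=None)


# Illustrative records only.
loaded = [
    {"resource_uri": "/company/00000001", "event": {"timepoint": 105}},
    {"resource_uri": "/company/00000002", "event": {"timepoint": 103}},
]
latest = max_timepoint(loaded)
```

If the data already sits in a database, an aggregate query over the stored timepoint column gives the same value without re-reading the file.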

How to use

The important constraint is timing. We recommend connecting to the official stream within 7 days of the snapshot being generated, because the Streaming API only exposes a limited recent history.
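The timing constraint can be checked before starting a load. The sketch below assumes you know when the snapshot was generated; how that timestamp is obtained is not specified here, and `within_window` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# The 7-day recommendation stated above.
RECOMMENDED_WINDOW = timedelta(days=7)


def within_window(snapshot_generated_at: datetime, now: datetime) -> bool:
    """True if the snapshot is still inside the recommended 7-day window."""
    return now - snapshot_generated_at <= RECOMMENDED_WINDOW


# Illustrative dates only.
generated = datetime(2024, 1, 1, tzinfo=timezone.utc)
fresh = within_window(generated, datetime(2024, 1, 6, tzinfo=timezone.utc))
stale = within_window(generated, datetime(2024, 1, 9, tzinfo=timezone.utc))
```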

  1. Download the snapshot for the stream you want to maintain.
  2. Load the .json.zst file as JSON lines, using resource_uri as the primary key.
  3. Find the maximum event.timepoint in the loaded data.
  4. Connect to the official Companies House Streaming API from the next timepoint.
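The final step can be sketched as building the URL to resume from. This assumes the Streaming API is served from `https://stream.companieshouse.gov.uk`, that each stream is a path under that host, and that it accepts a `timepoint` query parameter; check all three against the official Companies House documentation before relying on them. Authentication and the long-lived HTTP connection are left out, and `resume_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed base URL for the official Streaming API; verify against the docs.
STREAM_BASE_URL = "https://stream.companieshouse.gov.uk"


def resume_url(stream: str, max_loaded_timepoint: int) -> str:
    """Build a URL that resumes a stream from the timepoint after the snapshot."""
    query = urlencode({"timepoint": max_loaded_timepoint + 1})
    return f"{STREAM_BASE_URL}/{stream}?{query}"


# Illustrative value: 105 stands in for the maximum timepoint you loaded.
url = resume_url("companies", 105)
```

Passing the stream name as an argument lets the same helper serve each snapshot you load, since every stream has its own maximum timepoint.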

Stream coverage

The snapshots follow the nine entity streams exposed by the Companies House Streaming API. Most teams start with companies, officers, and persons with significant control.

Primary use cases

  • companies
  • officers
  • persons-with-significant-control

Event history

  • filings
  • charges
  • insolvency-cases

Specialist records

  • disqualified-officers
  • company-exemptions
  • persons-with-significant-control-statements

Access and pricing

Downloads are delivered through pre-signed URLs to a secure storage bucket. You need a Companies Catalogue account to generate a URL, which helps limit automated abuse of the service.

Most snapshots are free

Signing up only requires an email address and password. Card details are not required for the free streams.

Officers and filings are paid products

Officers and filings cost GBP 1,000 each as a one-off purchase. The purchase covers one snapshot download, after which your database should be maintained from the official stream.

Alternatives

The official bulk products remain useful, but they do not always line up neatly with the Streaming API. Companies data is distributed as a monthly CSV file. Officer data is supplied in a legacy fixed-width file. PSC data is closer in shape, but still lacks stream timepoints.

For several other entity types, there is no straightforward official bulk route to a complete historical dataset. Stream Snapshots are intended for teams that want the stream event format from the first row of their own database.