
Daily extracts from the Charity Commission

Charity Register Data

The Charity Commission for England and Wales publishes daily bulk extracts of its register of charities. We convert the original ZIP deliveries into pipeline-ready Parquet, zstd-compressed JSON Lines and zstd-compressed CSV.

The delivery challenge

The official extracts are delivered daily as ZIP archives containing JSON and TSV files. While public, the packaging makes them difficult to use in modern streaming pipelines.

A ZIP archive stores its central directory (the index of its contents) at the end of the file, so it cannot be reliably stream-decompressed from a remote URL: the entire archive must be downloaded before processing can begin. The JSON is provided as a single top-level array rather than JSON Lines; for files over 1 GB this is awkward to parse incrementally and too large for many pipelines to load fully into memory.
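To illustrate why the archive layout matters, here is a minimal sketch using only the Python standard library. It builds a tiny ZIP in memory (the member name and payload are placeholders, not real register data) and locates the end-of-central-directory record, which sits at the very end of the archive and points back to the central directory.

```python
import io
import struct
import zipfile

# Build a small in-memory archive; the member name and payload are placeholders.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("example.json", '[{"id": 1}, {"id": 2}]')
data = buf.getvalue()

# The end-of-central-directory (EOCD) record starts with the signature PK\x05\x06.
eocd_offset = data.rfind(b"PK\x05\x06")

# Bytes 16..19 of the EOCD record hold the offset of the central directory.
(central_dir_offset,) = struct.unpack_from("<I", data, eocd_offset + 16)

print(f"archive size:              {len(data)} bytes")
print(f"EOCD record starts at:     {eocd_offset}")
print(f"central directory starts:  {central_dir_offset}")
```

A reader that has received only the first part of a download has not yet seen this index, which is why Python's `zipfile` needs a seekable file to read an archive.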

Our approach

Clean formats for data pipelines.

We retrieve the daily ZIPs, decompress them, parse the contents and republish the data in formats designed for high-performance ingestion, streaming and analytics workloads.

Formats: Parquet, ndjson, CSV

Choose the format that best suits your stack. All are compressed where appropriate for efficient transfer and storage.

Streaming support: JSON Lines

Zstd-compressed JSON Lines (ndjson) lets you process records one at a time without loading the whole file.
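As a rough sketch of record-at-a-time processing, the parser below accepts any iterable of text lines and yields one record per line. The `open_zstd_text` helper is a hypothetical convenience that assumes the third-party `zstandard` package is installed and that the files are UTF-8 encoded; neither detail is confirmed by the source, and the sample records are made up.

```python
import io
import json
from typing import Iterable, Iterator


def iter_json_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def open_zstd_text(path: str) -> io.TextIOWrapper:
    """Open a .zst file as a text stream (hypothetical helper).

    Assumes the third-party `zstandard` package and UTF-8 content.
    """
    import zstandard  # imported lazily so the parser has no extra dependency

    raw = open(path, "rb")
    reader = zstandard.ZstdDecompressor().stream_reader(raw)
    return io.TextIOWrapper(reader, encoding="utf-8")


# Demonstration on an in-memory stream with made-up records.
sample = io.StringIO(
    '{"id": 1, "name": "Example A"}\n\n{"id": 2, "name": "Example B"}\n'
)
records = list(iter_json_lines(sample))
print(records)
```

With a real download, the same loop would read from `open_zstd_text(...)` instead of the in-memory sample, so only one record is held in memory at a time.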

Analytics ready: Parquet

Columnar Parquet files are ideal for analytical queries, Spark, DuckDB, BigQuery and other engines.
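One practical benefit of a columnar layout is that a reader can load only the columns it needs. The sketch below assumes the third-party `pyarrow` package is installed; the file name and column names in the commented usage are hypothetical and are not taken from the published schema.

```python
from typing import Sequence


def read_columns(path: str, columns: Sequence[str]):
    """Read only the named columns from a Parquet file into a pyarrow Table."""
    import pyarrow.parquet as pq  # third-party; imported lazily

    return pq.read_table(path, columns=list(columns))


# Hypothetical usage: neither the file name nor the column names come from
# the source, so substitute the real ones from the published files.
# table = read_columns("charity.parquet", ["charity_number", "charity_name"])
# print(table.num_rows)
```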

Universal: CSV + zstd

Zstd-compressed CSV remains simple to load with standard tools while keeping transfer sizes manageable.

What the data covers

The register contains core information about charities together with supporting records on finances, governance, operations and regulatory events. The core Charity table is the essential foundation; all other records relate back to it.

Core Charity Register (required)

The Charity record is the primary master table and the single most important dataset for any integration. It provides the core profile for approximately 400,000 charities and related parent/child organisations, including registration details, status, contact information, basic financial summaries and key attributes. Every other table links back to these records.
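Because every other table links back to the Charity record, a typical integration groups supporting records by the charity they belong to. The sketch below shows that pattern on made-up data; the key name `charity_id` and all field names are hypothetical placeholders, since the source does not name the linking field.

```python
from collections import defaultdict
from typing import Iterable


def group_by_charity(records: Iterable[dict], key: str) -> dict:
    """Group supporting records by the value of their linking key."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[key]].append(record)
    return dict(grouped)


# Made-up records; "charity_id" is a placeholder for the real linking field.
charities = [
    {"charity_id": 1, "name": "Example A"},
    {"charity_id": 2, "name": "Example B"},
]
trustees = [
    {"charity_id": 1, "trustee": "Trustee One"},
    {"charity_id": 1, "trustee": "Trustee Two"},
]

trustees_by_charity = group_by_charity(trustees, "charity_id")
for charity in charities:
    linked = trustees_by_charity.get(charity["charity_id"], [])
    print(charity["name"], len(linked))
```

Building the lookup once keeps the join linear in the number of records, which matters when the supporting tables are much larger than the core table.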

Financial & Reporting History

Submission history of annual returns with key dates and high-level financial figures, plus detailed Part A financials (income, expenditure, fundraising, staff salaries and related metrics) and Part B financials for larger charities (full income/expenditure breakdowns, assets, liabilities and reserves).

Governance & Structure

Current trustees and chairs with appointment details, the charity’s governing documents and charitable objects, and details of registered policies in place.

Operations & Purpose

Activity classifications that describe what the charity does, who it helps and the methods used, together with the geographic areas of operation.

Regulatory & Compliance

Published reports, inquiries and warnings issued by the Commission, records of oversight by other regulators, and historical events such as registrations, removals and transfers.

Identifiers & Aliases

Working names, previous names and other identifiers used by charities over time.

Access

The charity data files are all available free of charge, without authentication, on a public storage bucket. The core Charity table is the essential starting point for almost any use of the register.

Modern pipeline formats

Parquet for analytics, zstd JSON Lines for streaming import, and zstd CSV for broad compatibility. No more full-archive downloads or memory-intensive array parsing before you can begin processing.

Daily cadence preserved

We process the Commission’s daily publications so you receive fresh data on the same schedule, in formats that drop straight into your existing ingestion workflows. All files are hosted on a public storage bucket with no authentication required.