Financial Accounts ready for ingestion

Tabular financial data

Bypass the complexity of raw XBRL parsing. We provide high-performance CSV and Parquet datasets for UK company financials.

The format challenge

UK company accounts set out the financial position of each business, which makes them a highly valuable data source. The vast majority are filed in machine-readable XBRL format.

However, processing XBRL at scale is technically challenging. A specification-compliant parser must resolve extensive taxonomy references, and the varying outputs from different accounting software add further complexity. Open-source parsers can take several seconds per document, making bulk ingestion of millions of accounts a slow and costly undertaking.

<xbrli:context id="company">
  <xbrli:entity>
    <xbrli:identifier scheme="http://www.companieshouse.gov.uk/">11932084</xbrli:identifier>
  </xbrli:entity>
  <xbrli:period>
    <xbrli:startDate>2025-05-01</xbrli:startDate>
    <xbrli:endDate>2026-04-30</xbrli:endDate>
  </xbrli:period>
</xbrli:context>

<ix:nonFraction contextRef="company-period-end" decimals="0" format="ixt:numdotdecimal" name="core:FixedAssets" scale="0" unitRef="GBP">854</ix:nonFraction>
<ix:nonFraction contextRef="company-period-end" decimals="0" format="ixt:numdotdecimal" name="core:CurrentAssets" scale="0" unitRef="GBP">150,031</ix:nonFraction>
<ix:nonFraction contextRef="company-period-end" decimals="0" format="ixt:numdotdecimal" name="core:TotalAssetsLessCurrentLiabilities" scale="0" unitRef="GBP">150,164</ix:nonFraction>
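
Even these three facts need format-aware handling: the displayed text "150,031" is not a number until the thousands separator, the scale attribute and any sign attribute have been applied. The following is a minimal Python sketch of that one step, not a specification-compliant parser. It assumes the Inline XBRL 1.1 namespace, wraps the facts in a hypothetical root element so the snippet is self-contained, handles only the ixt:numdotdecimal format, and ignores contexts, units and taxonomy resolution entirely.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Assumption: the filing uses the Inline XBRL 1.1 namespace.
IX_NS = "http://www.xbrl.org/2013/inlineXBRL"

# The facts from the excerpt above, wrapped in a hypothetical root element
# that declares the namespace so the sample parses on its own.
SAMPLE = f"""<root xmlns:ix="{IX_NS}">
<ix:nonFraction contextRef="company-period-end" decimals="0" format="ixt:numdotdecimal" name="core:FixedAssets" scale="0" unitRef="GBP">854</ix:nonFraction>
<ix:nonFraction contextRef="company-period-end" decimals="0" format="ixt:numdotdecimal" name="core:CurrentAssets" scale="0" unitRef="GBP">150,031</ix:nonFraction>
<ix:nonFraction contextRef="company-period-end" decimals="0" format="ixt:numdotdecimal" name="core:TotalAssetsLessCurrentLiabilities" scale="0" unitRef="GBP">150,164</ix:nonFraction>
</root>"""


def read_numeric_facts(xml_text: str) -> dict[str, Decimal]:
    """Return {fact name: value} for ixt:numdotdecimal facts only."""
    facts = {}
    for element in ET.fromstring(xml_text).iter(f"{{{IX_NS}}}nonFraction"):
        if element.get("format") != "ixt:numdotdecimal":
            continue  # other display formats need their own conversion rules
        # numdotdecimal: comma as thousands separator, dot as decimal point
        digits = (element.text or "").strip().replace(",", "")
        # scale is a power of ten applied to the displayed number
        value = Decimal(digits).scaleb(int(element.get("scale", "0")))
        if element.get("sign") == "-":
            value = -value
        facts[element.get("name")] = value
    return facts


facts = read_numeric_facts(SAMPLE)
```

A real filing adds many more formats, contexts, units and taxonomy versions on top of this, which is where most of the parsing cost comes from.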

The Solution

A trivial data pipeline.

Companies Catalogue transforms this complex parsing problem into a straightforward operation. You can run a daily batch process to fetch the latest accounts and maintain a queryable local database.
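
As a rough illustration of the "queryable local database" step, the sketch below loads one CSV batch into SQLite using only the Python standard library. The column names, the table layout and the sample row are assumptions made for illustration (the row reuses the figures from the XBRL excerpt); they are not the published schema, so check the actual file headers before adapting it.

```python
import csv
import io
import sqlite3

# Hypothetical sample batch: column names are assumed, not the real schema.
SAMPLE_CSV = """companies_house_registered_number,balance_sheet_date,current_assets
11932084,2026-04-30,150031
"""


def load_accounts(conn: sqlite3.Connection, csv_file) -> int:
    """Upsert one CSV batch into a local table; return the rows read."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS accounts (
               company_number TEXT NOT NULL,
               balance_sheet_date TEXT NOT NULL,
               current_assets REAL,
               PRIMARY KEY (company_number, balance_sheet_date)
           )"""
    )
    rows = [
        (
            row["companies_house_registered_number"],
            row["balance_sheet_date"],
            # Empty cells become NULL rather than failing the conversion
            float(row["current_assets"]) if row["current_assets"] else None,
        )
        for row in csv.DictReader(csv_file)
    ]
    # INSERT OR REPLACE keeps a daily re-run idempotent for the same key
    conn.executemany("INSERT OR REPLACE INTO accounts VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
load_accounts(conn, io.StringIO(SAMPLE_CSV))
```

Keying the table on company number and balance sheet date is one possible design choice; it means re-loading the same day's file does not create duplicates.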

Delivery
CSV & Parquet

Standard tabular formats optimised for modern data pipelines, analytics, and direct database loading.

Data Points
38 Fields

The most common financial data points, extracted cleanly without the need to resolve raw XBRL taxonomies.

Cadence
Daily Updates

The dataset is updated each day to include all electronically filed accounts from the previous day.

Infrastructure
Public Storage

Hosted on reliable object storage in the Western Europe region to ensure high-performance downloads for your ingestion pipelines.

Open Source Pipeline


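import os

import boto3
from stream_read_xbrl import stream_read_xbrl_sync_s3_csv

if __name__ == '__main__':
    # London region; the AWS region name has a hyphen before the number
    s3_client = boto3.client('s3', region_name='eu-west-2')
    # Destination bucket, read from the environment (fails fast if unset)
    bucket_name = os.environ['XBRL_CSV_BUCKET']
    # Key prefix for the CSV objects, ending in a forward slash
    key_prefix = 'xbrl/'

    # Sync the parsed accounts data to S3 as CSV under the given prefix
    stream_read_xbrl_sync_s3_csv(s3_client, bucket_name, key_prefix)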

Powered by efficient tooling

This tabular accounts dataset is made possible by the stream-read-xbrl package, an open-source Python library developed by the UK Government Department for Business and Trade.

Using this high-performance parser, we process all electronically filed accounts and convert them into our accessible formats. By maintaining this pipeline, we put financial data within reach of small software and data teams; previously it was accessible only to large enterprises with the resources to build custom XBRL processors.

Access and pricing

The data product is freely available. We believe that improving access to company financial records benefits the wider ecosystem of UK businesses.

Recent data is open to all

The last two years of accounts data can be downloaded from a public URL without needing to register or create an account.

This supports automated pipelines: your servers can download the data over the public internet without authentication.
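
Because no credentials are involved, the download step of a pipeline can be a plain HTTP GET. The sketch below uses only the Python standard library. The URL is a placeholder, since the real download location is not given in this text, and the function name and the temporary ".part" file convention are illustrative choices rather than anything Companies Catalogue prescribes.

```python
import shutil
import urllib.request
from pathlib import Path

# Placeholder only: substitute the real public download URL.
EXAMPLE_URL = "https://example.com/accounts/latest.csv"


def download_file(url: str, dest: Path, timeout: float = 60.0) -> int:
    """Stream url to dest with no auth headers; return the bytes written."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so a failed transfer never leaves
    # a truncated file at the final path.
    partial = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(partial, "wb") as out:
            shutil.copyfileobj(response, out)
    partial.replace(dest)
    return dest.stat().st_size
```

A daily job could call download_file(EXAMPLE_URL, Path("data/accounts.csv")) with the real URL substituted, then hand the file to whatever loading step follows.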

Full history requires a free account

For further history going back to 2008, a Companies Catalogue account is required. This prevents automated abuse of our servers; the data remains available at no cost.