Data as a Balance Sheet Asset: Valuation Methods for the AI Era
In December 2024, the System of National Accounts (SNA 2025) made an official decision: data is now to be treated as a productive capital asset, comparable in conceptual status to plant and equipment, software, and intellectual property. This represents a watershed moment in the economic treatment of intangible assets. For the first time, a major accounting framework explicitly recognises that data — structured, owned, and used to generate economic returns — meets the definition of capital: a productive resource that generates future economic benefits.
The policy decision is clear. The practical question is harder: how do you value data on a balance sheet? I have spent thirty years structuring asset-backed securities and evaluating assets for lending and investment. I have learned that valuing anything requires answering three questions: what did it cost to create, what will someone pay for it, and what returns does it generate. Data is no different.
- £2.9T: estimated value of data assets globally (2025)
- 68%: share of data's value derived from its use in AI, not from direct sales
- 3.2x: average revenue multiplier from proprietary data assets in financial services
When Does Data Meet the Definition of an Asset?
Before exploring valuation methods, it is worth clarifying when data qualifies as a balance sheet asset under SNA 2025 and, increasingly, under emerging IFRS guidance.
Data is a capital asset if:
It is controlled — The organisation has the legal right to exclude others from using it. This means proprietary data (data the company has collected, compiled, or licensed exclusively), not public data or data the organisation merely accesses.
It generates future economic benefits — The data must be used to generate revenue, reduce costs, improve product quality, or create competitive advantage. Training data for an internal AI model meets this criterion. Unused datasets in a data warehouse do not.
The benefits are probable — The organisation must have reasonable confidence that the data will generate the expected economic returns. Historical performance or a documented use case is required.
The asset is separately identifiable — The dataset must be defined, bounded, and measurable separately from other assets. "All our customer data" is too vague. "Customer transaction data from 2018-2025, cleaned and deduplicated, used to train our fraud detection model" is separately identifiable.
Not all data meets these criteria. A company's generic operational data, which no one outside the organisation would pay for and which is not used in any revenue-generating or cost-reducing application, is typically an expense, not an asset.
But proprietary data assets — datasets that a company has invested in developing, maintaining, and using to create competitive advantage — absolutely qualify. A financial services firm's proprietary credit risk data. A healthcare company's de-identified patient outcome data. An ecommerce company's user behavioural data used to power recommendation engines. These are assets.
★ Key Takeaway
Data is a capital asset if it is controlled, generates measurable future economic benefits, is probable to produce those benefits, and is separately identifiable. Not all data meets these criteria, but proprietary datasets used in competitive applications do.
The Three Valuation Approaches: Framework
Standard valuation practice, whether for real estate, equipment, or intellectual property, employs three methodologically distinct approaches. Each provides a different perspective on value.
Approach 1: Cost Approach
The cost approach values an asset at what it cost to create it. This is the most straightforward approach but also the least economically meaningful. A book is not worth what it cost to print it. But for an asset-in-creation, cost provides a lower bound on value.
For data, the cost approach captures:
Data acquisition: If data was purchased, licensed, or acquired from third parties, the direct cost of acquisition is the floor.
Data creation: If data was generated internally — through sensors, transactions, user interactions — the cost to create and collect the data. This includes infrastructure (servers, sensors), personnel (data engineers, analysts), and operational costs.
Data preparation: Cleaning, deduplication, standardisation, privacy protection (anonymisation, encryption). This is often 60-70% of the total cost of usable data. Raw data is typically not valuable — prepared data is.
Data integration: Combining data from multiple sources, structuring for use, building data pipelines. This is core infrastructure investment.
Compliance and governance: Legal review, compliance controls, security measures, documentation. GDPR, CCPA, and equivalent compliance costs can be substantial.
The cost approach answer: A company invested £2 million acquiring, cleaning, integrating, and securing a customer transaction dataset. Under this approach, the dataset is worth at least £2 million, although the true economic value may be higher or lower: costs are sunk and historical, not forward-looking.
ℹ Note
The cost approach provides a lower bound and a gut-check, but it is economically backward-looking. A dataset that cost £2 million to create but generates £20 million in annual revenue has far more value than its cost. Conversely, a dataset that was cheap to create but is now obsolete has less value than it cost.
Approach 2: Market Approach
The market approach values an asset by reference to what similar assets sell for in an active market. If a large market for comparable datasets existed, market prices would be the most reliable valuation. In reality, markets for data are nascent and thin.
However, some data markets do exist, and they provide useful reference points:
Data brokers and information providers: Companies like Nielsen, Equifax, CoreLogic, and Crunchbase sell datasets or data subscriptions. These markets show that data has a price. A subscription to a financial data service might cost £10,000-£1 million annually, depending on scope and exclusivity.
Comparable data transactions: When companies are acquired or divested, the acquiring firm may value the data separately. M&A databases and patent/IP licensing databases sometimes reveal data values in transaction documentation. A venture capital database with client profiles might be valued at £500K-£5 million in a transaction, depending on coverage and quality.
Data licensing and affiliate models: Some companies license their data to third parties. A company that licenses its proprietary customer data to complementary service providers for £500K annually is implicitly valuing that data at the capitalised value of the licence revenue (at a 20% capitalisation rate, this implies a data asset value of £2.5 million).
Industry benchmarks: Surveys occasionally capture the cost of data purchases or subscriptions. A benchmark suggesting that healthcare organisations pay an average of £50,000 for access to de-identified patient outcome data on condition X helps value a proprietary dataset of similar scope.
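The licensing capitalisation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a level perpetual revenue stream; the function name is mine, not a standard API:

```python
# Capitalising an annual licence fee into an implied data-asset value.
# Assumes a level perpetual revenue stream, as in the licensing example.

def capitalised_value(annual_revenue: float, cap_rate: float) -> float:
    """Value of a level perpetual revenue stream: revenue / rate."""
    return annual_revenue / cap_rate

# £500K annual licence revenue capitalised at 20%
print(f"£{capitalised_value(500_000, 0.20):,.0f}")  # £2,500,000
```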
The market approach answer: Comparable proprietary customer datasets have sold for 0.5x to 2.0x annual revenue in M&A transactions. This company's customer data, if monetised through licensing, might generate £1 million annually. Applying a 1.0x multiple suggests a value of £1 million.
The limitations of the market approach: Data markets are immature. Truly comparable transactions are rare. Most data value is realised internally (cost reduction, revenue uplift) rather than through sale. Market comparables, when available, are often poorly documented.
Approach 3: Income Approach
The income approach values an asset based on the future economic returns it will generate, discounted to present value. This is the most economically sound approach and the most widely used in structured finance and investment analysis.
For data, the income approach captures the quantifiable value the data creates:
Cost avoidance: Proprietary data used to optimise operations generates value through cost reduction. A logistics company with proprietary route optimisation data reduces fuel costs. A retailer with proprietary inventory forecasting data reduces holding costs. Quantify the annual cost savings the data enables, discount to present value, and you have the data's value as a cost-avoidance asset.
Revenue uplift: Proprietary data used in product development or customer engagement generates incremental revenue. An ecommerce company with proprietary user behaviour data powering recommendation engines generates measurable uplift in conversion rate and average order value. A financial services firm with proprietary risk data can originate loans at lower cost than competitors. Quantify the incremental annual revenue attributable to the data, apply a contribution margin, and discount to present value.
Risk reduction: Proprietary data used in risk assessment generates value through reduced losses. A credit card issuer with proprietary fraud detection data reduces fraud losses. An insurance company with proprietary claims data reduces claims severity. The value is the discounted present value of avoided losses.
Competitive premium: Proprietary data that competitors cannot easily replicate generates a sustainable competitive advantage. This advantage translates into premium pricing, higher margins, or market share. The value is the discounted present value of the incremental profit attributable to competitive advantage derived from proprietary data.
Income Approach in Practice: Financial Services Example
A regional bank invests £3 million to acquire and integrate a proprietary credit risk dataset covering 500,000 SMEs. The dataset contains 10 years of historical credit performance, funding sources, and market indicators. The bank uses this data to originate loans to SMEs at lower interest rates (higher margins due to lower assessed risk).

The data enables the bank to originate an incremental £50 million in SME loans annually, with a 15% contribution margin (profit after cost of funds and overhead). Annual profit from the data-enabled business is therefore £7.5 million. The bank expects the data to remain competitively useful for 7 years before regulatory changes or market developments reduce its value.

Discounting at 8% (appropriate for a data-dependent competitive advantage), the present value is £7.5M / 1.08 + £7.5M / 1.08^2 + ... + £7.5M / 1.08^7 ≈ £39 million. The dataset's value is £39 million.
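This discounted cash flow can be sketched directly. A minimal illustration, assuming level end-of-year cash flows; the helper name is my own:

```python
# Income-approach DCF for the bank example: a level £7.5M annual profit
# for 7 years, discounted at 8%, with cash flows at each year-end.

def present_value(annual_profit: float, rate: float, years: int) -> float:
    """Sum of discounted end-of-year cash flows over the asset's life."""
    return sum(annual_profit / (1 + rate) ** t for t in range(1, years + 1))

pv = present_value(7_500_000, 0.08, 7)
print(f"Present value: £{pv:,.0f}")  # roughly £39 million
```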
The Three Approaches Applied: A Sample Dataset
Consider a fictional but realistic case: a healthcare technology company with a proprietary dataset of de-identified patient outcomes from 2 million patient records, covering 15 years of treatment data for cardiac conditions.
Cost Approach Valuation:
- Initial data acquisition from hospital partners: £500K
- Data cleaning, de-identification, compliance (HIPAA, local regulations): £1.5M
- Ongoing data maintenance, updates, quality assurance (amortised over 10 years): £200K/year = £2M over lifetime
- Infrastructure (secure data warehouse, access controls, backups): £800K
- Legal and compliance (IP protection, licensing agreements): £300K
- Total cost: £5.1 million
Cost approach value: £5.1 million (lower bound).
Market Approach Valuation:
- Research shows healthcare datasets with similar scope (2+ million records, 10+ years, clinical breadth) have sold in M&A transactions at valuations ranging from £2 million to £30 million
- The variation reflects data quality, exclusivity, and use case
- This dataset is proprietary and exclusive but faces increasing competition from biotech firms assembling similar datasets
- Comparable transaction suggests valuation range: £8-£15 million
- Market approach value: £10 million (mid-range of comparables)
Income Approach Valuation:
- The company licenses the data to pharmaceutical firms for clinical trial design and pharmacoeconomic analysis
- Annual licensing revenue: £2 million
- The company also uses the data to build AI-powered diagnostic tools, generating product licensing revenue and margin uplift
- The data reduces clinical trial design time for internal product development, saving £1.5 million annually in R&D costs
- Total quantifiable annual value from data: £3.5 million
- Expected useful life: 8 years (after which clinical practice guidelines will have changed sufficiently to reduce data value)
- Discount rate: 12% (appropriate for data-dependent healthcare assets, reflecting IP risk and regulatory uncertainty)
- Present value: £3.5M × [sum of discount factors for years 1-8 at 12%] = £3.5M × 4.968 = £17.4 million
Income approach value: £17.4 million.
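The annuity factor of 4.968 used above can be checked with the closed-form formula. A sketch, assuming level end-of-year cash flows:

```python
# Annuity factor for 8 years at 12%: the sum of discount factors applied
# to a level end-of-year cash flow, in closed form.

def annuity_factor(rate: float, years: int) -> float:
    """(1 - (1 + r)^-n) / r for a level end-of-year cash flow."""
    return (1 - (1 + rate) ** -years) / rate

factor = annuity_factor(0.12, 8)
print(f"Annuity factor: {factor:.3f}")       # 4.968
print(f"Value: £{3_500_000 * factor:,.0f}")  # roughly £17.4 million
```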
Reconciling the Three Approaches
| Approach | Value | Rationale |
|---|---|---|
| Cost | £5.1M | Lower bound; reflects historic investment |
| Market | £10M | Mid-range of comparable transactions |
| Income | £17.4M | Reflects current economic returns |
| Weighted average | £11-13M | Most defensible balance sheet value |
A reasonable balance sheet valuation, synthesising all three approaches, is £12 million. This reflects that the data is worth more than what it cost (because it generates returns), is worth more than market comparables (because it is superior in quality or exclusivity), and is worth less than its pure income value (because of risk and uncertainty in sustained returns).
✔ Example
A company using the cost approach exclusively would undervalue the data by roughly 60%. A company using the income approach exclusively might overvalue it if future returns are not realised. A balanced approach, weighting the income approach at 50%, the market approach at 30%, and the cost approach at 20% (which yields roughly £12.7 million here), produces a defensible and auditable valuation.
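The reconciliation can be expressed as a simple weighted average. The weights below are illustrative; the weighting itself is a judgment call, not a prescribed standard:

```python
# Reconciling the three approach values into one balance sheet figure.
# Weights are illustrative assumptions and must sum to 1.

def weighted_valuation(values: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of the approach values."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return sum(values[k] * weights[k] for k in values)

values = {"income": 17.4, "market": 10.0, "cost": 5.1}   # £ millions
weights = {"income": 0.50, "market": 0.30, "cost": 0.20}  # illustrative
print(f"Weighted value: £{weighted_valuation(values, weights):.1f}M")
```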
Data Asset Valuation in Context: The Problem of Appreciation
Data differs from traditional assets in one crucial respect: it can appreciate rather than depreciate. A machine depreciates — it wears out, becomes obsolete, is eventually scrapped. A dataset can improve — as more data accumulates, quality increases, linkages with other data create new value, and AI training produces better models.
This creates a problem for standard asset accounting, which assumes assets decline in value over time. A dataset that was worth £10 million at creation might be worth £25 million after five years of growth and improvement. How do you amortise an appreciating asset?
Current accounting standards struggle with this. IFRS and US GAAP assume that intangible assets decline in value and are amortised accordingly. A data asset valued at £12 million might be amortised over 5 years (straight-line £2.4 million annual amortisation). But if the data is actually appreciating — generating more value each year as it grows — the amortisation charge is economically misleading.
This is where SNA 2025's treatment of data is more economically accurate. SNA 2025 recognises that data assets can be revalued upward if they are generating increasing returns. The valuation question becomes dynamic: what is the data worth now, given current use, current returns, and current competitive position.
For balance sheet purposes, companies should consider adopting revaluation models for data assets, with annual re-valuation based on income approach (measuring actual returns being generated) and market approach (monitoring external data sales that might indicate value shifts).
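An annual income-approach revaluation could be sketched as follows. The baseline figures mirror the healthcare example above; the year-one updated return of £4.2 million is a hypothetical assumption for illustration:

```python
# Annual revaluation sketch: re-run the income approach each year with
# updated return estimates and the remaining useful life.

def income_value(annual_return: float, rate: float, remaining_years: int) -> float:
    """PV of a level annual return over the asset's remaining life."""
    return sum(annual_return / (1 + rate) ** t
               for t in range(1, remaining_years + 1))

v0 = income_value(3_500_000, 0.12, 8)  # at recognition (healthcare example)
v1 = income_value(4_200_000, 0.12, 7)  # hypothetical: returns grew in year 1
print("Revaluation uplift" if v1 > v0 else "Write-down",
      f"of £{abs(v1 - v0):,.0f}")
```

Even with one fewer year of life remaining, the higher returns lift the asset's value, which is the appreciation pattern the SNA 2025 treatment is designed to capture.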
Data as Collateral: The Emerging Finance Dimension
In structured finance, valuation of collateral determines advance rates (how much credit can be extended against the asset). For traditional assets — real estate, equipment, inventory — this is straightforward. For data, it is emerging.
A company with a proprietary dataset valued at £12 million might be able to borrow against that data asset. Lenders will require:
Strict valuation standards: The income approach must be based on documented, audited historical returns, not projections. If the data has been generating £3.5 million annually in returns for three years, lenders will be confident. If projections suggest £5 million in future returns, lenders will be sceptical.
Security and control: The lender must have confidence that the data cannot be deleted, degraded, or made inaccessible. This requires legal controls, escrow arrangements, and insurance.
Measurable returns: The data's value must be continuously verifiable. If returns are embedded in proprietary algorithms or internal product decisions, the lender must be able to audit the returns being generated.
Exclusivity: Data that the borrower can easily replicate or obtain elsewhere is poor collateral. Proprietary data with barriers to replication is better.
Given these requirements, data-backed lending is emerging in specialised niches — financial services firms lending against proprietary trading data, technology firms lending against user behaviour datasets — but it is not yet mainstream. Valuation standards and collateral definitions are still developing.
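In practice, a lender would apply a conservative advance rate to the valuation. The sketch below is illustrative only; the 40% advance rate is an assumption, not a market standard:

```python
# Hedged sketch of data-backed lending capacity: a conservative advance
# rate applied to an income-approach valuation based on audited returns.

def lending_capacity(audited_value: float, advance_rate: float = 0.40) -> float:
    """Maximum credit extended against the data asset."""
    return audited_value * advance_rate

# £12M audited valuation at an assumed 40% advance rate
print(f"Borrowing capacity: £{lending_capacity(12_000_000):,.0f}")
```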
The Balance Sheet Integration: Recognition, Measurement, and Disclosure
When a company capitalises a data asset at £12 million (based on the valuation exercise above), the accounting treatment flows through to key financial metrics:
Balance sheet: The asset appears on the balance sheet under Intangible Assets (or a sub-category within Intangible Assets: Data Assets). The company's total assets increase.
Amortisation: The asset is amortised over its useful life. If the company assumes a 5-year life, annual amortisation is £2.4 million. This reduces annual profit.
Return on assets: The asset increases the denominator, potentially reducing return on assets metrics (unless the data generates offsetting profit increases).
Impairment testing: Annually, the company must test whether the asset remains fairly valued. If data returns decline, or if competitive alternatives emerge, the asset may be impaired (written down).
Disclosure: Under IFRS and US GAAP, data assets require disclosure of:
- Cost and accumulated amortisation
- Useful life assumptions and amortisation method
- Impairment history and testing methodology
- Valuation methodology applied
- Sensitivity to key assumptions
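The amortisation and impairment mechanics above can be sketched numerically. The £6 million recoverable amount in year two is a hypothetical figure for illustration:

```python
# Straight-line amortisation of the £12M data asset over an assumed
# 5-year life, plus a simple impairment check.

def carrying_values(cost: float, life_years: int) -> list[float]:
    """Carrying value at the end of each year under straight-line amortisation."""
    annual = cost / life_years  # £2.4M per year for this asset
    return [cost - annual * (t + 1) for t in range(life_years)]

schedule = carrying_values(12_000_000, 5)  # 9.6M, 7.2M, 4.8M, 2.4M, 0

# Impairment: write down if the recoverable amount (e.g. an updated
# income-approach value) falls below the carrying value.
recoverable = 6_000_000  # hypothetical year-2 estimate
impairment = max(schedule[1] - recoverable, 0)
print(f"Impairment charge: £{impairment:,.0f}")  # £1,200,000
```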
For private equity buyers, this transparency is valuable. A seller that presents data assets with documented valuation, proven returns, and clear useful-life assumptions presents a more credible acquisition profile than a seller where data assets are expensed invisibly.
Looking Forward: Towards Standardised Data Valuation
The SNA 2025 recognition of data as capital is the beginning of a broader standardisation effort. International accounting standard setters (IASB, FASB) are beginning to develop specific guidance on data asset capitalisation and valuation. Industry groups in financial services, healthcare, and technology are developing provisional standards for data valuation.
Within the next 2-3 years, we can expect:
- IFRS guidance on data asset recognition: Specific criteria for when data qualifies as an asset, how to measure it, and how to handle appreciation and depreciation
- Sector-specific standards: Financial services, healthcare, and technology sectors will develop tailored guidance
- Data valuation methodologies: Standardised approaches for applying cost, market, and income approaches to different types of data
- Audit standards: Auditor guidance on how to validate data asset valuations
Companies that invest now in structuring, documenting, and valuing their data assets will be ahead of the curve when these standards crystallise.
★ Key Takeaway
Data asset valuation is not purely a technical exercise. It is a strategic decision about how your organisation treats data — as an invisible cost, or as a capital asset that generates measurable returns. The companies that will extract the most value from their data will be those that measure it, value it, manage it, and present it to investors and acquirers with rigorous, defensible frameworks.
Tony Hillier is Co-Founder of Opagio. He holds an MA from Balliol College, Oxford and an MBA with distinction. His career includes executive board roles at NM Rothschild & Sons and GEC Finance, and a non-executive directorship at Financial Security Assurance in New York, where he specialised in structured finance, asset-backed securities, and innovative collateral frameworks.