In secondaries, data quality matters because small errors in inputs can become large errors in returns. The question is not simply whether the data is ‘good’, but whether it is good enough to support a view on how much cash will ultimately reach the LP, and when.
Why does this matter?
That distinction matters because secondary underwriting operates across two levels at once. At company level, the task is to assess what underlying portfolio companies may deliver and when exits may occur. At fund level, the task is to translate those outcomes into LP cash flows, net of carry, fees and fund debt. Data quality matters at both levels, and at fund level it also serves as the bridge between the two.
It is also why data quality is not just a back-office issue. In secondaries, the underwrite depends less on any single metric than on whether the economic picture is complete, reconciled and stable through time. In practice, that means three things: completeness, accuracy and consistency. Do we have the full set of relevant inputs, now and historically? Do the numbers tie out? And do key definitions remain stable over time?
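Those three questions can be framed as simple checks. The sketch below is purely illustrative, assuming a hypothetical record layout with made-up field names (`opening_nav`, `carry_basis` and so on); real fund data is far richer and messier than this.

```python
# Illustrative only: one hypothetical quarterly record per period.
sample = [{"opening_nav": 100.0, "contributions": 10.0, "distributions": 5.0,
           "gain_loss": 7.0, "closing_nav": 112.0, "carry_basis": "deal-by-deal"}]

def missing_fields(records, required):
    """Completeness: which required inputs are absent in any period?"""
    return [f for r in records for f in required if r.get(f) is None]

def ties_out(record, tol=0.01):
    """Accuracy: does the NAV roll-forward reconcile within tolerance?"""
    expected = (record["opening_nav"] + record["contributions"]
                - record["distributions"] + record["gain_loss"])
    return abs(expected - record["closing_nav"]) <= tol

def is_stable(records, field):
    """Consistency: is a key definition unchanged across periods?"""
    return len({r.get(field) for r in records}) <= 1

gaps = missing_fields(sample, ["opening_nav", "closing_nav", "gain_loss"])
reconciled = all(ties_out(r) for r in sample)
stable = is_stable(sample, "carry_basis")
```

The point of the sketch is only that each of the three questions is mechanically testable once the inputs are structured.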
The limitations and assumptions that pose risks
Completeness is often the real constraint. When data is missing, gaps have to be filled with assumptions, estimates or proxies. That is sometimes unavoidable, especially when underwriting large and diverse LP portfolios. But those assumptions do not stay local. They can flow through directly into expected proceeds, timing and net returns.
Accrued carry is a good example. One mistake we have commonly seen in vendor-extracted data is to capture the headline carry rate, but miss carry already embedded in the economics. For an AI underwriting model, that may not affect the forecast of gross asset proceeds. But it does affect the fund mechanics and the final forecast that matters to the secondary buyer: what ultimately reaches the LP after prior economic claims are satisfied.
How this works in practice
A simple forward-looking example shows the point. Assume a secondary investor pays 100 for a fund interest expected to generate 200 of future gross proceeds, and that the fund is already in carry. On a naive view, 20% carry would imply 40 of carry, 160 of net proceeds and a 1.60x MOIC. But if 20 of carry has already economically accrued and still has to be paid out of future cash flows, the maths changes. In that case, 20 of the future proceeds is first used to meet that accrued carry, leaving 180 on which a further 20% carry, or 36, applies. Total carry paid from future proceeds therefore becomes 56 (20 accrued plus 36 on the remainder), leaving 144 of net proceeds and reducing MOIC to 1.44x. The effective carry burden on future proceeds is not 20%, but 28% (56 of 200). In that scenario, the selling LP may have realised value before the full carry burden was felt, while the secondary buyer inherits a greater share of the carry drag in future proceeds.
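The arithmetic above can be written out in a few lines. This is a toy sketch of the example, not a fund waterfall model: the function name `net_proceeds` and the waterfall ordering are simplifying assumptions.

```python
def net_proceeds(gross_future, carry_rate, accrued_carry=0.0):
    """Future proceeds first settle any carry already accrued;
    the remainder then bears carry at the headline rate."""
    remaining = gross_future - accrued_carry      # proceeds after accrued carry
    future_carry = remaining * carry_rate         # carry on the remainder
    total_carry = accrued_carry + future_carry
    return gross_future - total_carry, total_carry

price = 100.0

# Naive view: headline 20% carry, no accrued carry captured
naive_net, naive_carry = net_proceeds(200.0, 0.20)          # 160.0, 40.0

# With 20 of carry already economically accrued
true_net, true_carry = net_proceeds(200.0, 0.20, accrued_carry=20.0)

naive_moic = naive_net / price                    # 1.60x
true_moic = true_net / price                      # 1.44x
effective_burden = true_carry / 200.0             # 0.28, i.e. 28%
```

The gap between the two MOIC figures is driven entirely by one missed input, which is the point of the example.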
The same principle applies to fund debt. If facility usage or repayment obligations are missed, future LP proceeds can again look stronger than they really are, because cash flows that appear distributable to the LP may in practice first be needed to service debt or restore facility balances. In both cases, the issue is the same: gross value may be forecast correctly, but net proceeds to the buyer can still be overstated.
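The debt case can be sketched the same way. The function name `lp_distributable`, the 25 facility balance and the ordering (debt first, then carry) are all hypothetical assumptions for illustration, not figures from the article.

```python
def lp_distributable(gross_future, facility_balance, carry_rate):
    """Future proceeds first repay the outstanding facility,
    then bear carry; only the remainder reaches the LP."""
    after_debt = max(gross_future - facility_balance, 0.0)
    carry = after_debt * carry_rate
    return after_debt - carry

# Hypothetical: 200 of gross proceeds, 20% carry, 25 facility balance
with_debt = lp_distributable(200.0, 25.0, 0.20)   # 140.0
missed_debt = lp_distributable(200.0, 0.0, 0.20)  # 160.0 — overstated
```

As with accrued carry, the gross forecast is identical in both cases; only the net figure moves.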
That is why this matters particularly in AI-supported underwriting. Models may be effective at forecasting gross proceeds, but the underwriting decision depends on what remains after fund-level economics are correctly captured. If carry accruals, debt obligations or other economic claims are incomplete, a good forecast of gross value can still become a bad forecast of secondary returns.
The bottom line
The broader point is simple. In secondaries, data quality is not an operational hygiene issue. It is part of the investment case. Missing or unstable inputs do not just reduce confidence in a model; they can change expected proceeds, timing and realised returns. In a market where underwriting edge increasingly comes from understanding what value still remains for the buyer, disciplined data is a prerequisite for disciplined underwriting. AI can help identify gaps, test sensitivities and make assumptions explicit. But the underlying principle is simpler than the technology: if the job is to answer “how much and when”, the quality of the data will often determine the quality of our answer.
The opinions expressed herein reflect current opinions of Coller Capital as at the date of this article, which are subject to change at any time, and any assumptions or expectations on which they are based may fail to materialise.