Integrating ESG Data into Quant Models — OpenESG Resources

The ESG data landscape for quant teams

Integrating ESG data into quantitative models is more complex than integrating financial data. ESG data has lower coverage, higher latency, inconsistent periodicity, multiple competing providers, and significant missing data rates — especially for smaller companies and emerging markets. Before ingesting ESG data, quant teams need to understand its structural characteristics.

Characteristic	Financial Data	ESG Data
Update frequency	Daily / real-time	Annual (with 12–18 month lag)
Coverage	~100% for listed companies	~80% large caps; ~40% small caps
Comparability	High (IFRS/GAAP)	Low (multiple standards, self-reported)
Audited?	Yes (financial statements)	Rarely (ESG assurance growing but not universal)
Provider correlation	~0.99	~0.54 across major ESG raters
Missing data rate	<1% for key metrics	10–60% depending on metric

Choosing ESG data providers

The choice of provider significantly affects model results. Given the low correlation between providers (~0.54), using a single provider introduces substantial provider-specific bias. Best practice for institutional quant strategies is to either aggregate multiple providers or understand clearly why one provider's methodology is superior for the specific use case.

MSCI ESG Ratings: Industry-standard for institutional use; strong governance and social coverage; annual scores with controversy overlays
Sustainalytics (Morningstar): Risk-score framework (Unmanaged ESG Risk); strong controversy monitoring; widely used for exclusion screening
Refinitiv (LSEG): Large universe coverage; good for emerging markets; raw data available at disclosure level
S&P Global CSA: Based on annual corporate sustainability assessment; sector-adjusted; strong for DJSI-tracking strategies
OpenESG API: Real-time scores with AI controversy monitoring; framework-level breakdown; suited for ESG factor research
CDP Scores: Best-in-class for climate and water data specifically; letter grades from A to D-, with F for failure to disclose; annually updated; used for climate transition risk models
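One way to aggregate several providers, as suggested above, is to rank each provider's scores and average the ranks. The sketch below is illustrative only: the provider column names and the numbers are made up, and it assumes every provider's score has already been oriented so that higher means better (risk-style scores would need to be sign-flipped first).

```python
import pandas as pd

def ensemble_score(provider_scores: pd.DataFrame, min_providers: int = 2) -> pd.Series:
    """Average percentile ranks across providers (rows: tickers, columns: providers)."""
    # Percentile ranks put differently scaled ratings on a common 0-1 scale.
    # Missing values stay missing and do not affect the other ranks.
    ranks = provider_scores.rank(pct=True)
    # Average over the providers that actually cover each company.
    combined = ranks.mean(axis=1, skipna=True)
    # Leave the ensemble score missing when too few providers cover the name.
    n_available = provider_scores.notna().sum(axis=1)
    return combined.where(n_available >= min_providers)

# Toy numbers for illustration only, not real ratings.
raw = pd.DataFrame(
    {
        "provider_a": [80.0, 60.0, 40.0, None],
        "provider_b": [7.0, 9.0, 3.0, 5.0],
        "provider_c": [55.0, None, 20.0, None],
    },
    index=["AAA", "BBB", "CCC", "DDD"],
)
ensemble = ensemble_score(raw)
print(ensemble)
```

Averaging ranks rather than raw scores avoids letting the provider with the widest numeric scale dominate. The `min_providers` threshold is a design choice: a name covered by a single provider carries exactly the provider-specific bias the ensemble is meant to reduce.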

Data ingestion patterns

Point-in-time data construction

The most critical technical requirement for backtesting is using point-in-time (PIT) data — the score as it existed at a specific historical date, not the score as restated or revised. ESG providers periodically revise historical scores. Using revised data in a backtest is a form of look-ahead bias.

Look-ahead bias in ESG backtests

ESG scores are often updated retroactively when a company revises its reporting. If your backtest uses today's data for historical periods, you are introducing significant look-ahead bias. Always request point-in-time data from your ESG provider, or clearly document the assumption that revised data is used.

python

import pandas as pd
import openesg

client = openesg.Client(api_key="sk-...")

# Fetch point-in-time ESG scores for a universe
# as_of parameter requests data as it existed at that date
scores = client.scores.history(
    tickers=["AAPL", "MSFT", "GOOGL", "TSLA"],
    fields=["composite_score", "e_score", "s_score", "g_score"],
    start="2020-01-01",
    end="2025-12-31",
    frequency="quarterly",
    as_of="point_in_time"  # avoid look-ahead bias
)

df = pd.DataFrame(scores)
print(df.head())
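If a provider delivers a full revision history rather than a ready-made point-in-time feed, the PIT view can be rebuilt locally. The following is a minimal sketch under that assumption; the table layout and the column names (`ticker`, `fiscal_year`, `published_at`, `score`) are hypothetical and the values are toy data.

```python
import pandas as pd

def point_in_time_view(vintages: pd.DataFrame, as_of: str) -> pd.DataFrame:
    """Per ticker and fiscal year, keep the latest score published on or before as_of."""
    cutoff = pd.Timestamp(as_of)
    # Drop every record that was not yet public at the cutoff date.
    known = vintages[vintages["published_at"] <= cutoff]
    # Sort by publication date so the most recent known vintage comes last,
    # then keep only that last row for each (ticker, fiscal_year).
    known = known.sort_values("published_at")
    return known.drop_duplicates(["ticker", "fiscal_year"], keep="last").reset_index(drop=True)

# Toy revision history: AAA's 2022 score was revised down in February 2024.
vintages = pd.DataFrame(
    {
        "ticker": ["AAA", "AAA", "BBB"],
        "fiscal_year": [2022, 2022, 2022],
        "published_at": pd.to_datetime(["2023-04-15", "2024-02-01", "2023-05-10"]),
        "score": [62.0, 55.0, 48.0],
    }
)

pit = point_in_time_view(vintages, "2023-12-31")    # sees the original 62.0
later = point_in_time_view(vintages, "2024-06-30")  # sees the revised 55.0
print(pit)
```

A backtest dated end of 2023 should use the first view. Using the second for that date would leak the 2024 revision into the past.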

Handling the publication lag

ESG reports typically describe the prior year and are published 3–6 months after year-end. An ESG score updated in April 2026 based on 2025 data should only be used in models starting from the April 2026 publication date. Assuming the score was available on January 1 2026 introduces a 3–4 month look-ahead bias.
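The publication-date rule can be enforced mechanically with a backward as-of join, so each rebalance date only sees the most recent score already published. This is a sketch with toy data; the column names `published_at`, `date`, and `esg_score` are assumptions for illustration.

```python
import pandas as pd

# Scores keyed by the date they became public, not by the fiscal year they describe.
scores = pd.DataFrame(
    {
        "ticker": ["AAA", "AAA"],
        "published_at": pd.to_datetime(["2025-04-10", "2026-04-12"]),
        "esg_score": [58.0, 64.0],
    }
)

# Model rebalance dates.
rebalance = pd.DataFrame(
    {
        "ticker": ["AAA", "AAA", "AAA"],
        "date": pd.to_datetime(["2026-01-02", "2026-04-01", "2026-07-01"]),
    }
)

# direction="backward" picks, for each rebalance date, the latest row whose
# published_at is on or before that date. Both frames must be sorted on the key.
aligned = pd.merge_asof(
    rebalance.sort_values("date"),
    scores.sort_values("published_at"),
    left_on="date",
    right_on="published_at",
    by="ticker",
    direction="backward",
)
print(aligned[["ticker", "date", "published_at", "esg_score"]])
```

In this toy example the January and early-April 2026 rebalances still use the score published in April 2025, and the new score only enters at the July rebalance.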

Normalisation across sectors

Raw ESG scores should not be compared directly across sectors because material ESG risks vary dramatically by industry. A technology company with a 70 environmental score and an oil company with a 70 environmental score are not environmentally equivalent: the oil company operates in a carbon-intensive sector, so the same score can coexist with much higher absolute emissions.

Within-sector normalisation

The standard approach for cross-sectional ESG factor construction is sector-neutral normalisation: convert raw scores to z-scores within each GICS sector, then use the normalised score in portfolio construction.

python

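import numpy as np

def sector_neutral_zscore(df, score_col="esg_score", sector_col="gics_sector"):
    """Convert ESG scores to within-sector z-scores."""
    df = df.copy()
    grouped = df.groupby(sector_col)[score_col]
    # A sector with identical scores has zero dispersion; mapping a zero
    # standard deviation to NaN leaves those z-scores missing instead of inf
    std = grouped.transform("std").replace(0, np.nan)
    df["esg_z"] = (df[score_col] - grouped.transform("mean")) / std
    # Clip at ±3 standard deviations to limit outlier influence
    df["esg_z"] = df["esg_z"].clip(-3, 3)
    return df

# Apply to your holdings universe; holdings_df must contain the score and sector columns
scored = sector_neutral_zscore(holdings_df)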

Missing data strategies

Missing data is the norm in ESG, not the exception. Your treatment of missing data should be explicit, documented, and consistent. The main approaches are:

Strategy	When to use	Risk
Mean/median imputation by sector	General use for minor gaps (<20%)	Artificially compresses variance; may introduce sector bias
Multiple imputation	Random missing data, high coverage universe	Computationally expensive; adds complexity
Explicit missing category	Systematic non-disclosure (ESG controversy)	Requires interpretation — is missing = bad, or missing = unknown?
Provider ensemble average	Disagreement between providers	May dilute signal if providers have systematic biases
Exclude from universe	High missing rate (>50%) or critical metric	Survivorship bias if exclusion correlates with ESG performance
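As an illustration of the first row of the table, the sketch below fills gaps with the sector median only when the sector's gap rate is under a threshold, and records which values were imputed. The 20% threshold mirrors the table; the column names and data are hypothetical.

```python
import pandas as pd

def impute_sector_median(df, score_col="esg_score", sector_col="gics_sector", max_missing=0.2):
    """Fill missing scores with the sector median where the sector gap rate is small."""
    out = df.copy()
    # Keep an explicit flag so imputed values stay distinguishable downstream.
    out["was_missing"] = out[score_col].isna()
    # Share of missing scores within each sector.
    missing_rate = out.groupby(sector_col)["was_missing"].transform("mean")
    # Median of the observed scores in each sector (NaNs are skipped).
    sector_median = out.groupby(sector_col)[score_col].transform("median")
    # Only fill where the sector's gap rate is below the threshold.
    fillable = out["was_missing"] & (missing_rate < max_missing)
    out.loc[fillable, score_col] = sector_median[fillable]
    return out

# Toy universe: Tech has 1 of 6 scores missing, Energy has 1 of 2 missing.
universe = pd.DataFrame(
    {
        "ticker": ["T1", "T2", "T3", "T4", "T5", "T6", "E1", "E2"],
        "gics_sector": ["Tech"] * 6 + ["Energy"] * 2,
        "esg_score": [50.0, 60.0, 70.0, 80.0, 90.0, None, 40.0, None],
    }
)
imputed = impute_sector_median(universe)
print(imputed)
```

The Tech gap is filled with the sector median, while the Energy gap is left missing because half the sector is undisclosed and a median of one observation would say little.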

Missing = signal

For some ESG factors, missing data is itself a negative signal. Companies that fail to disclose Scope 3 emissions are more likely to have high emissions they do not want to publicise. Treating missing Scope 3 as 'unknown' rather than 'probably bad' may understate the ESG risk of the portfolio.
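One way to encode "missing is probably bad" is to fill undisclosed values with a pessimistic within-sector quantile instead of the median. This is only a sketch: the `scope3_score` column, the higher-is-better orientation, and the choice of the 25th percentile are all assumptions that would need to be justified and documented for a real model.

```python
import pandas as pd

def penalised_fill(df, col="scope3_score", sector_col="gics_sector", quantile=0.25):
    """Fill missing values with a low within-sector quantile and flag disclosure."""
    out = df.copy()
    # Preserve the disclosure status as its own feature.
    out[f"{col}_disclosed"] = out[col].notna()
    # Pessimistic reference level per sector, computed from disclosed values only.
    floor = out.groupby(sector_col)[col].transform(lambda s: s.quantile(quantile))
    out[col] = out[col].fillna(floor)
    return out

# Toy data: one energy company does not disclose.
emitters = pd.DataFrame(
    {
        "ticker": ["E1", "E2", "E3", "E4", "E5"],
        "gics_sector": ["Energy"] * 5,
        "scope3_score": [20.0, 40.0, 60.0, 80.0, None],
    }
)
filled = penalised_fill(emitters)
print(filled)
```

Keeping the disclosure flag alongside the filled value lets a later step test whether non-disclosure itself carries information, rather than baking that belief irreversibly into the score.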

Backtesting ESG factors

ESG factor backtests have historically shown mixed results, which is expected because ESG is not a single factor but a multi-dimensional risk overlay. The strongest documented ESG effects are in controversy monitoring (negative news tends to precede short-term negative returns) and governance quality (strong governance is associated with higher long-term valuation multiples).

Use long formation periods (quarterly rebalancing is standard for annual ESG data)
Test on a value-weighted portfolio, not equal-weighted, to match realistic execution
Decompose returns into E, S, and G components separately — the composite often masks different directional signals
Test the turnover and transaction costs — ESG strategies tend to have low turnover but high rebalancing costs at threshold crossings
Be explicit about whether the strategy is long-short (factor) or long-only with ESG tilt
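Two of the points above, value weighting and testing E, S, and G separately, can be sketched for a single rebalance period as follows. The function, column names, and numbers are hypothetical, and the example ignores transaction costs and multi-period compounding.

```python
import pandas as pd

def long_short_return(df, signal_col, ret_col="fwd_return", cap_col="market_cap", q=0.2):
    """Value-weighted return of the top-q names minus the bottom-q names by signal."""
    data = df.dropna(subset=[signal_col, ret_col, cap_col]).sort_values(signal_col)
    # Number of names in each leg, at least one.
    n = max(1, int(len(data) * q))
    short_leg, long_leg = data.head(n), data.tail(n)

    def value_weighted(leg):
        # Weight each name's return by its market capitalisation.
        return (leg[ret_col] * leg[cap_col]).sum() / leg[cap_col].sum()

    return value_weighted(long_leg) - value_weighted(short_leg)

# Toy cross-section for one period; E and G rank the names in opposite order.
panel = pd.DataFrame(
    {
        "ticker": ["A", "B", "C", "D", "E"],
        "e_score": [10.0, 20.0, 30.0, 40.0, 50.0],
        "g_score": [50.0, 40.0, 30.0, 20.0, 10.0],
        "fwd_return": [0.01, 0.02, 0.00, 0.03, 0.05],
        "market_cap": [100.0, 200.0, 300.0, 400.0, 500.0],
    }
)

# Evaluate each pillar separately instead of only the composite.
spreads = {col: long_short_return(panel, col, q=0.4) for col in ["e_score", "g_score"]}
print(spreads)
```

In this constructed example the E and G spreads have opposite signs, which is the kind of offsetting signal a composite score would hide.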