Synthesis

While the standard-resolution IFS-FESOM2-SR maintains a relatively stable mean state, the eddy-rich models exhibit severe energetic drifts (IFS-NEMO cooling/salinifying) and persistent structural biases in the Arctic halocline and Western Boundary Currents.
The evaluation of high-resolution coupled models against EN4 observations reveals significant initialization shocks and mean-state drifts that complicate the detection of anthropogenic signals. Contrary to expectations that higher resolution automatically improves the mean state, the standard-resolution IFS-FESOM2-SR generally outperforms the eddy-rich (ER) configurations in global surface statistics (SST RMSE ~0.91 K), whereas IFS-NEMO-ER and ICON-ESM-ER exhibit large-scale disequilibria. IFS-NEMO-ER is a distinct outlier, dominated by a pervasive global cooling (-1.05 K) and monotonic salinification (+0.18 PSU) drift, suggesting a fundamental non-conservation of energy and freshwater (likely P-E < 0) in the coupling. In contrast, ICON-ESM-ER displays a 'fresh surface/warm deep' bias structure, with extreme regional SST errors (>3 K) driven by western boundary current overshoots (Gulf Stream, Kuroshio) that are not damped by parameterizations.

Despite the explicitly resolved eddies, persistent systematic biases common to lower-resolution CMIP6 models remain. All three models exhibit a severe positive salinity bias (> +3 PSU) in the Arctic Ocean, indicating a failure to maintain the surface halocline, potentially due to vertical mixing parameterizations or runoff deficiencies. Additionally, the characteristic 'cold blob' bias in the North Atlantic subpolar gyre persists, particularly in ICON and FESOM, linked to freshwater capping and Gulf Stream separation issues. The divergence in deep ocean trends (IFS-NEMO cooling while ICON/FESOM warm below 2000 m) highlights that short, high-resolution runs are strongly dominated by spin-up drift, necessitating careful energetic tuning before these models can reliably project decadal ocean heat uptake.

Related diagnostics

radiation_budget_toa atmospheric_precip_bias sea_ice_extent

Salinity Depth-Layer Time Series

Variables thetao, so
Models IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER, EN4
Obs Dataset EN4 v4.2.2
Units PSU (source metadata lists K)
Period 1980–2014

Summary high

This figure displays time series (1980–2014) of global volume-weighted mean salinity in three depth layers (0–700 m, 700–2000 m, 2000 m–bottom) for three climate models compared to EN4 v4.2.2 observations.
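A minimal sketch of how a volume-weighted layer mean of this kind could be computed, assuming each grid cell carries a salinity value, a cell volume, and a mid-cell depth. The function name, argument layout, and toy numbers are illustrative assumptions and are not taken from the diagnostic's own code.

```python
def layer_volume_mean(values, volumes, depths, z_top, z_bottom):
    """Volume-weighted mean over cells with z_top <= depth < z_bottom.

    values  : per-cell salinity (None marks a masked cell); hypothetical layout
    volumes : per-cell volume in m^3
    depths  : per-cell mid-depth in m, positive downward
    """
    weighted_sum = 0.0
    total_volume = 0.0
    for value, volume, depth in zip(values, volumes, depths):
        if value is None:
            continue  # skip land or missing cells
        if z_top <= depth < z_bottom:
            weighted_sum += value * volume
            total_volume += volume
    if total_volume == 0.0:
        raise ValueError("no valid cells in the requested layer")
    return weighted_sum / total_volume


# Toy column with made-up numbers, for illustration only.
salinity = [35.0, 34.0, 34.8]
volume = [1.0, 3.0, 2.0]
depth = [100.0, 500.0, 1000.0]

upper = layer_volume_mean(salinity, volume, depth, 0.0, 700.0)
intermediate = layer_volume_mean(salinity, volume, depth, 700.0, 2000.0)
```

Because the larger cell dominates the weighting, the upper-layer mean in this toy case sits closer to 34.0 than to 35.0, which is the behaviour that distinguishes a volume-weighted mean from a simple average of cell values.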

Key Findings

  • IFS-NEMO-ER exhibits a large, systematic positive salinity bias (~0.16–0.18 PSU) across all depth layers, indicating a global salt inventory mismatch relative to EN4.
  • ICON-ESM-ER shows the best agreement with observations, tracking EN4 closely with minimal drift and biases generally less than 0.02 PSU (slightly fresh in the upper ocean, slightly salty at depth).
  • IFS-FESOM2-SR shows a distinct vertical bias structure: a fresh bias in the upper ocean (0–700 m, ~0.07 PSU) transitioning to a slight salty bias in the intermediate and deep layers.

Spatial Patterns

While spatial patterns are averaged out, the vertical profile reveals distinct behaviors: IFS-NEMO-ER is consistently too salty throughout the water column; IFS-FESOM2-SR redistributes salt (fresh surface, salty deep); and ICON-ESM-ER maintains a vertical profile very close to observations.

Model Agreement

There is significant divergence in mean state between models. IFS-NEMO-ER is an outlier with high salinity. ICON-ESM-ER and EN4 agree well. IFS-FESOM2-SR falls between the two in the deep ocean but is the freshest in the upper ocean.

Physical Interpretation

The constant offset in IFS-NEMO-ER across all depths suggests an initialization issue where the model's total salt content exceeds that of the EN4 climatology, rather than a dynamical drift (as trends are relatively flat). The IFS-FESOM2-SR fresh surface/salty deep pattern implies issues with vertical mixing or surface freshwater flux (P-E) distribution, preventing adequate salt transport to the surface or retaining too much fresh water in the upper layer.

Caveats

  • Global volume-weighted means can mask compensating regional biases (e.g., fresh Atlantic vs. salty Pacific).
  • Deep ocean observations (EN4) prior to the Argo era (pre-2000s) are sparse and rely heavily on climatological interpolation, increasing uncertainty in the deep layer reference.

Salinity Hovmoller (first-timestep anomaly)

Variables thetao, so
Models IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER
Obs Dataset EN4 v4.2.2
Units PSU (source metadata lists K)
Period 1980–2014

Summary high

This Hovmoller diagram illustrates the temporal evolution of global mean ocean salinity anomalies (relative to the initial state) across depth layers for EN4 observations and three high-resolution climate models from 1980 to 2014.
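The anomaly definition used here can be sketched as follows, assuming the input is a list of global-mean depth profiles, one per time step. The function name and the deliberately exaggerated toy values are hypothetical; only the "subtract the first time step" rule comes from the description above.

```python
def first_timestep_anomaly(profiles):
    """Return profiles[t][k] - profiles[0][k] for every time t and depth level k."""
    if not profiles:
        raise ValueError("need at least one time step")
    reference = profiles[0]
    return [
        [value - ref for value, ref in zip(profile, reference)]
        for profile in profiles
    ]


# Toy data: 3 time steps x 2 depth levels (exaggerated, made-up values).
toy_profiles = [
    [35.0, 34.75],
    [35.5, 34.5],
    [36.0, 34.25],
]
toy_anomaly = first_timestep_anomaly(toy_profiles)
```

By construction the first row of the result is zero at every depth, so any offset the model already has at the first time step is invisible in this view.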

Key Findings

  • IFS-NEMO-ER exhibits a strong, systematic salty drift (>0.08 PSU) in the upper 1000 m, coupled with a compensating freshening anomaly between 1000-2000 m.
  • IFS-FESOM2-SR and ICON-ESM-ER show the opposite trend in the upper ocean: a persistent freshening drift in the top 500 m, with IFS-FESOM2-SR showing slightly stronger freshening penetrating deeper than ICON.
  • Observational data (EN4) shows complex multi-decadal variability, including freshening at intermediate depths (500-1000 m) and salinification in near-surface waters after 2000, patterns not fully reproduced by the free-running models.
  • The deep ocean (>3000 m) remains stable across all simulations and observations, showing minimal drift over the 35-year period.

Spatial Patterns

The anomalies are strongly stratified by depth. While the deep ocean is quiescent, the upper 1000 m shows significant decadal trends. Seasonal cycles are clearly visible in the surface layers of all models.

Model Agreement

There is significant divergence in model drift behavior. IFS-NEMO-ER drifts towards a saltier upper ocean, whereas IFS-FESOM2-SR and ICON-ESM-ER drift towards a fresher upper ocean. None of the models accurately capture the specific timing and vertical structure of the observed variability in EN4.

Physical Interpretation

The dipole pattern in IFS-NEMO-ER (salty upper/fresh intermediate) suggests potential issues with vertical mixing processes, intermediate water mass formation (ventilation), or surface freshwater flux imbalances (excessive evaporation vs precipitation). The surface freshening in FESOM and ICON suggests a net freshwater input surplus or insufficient vertical mixing of salty surface waters downward. The stability of the deep ocean indicates that the drifts are largely confined to ventilated thermocline and intermediate waters.

Caveats

  • The figure displays global means, which may obscure opposing regional drifts (e.g., Atlantic vs. Pacific).
  • Anomalies are relative to the first timestep, conflating initial adjustment shocks (spin-up) with long-term model drift.
  • The 35-year period is relatively short for diagnosing deep ocean trends.

Salinity Hovmoller (EN4-ref anomaly)

Variables thetao, so
Models IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER
Obs Dataset EN4 v4.2.2
Units PSU (source metadata lists K)
Period 1980–2014

Summary high

This Hovmoller diagram illustrates the temporal evolution of global-mean salinity anomalies relative to the EN4 initial reference profile (1980), revealing severe drift issues in IFS-NEMO-ER compared to other models. While observational data (EN4) remains stable, the models exhibit distinct, large-scale drifts in salinity distribution over the 35-year period.

Key Findings

  • IFS-NEMO-ER exhibits a catastrophic, full-column positive salinity drift (salinification) that exceeds +0.2 PSU almost immediately, indicating a fundamental freshwater budget imbalance.
  • IFS-FESOM2-SR displays a strong vertical dipole bias: significant freshening in the upper 1000m (< -0.2 PSU) and gradual salinification in the deep ocean (> 2000m), suggesting issues with vertical mixing or salt redistribution.
  • ICON-ESM-ER shows a fresh bias in the upper 800m similar to FESOM but maintains a much more stable deep ocean, making it the closest to observations below 1000m.
  • Observational data (EN4) shows negligible drift relative to its initial state, confirming the stability of the reference dataset.

Spatial Patterns

The vertical structure of the drift varies significantly by model: IFS-NEMO-ER shows a uniform drift throughout the water column, whereas IFS-FESOM2-SR and ICON-ESM-ER show surface-intensified signals that decay or reverse with depth. Seasonal cycles are clearly visible in the top ~100m for all models.

Model Agreement

Inter-model agreement is very low. The models diverge in both the sign and vertical structure of their drifts. IFS-NEMO-ER is a clear outlier with its massive positive bias, while FESOM and ICON share a tendency towards upper-ocean freshening.

Physical Interpretation

The severe salinification in IFS-NEMO-ER strongly suggests a non-conservative freshwater budget, potentially due to missing river runoff or an imbalance between evaporation and precipitation/runoff in the coupling. The fresh surface/salty deep pattern in IFS-FESOM2-SR implies a gradual adjustment in which the model stratifies too strongly, trapping freshwater at the surface and preventing adequate vertical exchange. The upper-ocean freshening seen in FESOM and ICON is a common coupled model bias often related to excessive precipitation or errors in sea-ice melt/runoff distribution.
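As a rough sense of scale for the freshwater-budget hypothesis, the following back-of-envelope sketch converts a full-column salinity increase into an equivalent depth of removed freshwater using salt conservation in a single water column. The mean depth (3700 m) and starting salinity (34.7) are approximate round values chosen for illustration, and the function name is hypothetical; this is not a calculation taken from the report.

```python
def freshwater_removed_m(mean_depth_m, initial_salinity, salinity_increase):
    """Equivalent freshwater depth removed from a column, from salt conservation.

    S0 * H = (S0 + dS) * (H - dh)   =>   dh = H * dS / (S0 + dS)
    """
    return mean_depth_m * salinity_increase / (initial_salinity + salinity_increase)


# Assumed round numbers: ~3700 m mean ocean depth, ~34.7 mean salinity,
# and the +0.18 PSU full-column anomaly quoted for IFS-NEMO-ER.
equivalent_depth = freshwater_removed_m(3700.0, 34.7, 0.18)
```

Under these assumptions the result is on the order of 19 m of freshwater, which illustrates how large a cumulative surface-flux imbalance would have to be to explain a full-column anomaly of this size on its own.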

Caveats

  • The color scale saturates at ±0.2 PSU, likely hiding the full magnitude of the drift in IFS-NEMO-ER.
  • Global averaging masks potential regional differences (e.g., Atlantic vs. Pacific basin contrasts).
  • It is assumed models were initialized close to the EN4 reference; if not, initial offsets contribute to the anomaly.
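The last caveat can be made concrete with a small sketch that splits an anomaly relative to a fixed reference profile into an initial offset and a subsequent drift. The function names and toy profiles are hypothetical; the sketch only illustrates the arithmetic identity anomaly = initial offset + drift.

```python
def anomaly_vs_reference(profiles, reference_profile):
    """profiles[t][k] minus a fixed reference profile (e.g. the first observed profile)."""
    return [[v - r for v, r in zip(p, reference_profile)] for p in profiles]


def split_offset_and_drift(profiles, reference_profile):
    """Return (initial_offset[k], drift[t][k]) so that
    profiles[t][k] - reference[k] == initial_offset[k] + drift[t][k]."""
    if not profiles:
        raise ValueError("need at least one time step")
    initial_offset = [v - r for v, r in zip(profiles[0], reference_profile)]
    drift = [[v - v0 for v, v0 in zip(p, profiles[0])] for p in profiles]
    return initial_offset, drift


# Toy case: a model that is 0.25 saltier than the reference at every depth
# from the first step onward and never changes afterwards (made-up numbers).
reference_1980 = [34.5, 34.75]
model_profiles = [[34.75, 35.0], [34.75, 35.0]]

anomaly = anomaly_vs_reference(model_profiles, reference_1980)
offset, drift = split_offset_and_drift(model_profiles, reference_1980)
```

In this toy case the reference-relative anomaly is 0.25 everywhere while the drift term is exactly zero, so a uniformly saturated panel in this kind of plot does not by itself distinguish an initial offset from an evolving drift.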

Salinity Surface Annual Mean Bias

Variables thetao, so
Models IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER
Obs Dataset EN4 v4.2.2
Units PSU (source metadata lists K)
Period 1980–2014
IFS-FESOM2-SR Global Mean Bias: -0.18 · Rmse: 0.69
IFS-NEMO-ER Global Mean Bias: 0.16 · Rmse: 0.72
ICON-ESM-ER Global Mean Bias: -0.19 · Rmse: 1.19

Summary high

This figure evaluates annual mean surface salinity (SSS) biases in three high-resolution coupled models (IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER) relative to EN4 v4.2.2 observations. While all models struggle significantly with Arctic freshwater representation, exhibiting strong positive biases, they diverge in their global mean states and tropical bias patterns.
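The global mean bias and RMSE quoted for each model can be sketched as area-weighted statistics on a regular latitude-longitude grid. Whether the report weights by cos(latitude) is an assumption, and the function name and toy fields below are hypothetical.

```python
import math


def area_weighted_bias_rmse(model, obs, lats_deg):
    """Area-weighted mean bias and RMSE of model minus obs.

    model[j][i], obs[j][i] : fields on a regular grid (None marks a masked cell)
    lats_deg[j]            : latitude of row j in degrees; weight = cos(latitude)
    """
    sum_w = 0.0
    sum_diff = 0.0
    sum_diff_sq = 0.0
    for row_model, row_obs, lat in zip(model, obs, lats_deg):
        weight = math.cos(math.radians(lat))
        for m, o in zip(row_model, row_obs):
            if m is None or o is None:
                continue
            diff = m - o
            sum_w += weight
            sum_diff += weight * diff
            sum_diff_sq += weight * diff * diff
    if sum_w == 0.0:
        raise ValueError("no valid grid cells")
    return sum_diff / sum_w, math.sqrt(sum_diff_sq / sum_w)


# Toy 2x2 grid (made-up values): +1 and -1 errors at the equator, none at 60N.
lats = [0.0, 60.0]
obs_field = [[35.0, 35.0], [35.0, 35.0]]
model_field = [[36.0, 34.0], [35.0, 35.0]]

bias, rmse = area_weighted_bias_rmse(model_field, obs_field, lats)
```

In the toy case the opposing errors cancel in the mean bias but not in the RMSE, which is why a model can combine a moderate global mean bias with the largest RMSE.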

Key Findings

  • **Systematic Arctic Salinity Bias:** All three models exhibit a severe positive (salty) bias in the Arctic Ocean, often exceeding +3 PSU. This indicates a common failure to maintain the observed fresh surface layer, likely due to vertical mixing parameterizations or deficiencies in river runoff/sea ice melt distribution.
  • **Divergent Global Means:** The models do not agree on the sign of the global mean bias. IFS-NEMO-ER is globally too salty (+0.16 PSU), particularly in the Pacific and Southern Oceans, whereas IFS-FESOM2-SR (-0.18 PSU) and ICON-ESM-ER (-0.19 PSU) are globally too fresh.
  • **ICON-ESM-ER Regional Extremes:** ICON displays the highest RMSE (1.19 PSU) driven by distinct regional biases: intense freshening in the Indian Ocean/Maritime Continent (likely precipitation-driven) and the subpolar North Atlantic, contrasted with positive biases in the Southern Ocean.
  • **Mediterranean Outflow:** Strong localized salty biases are visible in the Mediterranean and Red Seas (especially in IFS-NEMO and IFS-FESOM), suggesting issues with marginal sea exchange or evaporation rates.

Spatial Patterns

The dominant spatial feature is the stark contrast between the salty Arctic and the rest of the ocean. In the tropics, IFS-NEMO-ER shows a broad salty bias across the Pacific, while ICON-ESM-ER shows a massive fresh plume extending from the Bay of Bengal through the Indonesian Throughflow region. IFS-FESOM2-SR shows a more spatially heterogeneous pattern with fresh biases in the Eastern Pacific and Southern Ocean but salty anomalies in the North Atlantic.

Model Agreement

Models show high agreement in the Arctic (all too salty) and generally agree on positive biases in the Mediterranean/Red Sea. They disagree significantly in the Southern Ocean (IFS-FESOM fresh vs. ICON/NEMO salty) and the Indian Ocean (ICON strongly fresh vs. others neutral/salty).

Physical Interpretation

The pervasive Arctic salty bias suggests that high-resolution models still struggle to preserve the halocline, possibly mixing surface freshwater downwards too efficiently or lacking sufficient riverine input. ICON's strong freshening in the Asian Monsoon region is likely a fingerprint of excessive precipitation (positive P-E bias) often found in coupled models' convective parameterizations. The salty bias in IFS-NEMO-ER might reflect an imbalance in the global hydrological cycle (P-E < 0 globally) or insufficient runoff.

Caveats

  • EN4 data density in the Arctic is lower than in lower latitudes, though the bias magnitude (>3 PSU) far exceeds observational uncertainty.
  • Surface salinity is strongly coupled to precipitation biases, so errors here may diagnose atmospheric model deficiencies rather than ocean model physics.
  • Metadata lists units as 'K', but the variable is Salinity (PSU).

Salinity Surface DJF Bias

Variables thetao, so
Models IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER
Obs Dataset EN4 v4.2.2
Units PSU (source metadata lists K)
Period 1980–2014
IFS-FESOM2-SR Global Mean Bias: -0.18 · Rmse: 0.77
IFS-NEMO-ER Global Mean Bias: 0.15 · Rmse: 0.73
ICON-ESM-ER Global Mean Bias: -0.18 · Rmse: 1.19

Summary high

This diagnostic evaluates December-January-February (DJF) surface salinity biases in three high-resolution coupled models against EN4 observations. IFS-NEMO-ER performs best globally with the lowest RMSE (0.73 PSU), while ICON-ESM-ER exhibits the largest errors (RMSE 1.19 PSU) characterized by widespread fresh biases.
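A seasonal mean such as DJF can be sketched from a twelve-month climatology as below. Weighting each month by its number of days (non-leap year) is an assumption about the processing, and the names and toy climatology are hypothetical.

```python
# Days per calendar month for a non-leap year (assumed calendar).
DAYS_IN_MONTH = {1: 31, 2: 28, 3: 31, 4: 30, 5: 31, 6: 30,
                 7: 31, 8: 31, 9: 30, 10: 31, 11: 30, 12: 31}
SEASONS = {"DJF": (12, 1, 2), "JJA": (6, 7, 8)}


def seasonal_mean(monthly_climatology, season):
    """Day-weighted mean of a {month: value} climatology over one season."""
    months = SEASONS[season]
    total_days = sum(DAYS_IN_MONTH[m] for m in months)
    weighted = sum(monthly_climatology[m] * DAYS_IN_MONTH[m] for m in months)
    return weighted / total_days


# Toy climatology where the value simply equals the month number.
toy_climatology = {month: float(month) for month in range(1, 13)}
djf = seasonal_mean(toy_climatology, "DJF")
jja = seasonal_mean(toy_climatology, "JJA")
```

The seasonal bias map would then be the difference between the model and observed seasonal means computed in the same way at each grid point.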

Key Findings

  • All three models exhibit prominent positive (salty) biases in the Arctic Ocean and marginal seas (e.g., Barents/Kara), likely related to sea ice processes or river runoff parameterization.
  • ICON-ESM-ER displays a strong, extensive fresh bias in the North Atlantic subpolar gyre and Gulf Stream extension, significantly contributing to its higher global RMSE.
  • A divergence in tropical biases is evident: IFS-NEMO-ER tends towards positive (salty) biases in subtropical gyres, whereas ICON-ESM-ER and IFS-FESOM2-SR show negative (fresh) biases in the ITCZ and Indo-Pacific warm pool.

Spatial Patterns

The Arctic generally shows positive biases across all models. In the tropics, ICON-ESM-ER and IFS-FESOM2-SR show freshening along precipitation bands (ITCZ/SPCZ) and in the Bay of Bengal, while IFS-NEMO-ER is saltier in the open ocean but shows distinct narrow fresh biases along major eastern boundary upwelling systems (California, Peru-Chile, Benguela). The Mediterranean and Red Sea show consistent strong salty biases across all simulations.

Model Agreement

Inter-model agreement is low regarding the sign of biases in the tropical and subtropical oceans, with IFS-NEMO-ER generally opposing the fresh tendency of the other two. Agreement is higher in the semi-enclosed seas (Mediterranean, Red Sea, Baltic) and high-latitude shelf regions where salty biases dominate.

Physical Interpretation

The fresh biases in ICON and FESOM in the ITCZ and Bay of Bengal likely stem from excessive precipitation (common in coupled models) or river runoff spreading issues. The specific fresh biases in IFS-NEMO-ER's upwelling zones suggest either biases in the source water masses (Intermediate Water) or local air-sea flux errors. The strong fresh bias in the North Atlantic for ICON suggests potential issues with the path of the North Atlantic Current or excessive freshwater export from the Arctic/Labrador Current, potentially impacting deep water formation sites.

Caveats

  • Observational density in the Arctic (EN4) is low, particularly under ice, increasing uncertainty in high-latitude bias assessment.
  • Strong coastal biases (e.g., Mediterranean, Red Sea) may be partially attributable to resolution limitations in resolving narrow straits even at these eddy-rich resolutions.

Salinity Surface JJA Bias

Variables thetao, so
Models IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER
Obs Dataset EN4 v4.2.2
Units PSU (source metadata lists K)
Period 1980–2014
IFS-FESOM2-SR Global Mean Bias: -0.19 · Rmse: 0.77
IFS-NEMO-ER Global Mean Bias: 0.18 · Rmse: 0.84
ICON-ESM-ER Global Mean Bias: -0.20 · Rmse: 1.29

Summary high

This figure evaluates JJA sea surface salinity (SSS) biases in three high-resolution coupled models (IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER) relative to EN4 v4.2.2 observations. Biases are heterogeneous, with IFS-NEMO-ER showing a global salty tendency (+0.18 PSU) while IFS-FESOM2-SR and ICON-ESM-ER exhibit global fresh biases (-0.19 and -0.20 PSU, respectively).

Key Findings

  • All three models exhibit distinct positive (salty) biases in the Arctic Ocean, particularly in the central basin and shelf seas, potentially linked to sea-ice formation (brine rejection) or insufficient river runoff propagation.
  • A prominent fresh bias exists in the North Atlantic subpolar gyre and Gulf Stream extension region across all models, most severe in ICON-ESM-ER and IFS-FESOM2-SR.
  • ICON-ESM-ER displays widespread and strong fresh biases in the tropical Indo-Pacific and Bay of Bengal, contributing to it having the highest RMSE (1.29 PSU).
  • IFS-NEMO-ER shows a unique positive (salty) bias in the Amazon plume region, suggesting under-represented freshwater discharge or excessive mixing, whereas the other models are closer to neutral or fresh in this region.

Spatial Patterns

The Arctic is consistently salty across models. The North Atlantic features a characteristic 'fresh blob' bias. In the tropics, IFS-NEMO-ER tends towards salty biases in the subtropical gyres, while ICON-ESM-ER and IFS-FESOM2-SR show fresh biases along the ITCZ and in the Indo-Pacific warm pool. Extreme positive biases are visible in semi-enclosed seas like the Red Sea and Persian Gulf, likely resolution or masking artifacts.

Model Agreement

Models agree on the sign of biases in the Arctic (positive) and North Atlantic subpolar region (negative). They diverge significantly in the tropics: IFS-NEMO-ER is generally saltier (positive bias), while ICON-ESM-ER is strongly fresh (negative bias). IFS-FESOM2-SR falls in between with lower RMSE (0.77 PSU) than the others.

Physical Interpretation

Tropical fresh biases (especially in ICON-ESM-ER) likely result from excessive precipitation (e.g., double ITCZ bias). The North Atlantic fresh bias is a common systematic error often linked to Gulf Stream separation issues and weak transport of saline waters to high latitudes. The Amazon plume salty bias in IFS-NEMO-ER indicates deficient river runoff spreading. High Arctic salty biases may stem from model deficiencies in freshwater budget (runoff/melt) or observational sparsity in EN4 under ice.

Caveats

  • Observational uncertainty in the Arctic (EN4) is high due to sparse sampling under sea ice.
  • Coastal biases (e.g., Red Sea) may be exacerbated by regridding and land-sea mask mismatches.

Temperature Surface Annual Mean Bias

Variables thetao, so
Models IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER
Obs Dataset EN4 v4.2.2
Units K
Period 1980–2014
IFS-FESOM2-SR Global Mean Bias: -0.19 · Rmse: 0.91
IFS-NEMO-ER Global Mean Bias: -1.05 · Rmse: 1.29
ICON-ESM-ER Global Mean Bias: -0.36 · Rmse: 1.70

Summary high

This figure evaluates annual mean sea surface temperature (SST) biases in three coupled models relative to EN4 observations. IFS-FESOM2-SR demonstrates the best overall performance with low global bias, whereas IFS-NEMO-ER is systematically cold and ICON-ESM-ER exhibits large-magnitude regional errors associated with western boundary currents.

Key Findings

  • IFS-FESOM2-SR has the lowest RMSE (0.91 K) and a small global cold bias (-0.19 K), though it retains systematic warm biases in eastern boundary upwelling zones (Benguela, Humboldt).
  • IFS-NEMO-ER is dominated by a pervasive global cold bias (mean -1.05 K), which is particularly intense in the North Atlantic subpolar gyre and Southern Ocean.
  • ICON-ESM-ER shows the highest RMSE (1.70 K) despite a moderate mean bias, driven by extreme warm biases (>3°C) in the Gulf Stream and Kuroshio extensions contrasting with a broad cold bias in the tropical Pacific.
  • The Southern Ocean exhibits divergent behavior: IFS-FESOM2-SR and ICON-ESM-ER show warm biases, while IFS-NEMO-ER shows a strong circum-Antarctic cold bias.

Spatial Patterns

Biases are spatially distinct across models. IFS-FESOM2-SR shows typical 'double-ITCZ' or eastern boundary warm bias signatures. IFS-NEMO-ER shows a relatively uniform cold shift enhanced in high latitudes. ICON-ESM-ER displays a high-contrast pattern with intense heating along Western Boundary Current paths (North Atlantic, North Pacific) and cooling in the tropical cold tongue regions.

Model Agreement

There is low inter-model agreement on regional bias patterns, particularly in the Southern Ocean and North Atlantic, suggesting different structural errors in ocean dynamics or parameterizations. All models tend to show some degree of cold bias in the central North Pacific.

Physical Interpretation

ICON-ESM-ER's strong warm biases in the North Atlantic and Pacific likely result from issues with Western Boundary Current separation (e.g., the Gulf Stream staying attached to the coast too far north) or excessive eddy heat transport. The warm biases in eastern boundary upwelling regions (visible in IFS-FESOM2-SR and ICON) are a classic coupled model problem often linked to insufficient coastal upwelling resolution or under-representation of stratocumulus clouds. IFS-NEMO-ER's global cold bias suggests a systemic issue with energy balance, potentially related to radiative tuning or vertical mixing efficiency.

Caveats

  • The analysis is based on annual means, which may mask seasonal bias variability.
  • The 'SR' vs 'ER' designations imply different resolutions, which complicates direct dynamical comparisons (e.g., eddy-resolving vs eddy-permitting physics).

Temperature Surface DJF Bias

Variables thetao, so
Models IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER
Obs Dataset EN4 v4.2.2
Units K
Period 1980–2014
IFS-FESOM2-SR Global Mean Bias: -0.05 · Rmse: 1.29
IFS-NEMO-ER Global Mean Bias: -1.04 · Rmse: 1.32
ICON-ESM-ER Global Mean Bias: -0.30 · Rmse: 2.09

Summary high

This figure evaluates the climatological sea surface temperature (SST) biases for the DJF season (1980–2014) in three high-resolution models relative to EN4 observations.

Key Findings

  • IFS-NEMO-ER exhibits a pervasive global cold bias (mean -1.04°C), unlike the other two models.
  • IFS-FESOM2-SR displays the best statistical performance (lowest RMSE of 1.29°C and near-zero global mean bias), though it shows distinct regional errors.
  • ICON-ESM-ER has the largest regional discrepancies (RMSE 2.09°C), characterized by intense cold biases in the North Atlantic and strong warm biases in the Southern Ocean.

Spatial Patterns

All models show a 'cold blob' bias in the North Atlantic subpolar gyre, most severe in ICON-ESM-ER and IFS-FESOM2-SR. In the Southern Ocean (austral summer), IFS-FESOM2-SR and ICON-ESM-ER exhibit a pronounced warm bias band (> +2°C to +4°C) encircling Antarctica, whereas IFS-NEMO-ER remains neutral to cold there. Eastern boundary upwelling regions (e.g., off Africa and South America) show warm biases in IFS-FESOM2-SR and ICON-ESM-ER.

Model Agreement

There is strong inter-model agreement on the presence of a North Atlantic cold bias, indicative of common structural errors in western boundary current separation. The models diverge significantly in the Southern Ocean, where IFS-NEMO-ER avoids the strong warm bias seen in the other two. IFS-NEMO-ER is uniquely dominated by a background cooling signal.

Physical Interpretation

The North Atlantic cold bias typically results from the Gulf Stream separating too late or following a path that is too zonal, preventing warm water from reaching the subpolar gyre—a persistent issue even at eddy-permitting resolutions. The Southern Ocean warm bias in DJF likely stems from excessive shortwave radiation reaching the surface due to cloud deficits (too few bright clouds) or mixed-layer depth biases during the austral summer. The eastern boundary warm biases suggest under-resolved coastal upwelling processes.

Caveats

  • The analysis is limited to the DJF season; biases in the Southern Ocean may differ significantly in austral winter.
  • IFS-NEMO-ER displays minor grid-like striping artifacts in the bias plot, likely a post-processing or regridding effect rather than physical signal.

Temperature Surface JJA Bias

Variables thetao, so
Models IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER
Obs Dataset EN4 v4.2.2
Units K
Period 1980–2014
IFS-FESOM2-SR Global Mean Bias: -0.31 · Rmse: 1.07
IFS-NEMO-ER Global Mean Bias: -1.06 · Rmse: 1.39
ICON-ESM-ER Global Mean Bias: -0.41 · Rmse: 1.92

Summary high

This diagnostic figure evaluates June-August (JJA) sea surface temperature biases for three coupled climate models against EN4 observations, revealing that IFS-FESOM2-SR achieves the lowest global error while the eddy-rich models (IFS-NEMO-ER and ICON-ESM-ER) exhibit larger systematic or regional biases.

Key Findings

  • IFS-NEMO-ER is dominated by a strong, widespread global cold bias (mean -1.06°C), particularly severe in the North Atlantic and North Pacific basins.
  • ICON-ESM-ER exhibits the highest spatial variability (RMSE 1.92°C) with intense warm biases (>4°C) in Western Boundary Current extensions (Gulf Stream, Kuroshio) and the South Atlantic (Benguela) upwelling region.
  • IFS-FESOM2-SR performs best statistically (RMSE 1.07°C, mean bias -0.31°C), showing a more balanced bias distribution, though it retains a notable cold bias in the North Atlantic subpolar gyre.

Spatial Patterns

ICON-ESM-ER shows distinctive dipolar or warm biases in Western Boundary Currents, suggesting current separation or overshoot issues. It also shows a classic severe warm bias in the Benguela eastern boundary upwelling system. IFS-NEMO-ER shows a pervasive cold signal with few warm features (limited mostly to Southern Hemisphere patches). IFS-FESOM2-SR shows a 'warming hole' cold bias in the North Atlantic but warm biases in the Southern Ocean.

Model Agreement

Inter-model agreement is low. IFS-NEMO-ER and ICON-ESM-ER diverge significantly in bias structure; the former is systematically cold, while the latter is characterized by extreme regional warm anomalies in dynamically active zones.

Physical Interpretation

The pervasive cold bias in IFS-NEMO-ER implies a global energy budget offset, possibly linked to cloud radiative tuning or vertical mixing. ICON-ESM-ER's warm biases in boundary currents suggest excessive poleward heat transport or misplaced currents (overshoot), while the Benguela bias points to deficiencies in upwelling dynamics or stratocumulus cloud decks (too much solar heating). The superior performance of IFS-FESOM2-SR relative to the 'ER' (likely higher resolution) variants suggests that resolution increase alone does not automatically correct mean-state biases without retuning.

Caveats

  • The analysis is limited to JJA (Northern Hemisphere summer), potentially emphasizing summer stratification biases.
  • The 'SR' vs 'ER' designation suggests resolution differences (Standard vs Eddy-Rich) that might complicate direct comparison; notably, the likely lower-resolution IFS-FESOM2-SR outperforms the eddy-rich configurations.

Temperature Depth-Layer Time Series

Variables thetao, so
Models IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER, EN4
Obs Dataset EN4 v4.2.2
Units K
Period 1980–2014

Summary high

This figure displays the time evolution of global volume-weighted mean ocean temperature across three depth layers (0–700 m, 700–2000 m, and >2000 m) from 1980 to 2014, comparing three high-resolution coupled models against EN4 observations.

Key Findings

  • Upper Ocean (0–700 m): Significant mean-state biases exist; ICON-ESM-ER has a warm bias (~0.2°C), while both IFS models (FESOM2-SR and NEMO-ER) exhibit a strong cold bias (~0.5°C). All models reproduce the observational warming trend.
  • Deep Ocean Drift (>2000 m): Models show substantial linear drifts indicating non-equilibrium states. ICON-ESM-ER warms continuously; IFS-FESOM2-SR warms rapidly (crossing the observational line); conversely, IFS-NEMO-ER shows a persistent cooling trend.
  • Intermediate Ocean (700–2000 m): Biases mirror the deep ocean, with ICON too warm and IFS-NEMO too cold. Notably, IFS-NEMO-ER displays an unrealistic cooling trend in this layer, diverging from the stable observations.
  • ICON-ESM-ER is consistently the warmest model across the entire water column, while IFS-NEMO-ER is generally the coldest.

Spatial Patterns

As a global time series, the primary patterns are temporal trends. The upper ocean (0–700 m) shows a forced warming signal superposed on mean biases. The deep ocean (>2000 m) is dominated by linear drifts rather than interannual variability, with slopes much steeper than the observational trend (which is near-flat), indicating significant model drift.
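One simple way to quantify such a drift is an ordinary least-squares slope fitted to the annual layer means. The sketch below assumes annual values and uses made-up numbers; the function name is hypothetical and the report does not state which trend estimator, if any, was used.

```python
def linear_trend(times, values):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need two or more paired samples")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0.0:
        raise ValueError("all time values are identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


# Toy deep-layer series: a purely linear warming of 0.002 K per year (made up).
years = list(range(1980, 2015))
deep_temperature = [1.0 + 0.002 * (year - 1980) for year in years]

slope, intercept = linear_trend(years, deep_temperature)
total_drift = slope * (years[-1] - years[0])  # change accumulated over the record
```

Comparing the fitted model slope with the slope of the observational series gives a compact measure of how much of the deep-layer change is drift rather than a trend shared with the observations.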

Model Agreement

There is low agreement on mean state absolute values (spread of ~0.6°C in the upper ocean). While models agree on the sign of warming in the upper ocean, they completely diverge in the deep ocean, with some warming (ICON, IFS-FESOM) and others cooling (IFS-NEMO).

Physical Interpretation

The strong drifts in the deep ocean suggest initialization shocks or insufficient spin-up time, where the model deep ocean is adjusting to surface fluxes or internal mixing physics that differ from the initialization climatology. The cooling drift in IFS-NEMO-ER (intermediate and deep) is particularly notable as it opposes the expected signal of anthropogenic heat uptake. The consistent cold bias in the IFS upper ocean may stem from surface flux biases or cloud radiative effects.

Caveats

  • The strong linear drifts in the deep ocean obscure the detection of any anthropogenic warming signal in those layers.
  • The large mean-state biases in the upper ocean (0–700 m) complicate the direct comparison of heat content anomalies.

Temperature Hovmoller (first-timestep anomaly)

Variables thetao, so
Models IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER
Obs Dataset EN4 v4.2.2
Units K
Period 1980–2014

Summary high

Time-depth Hovmoller diagrams of global-mean ocean temperature anomalies relative to the initial timestep (1980) for EN4 observations and three high-resolution models.

Key Findings

  • EN4 observations show the expected climate change signal: surface-intensified warming penetrating to ~800m over the 1980-2014 period, with a relatively stable deep ocean.
  • IFS-FESOM2-SR and ICON-ESM-ER exhibit significant broad-scale warming drifts throughout the water column (down to 6000m), with ICON showing a particularly strong warm bias centered around 1000m.
  • IFS-NEMO-ER displays a distinctive cooling drift at intermediate depths (1000-3000m), contrasting sharply with the warming trends seen in the other two models and the observations.
  • All models capture the seasonal cycle in the upper ocean, but their sub-surface long-term evolution is dominated by differing initialization drifts.

Spatial Patterns

EN4 shows a realistic propagation of surface warming downwards over time. IFS-FESOM2-SR shows pervasive warming at all depths. IFS-NEMO-ER shows a 'sandwich' pattern: surface warming, intermediate cooling (1000-3000m), and neutral deep waters. ICON-ESM-ER shows a diffusive warming pattern peaking in the thermocline/intermediate waters (~1000m).

Model Agreement

Models diverge significantly below 1000 m. While all models reproduce surface seasonality and surface warming trends to some extent, they disagree on the sign of the intermediate-depth drift (IFS-NEMO-ER cools; IFS-FESOM2-SR and ICON-ESM-ER warm).
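One way to make the sign comparison quantitative is to fit a linear trend at each depth level and compare signs between models. The sketch below assumes annual (time, depth) anomaly arrays; the function name and the two synthetic "models" are hypothetical and only illustrate the calculation.

```python
import numpy as np

def drift_per_level(anom, years):
    """Least-squares linear trend (K per year) at each depth level of a (time, depth) array."""
    anom = np.asarray(anom, dtype=float)
    t = np.asarray(years, dtype=float)
    t = t - t.mean()  # centering the time axis keeps the fit well conditioned
    slope, _intercept = np.polyfit(t, anom, deg=1)  # one fit per depth column
    return slope

years = np.arange(1980, 1985)
elapsed = years - years[0]
# Synthetic drifts at two depth levels (illustrative only).
model_a = np.outer(elapsed, [0.02, -0.01])  # warms at level 0, cools at level 1
model_b = np.outer(elapsed, [0.03, 0.01])   # warms at both levels
same_sign = np.sign(drift_per_level(model_a, years)) == np.sign(drift_per_level(model_b, years))
```

A linear fit is only a first-order summary: an initialization shock followed by slow adjustment is not linear, so the fitted slope depends on the period chosen.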

Physical Interpretation

The plots reveal initialization shock and intrinsic model drift. Since anomalies are relative to the first timestep, they conflate the forced climate change signal (seen in EN4) with model equilibration. The deep warming in FESOM and ICON suggests an energy imbalance or excessive vertical heat uptake (mixing) relative to the initial state. IFS-NEMO's intermediate cooling might stem from water mass adjustment errors (e.g., related to Antarctic Intermediate Water or NADW formation) or insufficient vertical mixing compared to the initial climatology.

Caveats

  • Without a control run, separating forced climate trends from model drift is difficult.
  • The square-root-like depth scale visually compresses the deep ocean, potentially de-emphasizing the total heat content error in the large deep volume.
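The effect of the compressed depth axis can be illustrated by weighting a profile by layer thickness. This is a rough sketch with hypothetical layer interfaces and anomaly values; it also assumes constant ocean area with depth, which ignores hypsometry and therefore overstates the abyssal share somewhat.

```python
import numpy as np

def thickness_weighted_mean(anom_profile, bounds):
    """Mean of a depth profile weighted by layer thickness.

    `bounds` holds the layer interfaces and has one more entry than the profile.
    """
    anom_profile = np.asarray(anom_profile, dtype=float)
    thickness = np.diff(np.asarray(bounds, dtype=float))
    return np.sum(anom_profile * thickness) / np.sum(thickness)

bounds = np.array([0.0, 700.0, 2000.0, 6000.0])  # hypothetical interfaces in m
anom = np.array([0.5, 0.1, 0.1])                 # illustrative anomalies in K
contribution = anom * np.diff(bounds)            # K m per layer
deep_share = contribution[2] / contribution.sum()
upper_share = contribution[0] / contribution.sum()
```

In this synthetic case a 0.1 K anomaly in the 2000–6000 m layer contributes more to the column integral than a 0.5 K anomaly in the upper 700 m, which is the point of the caveat.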

Temperature Hovmoller (EN4-ref anomaly)

Variables thetao, so
Models IFS-FESOM2-SR, IFS-NEMO-ER, ICON-ESM-ER
Obs Dataset EN4 v4.2.2
Units K
Period 1980–2014

Summary high

Time-depth Hovmoller diagrams showing the evolution of global mean ocean temperature anomalies (relative to the EN4 initial reference profile) for three high-resolution coupled models compared to EN4 observations.
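A reference-profile anomaly of this kind might be computed as follows. The sketch assumes the EN4 reference is a single static depth profile that is linearly interpolated onto the model depth levels; the function name, the interpolation choice, and all numbers are assumptions made for illustration.

```python
import numpy as np

def reference_anomaly(model_temp, model_depth, ref_profile, ref_depth):
    """Anomaly of a (time, depth) model array relative to a static reference profile."""
    # np.interp requires ref_depth to be increasing; linear interpolation is an assumption.
    ref_on_model = np.interp(model_depth, ref_depth, ref_profile)
    return np.asarray(model_temp, dtype=float) - ref_on_model

ref_depth = np.array([0.0, 1000.0, 2000.0])    # m
ref_profile = np.array([18.0, 5.0, 3.0])       # illustrative reference temperatures
model_depth = np.array([0.0, 500.0, 2000.0])   # m
model_temp = np.array([[17.0, 11.5, 3.2],
                       [16.5, 11.0, 3.2]])     # two time steps, illustrative only
anom = reference_anomaly(model_temp, model_depth, ref_profile, ref_depth)
```

Because the reference is external to the model, the anomaly at the first time step need not be zero: the result contains the model's initial mean-state offset as well as its later drift.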

Key Findings

  • All three models exhibit a rapid and persistent cold drift in the upper ocean (0–500 m) relative to the EN4 reference, with anomalies below -1.5 K, far larger in magnitude than the observed climate warming signal.
  • ICON-ESM-ER displays a unique vertical dipole bias structure: a strong cold anomaly in the upper 500 m contrasted by a prominent warm anomaly band between 500 m and 2000 m.
  • IFS-FESOM2-SR shows the most intense near-surface cold bias, sharply confined to the upper ~500 m, while IFS-NEMO-ER exhibits a more diffuse cold bias extending down to ~1500 m.
  • The observational panel (EN4) confirms a slight upper-ocean warming trend over the 1980–2014 period, which is completely overwhelmed by the strong cold drift in the model simulations.

Spatial Patterns

The primary patterns are vertical. The two IFS-based simulations show a monotonic decay of the cold bias with depth, stabilizing below 2000 m. ICON-ESM-ER shows a more complex structure with a cold surface layer, a warm intermediate layer (500–2000 m), and slight abyssal warming (below 4000 m). Strong vertical striping in the upper ocean of the model panels reflects the model seasonal cycle, which the static reference profile does not contain.
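The seasonal striping can be separated from the slower drift by removing a mean monthly cycle. The sketch below assumes monthly (time, depth) data covering whole years; the function name and the synthetic series are hypothetical.

```python
import numpy as np

def remove_monthly_cycle(anom):
    """Subtract the mean seasonal cycle from a monthly (time, depth) array."""
    anom = np.asarray(anom, dtype=float)
    n_time, n_depth = anom.shape
    if n_time % 12 != 0:
        raise ValueError("expected a whole number of years of monthly data")
    by_year = anom.reshape(n_time // 12, 12, n_depth)
    climatology = by_year.mean(axis=0)  # mean for each calendar month, shape (12, depth)
    return (by_year - climatology).reshape(n_time, n_depth)

months = np.arange(24)
seasonal = np.sin(2.0 * np.pi * months / 12.0)  # synthetic seasonal cycle
drift = -0.01 * months                          # synthetic cooling drift
anom = (seasonal + drift)[:, None]              # one depth level
residual = remove_monthly_cycle(anom)
```

In this example the sinusoid is removed entirely and the residual keeps only the year-to-year step of the drift (+0.06 in the first year, -0.06 in the second), since the within-year part of a trend is absorbed into the monthly climatology.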

Model Agreement

Models agree on the sign of the surface drift (cooling), indicating a systematic error in the surface energy budget or upper-ocean mixing that is common to all three configurations. They disagree significantly on subsurface behavior, particularly the intermediate-depth warming seen only in ICON-ESM-ER.

Physical Interpretation

The pervasive surface cold bias suggests a negative net surface energy imbalance, potentially driven by excessive low cloud reflection or insufficient solar absorption, which is a common tuning challenge in high-resolution coupled models. The rapid onset implies an initialization shock. ICON's intermediate warming suggests excessive downward heat transport or issues with the formation and ventilation of intermediate water masses (e.g., anomalies in overflow parameterizations).

Caveats

  • Global averaging obscures regional bias patterns (e.g., North Atlantic vs. Tropical Pacific).
  • Anomalies are relative to a static EN4 reference profile, so they conflate mean-state bias and model drift with actual interannual variability and climate trends.