Understanding the climate system is difficult because it depends on a wide range of length scales and timescales. Analyzing Earth system model (ESM) data, such as that from the Energy Exascale Earth System Model (E3SM), is further complicated by the complex interactions between the different components of the model. Indeed, most of climate science focuses on single-component analysis, e.g., ocean or sea ice. At present, ESMs are constructed to understand average statistics, such as total conservation of heat and moisture. However, sophisticated dynamics such as heat and moisture transport in coupled system simulations remain elusive. Discovering sophisticated latent climatic signals in fully coupled E3SM data is a challenging task for a number of reasons. Principal among them, we do not yet fully understand the latent signatures within each component, or how these signals will be amplified when information is shared between components.
In this project, we developed an interpretable unsupervised machine learning technique for diagnostics of the E3SM MPAS-O model. Here, we have temperature and salinity data around the Southern Ocean. Depending on the length of the simulation, the model can accumulate discrepancies. Ideally, we would like to automatically identify physically relevant instabilities within the model. Pinpointing how key water masses evolve with time can help provide these insights. We perform unsupervised clustering with a Gaussian mixture model to separate the data into classes. We then analyze where these classes reside by depth. By looking at key overlapping depth regions, one can interpret the vertical mixing and identify which clusters belong to the important water masses. Comparing the water masses across two time periods provides insight into how the model has drifted over time. These insights are leveraged alongside traditional nonnegative tensor factorization techniques to discover latent climate signatures.
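The clustering and depth analysis described above can be illustrated with a minimal sketch. It is not the project's code: it assumes scikit-learn's GaussianMixture as one possible implementation, uses synthetic data in place of MPAS-O output, and fixes the number of classes at three for illustration. The function names (make_synthetic_profiles, cluster_water_masses, depth_summary), the water-mass values, and the 0.5 degC warming applied to the later period are all hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def make_synthetic_profiles(seed, warm_shift=0.0, n_per_mass=300):
    """Hypothetical stand-in for model output: columns are (T, S, depth)."""
    rng = np.random.default_rng(seed)
    # (mean temperature [degC], mean salinity [psu], depth range [m]);
    # illustrative values only, not taken from E3SM.
    masses = [
        (8.0 + warm_shift, 34.2, (0.0, 300.0)),
        (3.0, 34.4, (200.0, 1200.0)),
        (0.5, 34.7, (1000.0, 4000.0)),
    ]
    blocks = []
    for t_mean, s_mean, (z_top, z_bot) in masses:
        temp = rng.normal(t_mean, 0.3, n_per_mass)
        salt = rng.normal(s_mean, 0.03, n_per_mass)
        depth = rng.uniform(z_top, z_bot, n_per_mass)
        blocks.append(np.column_stack([temp, salt, depth]))
    return np.vstack(blocks)


def cluster_water_masses(data, n_classes=3, seed=0):
    """Fit a GMM on standardized (T, S); depth is held out for interpretation."""
    features = data[:, :2]
    scaled = (features - features.mean(axis=0)) / features.std(axis=0)
    gmm = GaussianMixture(n_components=n_classes, covariance_type="full",
                          n_init=5, random_state=seed)
    return gmm.fit_predict(scaled)


def depth_summary(data, labels):
    """Per class: mean T, mean S, and the 5th-95th percentile depth range."""
    summary = []
    for k in np.unique(labels):
        members = data[labels == k]
        z_lo, z_hi = np.percentile(members[:, 2], [5, 95])
        summary.append({
            "count": int(len(members)),
            "mean_temp": float(members[:, 0].mean()),
            "mean_salt": float(members[:, 1].mean()),
            "depth_range": (float(z_lo), float(z_hi)),
        })
    # Order classes from shallow to deep so they can be matched across periods.
    return sorted(summary, key=lambda c: c["depth_range"][0])


# Two time periods; the later one has an assumed surface warming of 0.5 degC.
early = make_synthetic_profiles(seed=1)
late = make_synthetic_profiles(seed=2, warm_shift=0.5)

early_summary = depth_summary(early, cluster_water_masses(early))
late_summary = depth_summary(late, cluster_water_masses(late))

# Drift in mean temperature for each depth-matched class.
temp_drift = [b["mean_temp"] - a["mean_temp"]
              for a, b in zip(early_summary, late_summary)]

for cls, drift in zip(early_summary, temp_drift):
    z_lo, z_hi = cls["depth_range"]
    print(f"depth {z_lo:7.0f}-{z_hi:7.0f} m  temperature drift {drift:+.2f} degC")
```

Depth is deliberately excluded from the features in this sketch, so the depth range of each class is a diagnostic used to interpret the clusters rather than an input that shapes them. Matching classes by depth order is a simplification that only holds when the classes are well separated in depth; real model output would likely need a more careful matching step.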