Watch your proxy
Replacing observational data with a modeled proxy creates questions about logic
Some things cannot be studied directly, so a proxy may be used as a stand-in. It is difficult to measure student aptitude or studiousness, so a common proxy with which we are all familiar is student GPA. People argue about what GPA actually communicates, though.
We cannot directly observe the weather of thousands of years ago to describe the climate. Instead, scientists examine isotopes as a proxy for temperature to estimate the long-ago climate.
The reason GPA and isotopes are generally accepted as proxies is their transparency. A student’s GPA and their underlying grades are documented over the course of their schooling. Anyone can calculate their GPA. Measurement of chemical isotopes rests on a well-understood aspect of chemistry and uses a mass spectrometer, a common machine in labs around the world. The use of isotopes as a proxy in the study of past climate is common and has been for a long time.
But what if instead of using an individual student’s GPA or measurements of isotopes as proxies, we started using predictions of GPA and isotopes as proxies? That seems like a very different situation.
Suddenly, the proxy is not so transparent, and any research finding is contingent on whatever assumptions went into creating the proxy. Whoever controls the model controls the proxy and thereby heavily influences the potential conclusions drawn about individual students and past climates.
A paper in Nature Climate Change crossed my desk that drew my attention for this reason: Observations reveal changing coastal storm extremes around the United States.
The title suggests that something has been discovered by examining observational data. Big discoveries, actually.
From the abstract:
Here, using a spatiotemporal Bayesian hierarchical framework, we analyse US tide gauge record for 1950–2020 and find that observational estimates have underestimated likelihoods of storm surge extremes at 85% of tide gauge sites nationwide. Additionally, and contrary to prevailing beliefs, storm surge extremes show spatially coherent trends along many widespread coastal areas, providing evidence of changing coastal storm intensity in the historical monitoring period.
Not only do the authors substantially increase the probabilities of storm surge extremes and storm intensity, they also overturn a widespread understanding among scientists in this field.
The use of “observational estimates” is a bit of poetic license, however. The study makes its findings through the use of modeled proxies, which leaves one to wonder what, if anything, is actually found.
Let’s take a look.
The premise of the study sets up the problem to be addressed and its underlying logic.
Currently, there is a strong scientific consensus on historical long-term MSL [mean sea level] changes from observational data and data-extended reconstructions (6–8) and how such changes have made extreme sea levels more frequent (9,10).
However, estimates of extreme storm surge events remain highly uncertain (11). Furthermore, there is a lack of observational evidence about their underlying long-term changes (12–21), which is a proxy for changes in coastal landfalling storms (21–23), including hurricanes (24, 25).
According to the authors, there are as yet undetectable (and therefore assumed) changes in storm surge and landfalling hurricanes, and these are important when using storm surge as a proxy for landfalling hurricanes.
Of course, this also assumes that it is appropriate to use storm surge as a proxy for landfalling storms/hurricanes instead of just studying landfalls directly. 1
The set-up here is that predictions of extreme storm surge are more reliable for estimating hurricane landfall rates than the observational record of landfalls. The purpose of the paper is therefore to find the changes in extreme storm surge so that new probability estimates of extreme storm surge and hurricane landfalls can be calculated.
The authors say as much at the end of the paper:
When interpreted as a proxy of changes in coastal storm intensity, our results add a new perspective to the ongoing scientific enquiry into the long-term changes in storm activity along the US Atlantic coast using historical best-track data, downscaled reanalysis and model projections, and how that could lead to changes in extreme storm surge climate.
Let’s look at the methodology.
A key facet of the methodology presented in this paper is the use of the BAYEX model with something called the “skew surge.” The authors write:
We inform BAYEX using historical time series of annual maxima skew surge here estimated from hourly sea-level records across 208 coastal tide gauge locations, leading to >6,900 annual maxima in total (Extended Data Figs. 1 and 2) (see Methods for extended description of tide gauge observations used and model description).
Skew surge, defined as the absolute difference between the observed high-water level and closest predicted high tide regardless of their timing over the tidal cycle 31, has been shown to be a robust metric of storm surge in different tidal regime (Methods) 31; henceforth, it is used synonymously with storm surge. [emphasis mine]
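To make the quoted definition concrete, here is a minimal sketch of how a skew surge could be computed from two hourly series, one observed and one predicted. This is my own illustration, not the authors' code: the function name, the six-hour search window, and the synthetic tide are all assumptions, and I read “absolute difference” as a plain height difference between the observed high water and the predicted high tide.

```python
import math

def skew_surges(observed, predicted, half_window=6):
    """Return one skew surge (metres) per predicted high tide.

    observed, predicted: equal-length lists of hourly water levels.
    half_window: hours searched either side of each predicted high tide
    (an assumed value for illustration, not taken from the paper).
    """
    surges = []
    for i in range(1, len(predicted) - 1):
        # Treat a local maximum of the predicted series as a high tide.
        if predicted[i - 1] < predicted[i] >= predicted[i + 1]:
            lo = max(0, i - half_window)
            hi = min(len(observed), i + half_window + 1)
            # Observed high water, wherever it falls inside the window.
            observed_high = max(observed[lo:hi])
            surges.append(observed_high - predicted[i])
    return surges

# Synthetic data: a 12-hour tide of 1 m amplitude, with a hypothetical
# 0.5 m surge arriving shortly before the high tide at hour 24.
hours = range(48)
predicted = [math.cos(2 * math.pi * h / 12) for h in hours]
observed = [p + (0.5 if 21 <= h <= 23 else 0.0)
            for h, p in zip(hours, predicted)]

surges = skew_surges(observed, predicted)
annual_max_like = max(surges)  # the paper works with annual maxima
```

In this toy case the surge peaks an hour before the predicted high tide, so the skew surge for that cycle comes out at roughly 0.37 m, not the full 0.5 m. The measure compares the two high-water heights and ignores when each occurred.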
Skew surge may or may not be an appropriate means to address limits in surge data itself. Most significant here, however, is that the observed surge data has been replaced with the proxy, “skew surge,” and this measure is dependent on the predicted values produced by BAYEX.
The difference between what BAYEX predicts and what would otherwise be estimated via an “at-site estimate” (which isn’t well defined in the paper) is large in some cases and small in others. You can see the comparison for some sites in the image below, taken from Supplementary Information Figure 3. The pink is BAYEX and the green is at-site.
The authors report that the skew surge produces dramatically different estimates of extreme storm surge than one would find using just the observational record. Use of the BAYEX-dependent proxy produces anywhere from a doubling to a sixfold increase in the probability of an event’s occurrence or exceedance.
These results show that extreme surge events currently estimated to have a 1% AEP are, in fact, far more likely to occur in any given year in many coastal places; when a median is taken across tide gauge sites, extreme surge events that have a 1% AEP (100-year return period) based on at-site analysis have in fact a 2.13% AEP (47-year return period) within any given year. For example, hurricane Helene (2024) caused a skew surge of 1.91 m at St. Petersburg (Florida). From traditional at-site estimates, this corresponds to a 1,450-year return period event whereas according to BAYEX this is over six times more likely to occur (225-year return period event).
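The arithmetic behind those figures is easy to check. The short sketch below converts annual exceedance probability (AEP) to return period and back, using only the numbers quoted above. The 30-year horizon is a hypothetical I picked for illustration, and the multi-year calculation assumes each year is independent of the others.

```python
def return_period(aep):
    """Return period in years for a given annual exceedance probability."""
    return 1.0 / aep

def chance_within(aep, years):
    """Chance of at least one exceedance in `years`, assuming independent years."""
    return 1.0 - (1.0 - aep) ** years

at_site_aep = 0.01    # the "100-year" event from at-site analysis
bayex_aep = 0.0213    # the same event according to BAYEX (median across sites)

bayex_return_period = return_period(bayex_aep)   # about 47 years

# Helene at St. Petersburg: 1,450-year vs 225-year return period.
helene_ratio = 1450 / 225                        # a little over six

horizon = 30  # hypothetical planning horizon in years, not from the paper
at_site_chance = chance_within(at_site_aep, horizon)
bayex_chance = chance_within(bayex_aep, horizon)

print(f"BAYEX return period: {bayex_return_period:.0f} years")
print(f"Helene likelihood ratio: {helene_ratio:.1f}x")
print(f"Chance in {horizon} years: {at_site_chance:.0%} vs {bayex_chance:.0%}")
```

The quoted numbers are internally consistent: a 2.13% AEP is a 47-year return period, and 1,450 divided by 225 is a bit more than six.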
I’m no statistician, but it seems to me that if you are estimating a two- to sixfold increase by way of a proxy, there should be some rather notable trend in the underlying observations themselves.
The headline, the one captured in the title of the paper, is that something was uncovered in the observational data. That doesn’t appear to be the case here, though.
The authors invented an entirely new data series, dependent on modeled prediction, to use as a proxy. The findings are drawn from the proxy.
This matters for how the authors seek to use the data to inform coastal risk management and, of course, detection and attribution studies.
The authors write,
Importantly, the observational storm surge trend patterns we detect probably reflect long-term changes in associated atmospheric forcing (due to combined effects of multidecadal climate variability and anthropogenic forcing) and possible interactions with rising MSLs. However, a robust attribution is not possible based on the limited (historical) climate information provided within the observational data. This would require (large) ensemble climate simulations, using high-resolution ocean and atmospheric forcing from detection and attribution climate experiments (such as single model initial-condition large ensemble based on multiple models), downscaled into (long) time series of storm surge extremes to: (1) separate anthropogenic and natural forcing contributions and (2) quantify underlying interactions between changes in MSL and storm surge extremes. This type of assessment, while beyond the scope of this study, could help address, for example, the challenging scientific question of attributing changes in storm activity over the North Atlantic basin where (long-term) changes have been attributed to a combination of natural climate variability and anthropogenic aerosol emissions, which potentially obscured GHG warming contributions.
This proposes that the study of storm surge and hurricanes be mediated through the use of the skew surge proxy, which is a product of the BAYEX model.
I want to emphasize the argument presented in this paper:
Observed storm surge data is incomplete and presents uncertainty.
Observed hurricane data presents uncertainties.
We will use a model to produce a proxy for storm surge data to use as a proxy for hurricane data.
This line of reasoning presents a whole mess of new uncertainties that are difficult to track, measure, and make transparent, and that far exceed the rather basic and transparent uncertainties about the observed record itself.
Why bother with this?
This study has the potential to have a long shelf life in ways that are hard to overcome and hard to shed light on.
This is because the methods proposed here can, and I suspect will, become part of the risk modeling practices of private industry and their marketing campaigns, as well as advocacy campaigns, and both then become part of the media stream.
What people will see is that the risk estimate uses methods published in Nature Climate Change, an elite journal. The underlying premise and assumptions, however contrived, get lost in the commotion.
I welcome feedback.
There is an implicit argument here about the observational record of hurricane data. This is what the authors are getting at in the references below, and it is also a reason I was interested in the paper. There are a select few who argue for truncating the full landfalling data series. Coincidentally, this supports select narratives about hurricane risk and climate change, which select business and advocacy interests may find useful.
#21 uses surge as a proxy for storms. The authors argue that the long term trend in the surge data is attributed to mean sea level rise. In accounting for SLR, they find no long term trend in storms.
I am cited at #22 on work with Roger Pielke, Jr and Ryan Maue on a paper counting hurricane landfalls around the world. We find no trend overall and a trend in the Southern ocean off of Africa, where the data series was short. We don’t do anything with storm surge. Roger and Ryan update that data regularly.
#23 also does not use surge as a proxy for storms, does not mention surge, and finds no long term trends in hurricane landfalls.
#24 has a fun history here. The authors do use storm surge as a proxy for hurricane landfalls and in doing so develop an estimate of 3x as many hurricane landfalls as the observed record.
#25 is a study of normalized hurricane damage.
Hi Jessica,
Thank you for covering our work. A few months ago, I came across your blog, and I have been appreciating the different perspectives you bring—especially in a field that can at times feel like an echo chamber.
I wanted to offer a bit of clarification for you and your readers, as I think there may have been some misunderstanding about the scope and intent of our work.
The issue that seems to have drawn the most attention relates to whether our findings imply changes in the frequency or intensity of landfalling storms. I fully agree that using proxies to detect changes in hazards can be problematic, and we were careful not to overstate our findings in that regard. Unfortunately, that nuance may not have come through clearly.
You write: “The set-up here is that predictions of extreme storm surge are more reliable for estimating hurricane landfall rates than the observational record of landfalls. The purpose of the paper is therefore to find the changes in extreme storm surge so that new probability estimates of extreme storm surge and hurricane landfalls can be calculated.”
That is not an accurate characterization of our intent—and I agree it would be quite a stretch for us to claim something like that.
You may be reading too much into this sentence from our paper: “When interpreted as a proxy of changes in coastal storm intensity, our results add a new perspective to the ongoing scientific enquiry into the long-term changes in storm activity along the US Atlantic coast using historical best-track data, downscaled reanalysis and model projections, and how that could lead to changes in extreme storm surge climate.”
Our intent of this sentence was to suggest that, if storm surge extremes are interpreted in that way, they offer another line of evidence—imperfect, of course—but one that complements other sources of information. We are not suggesting storm surge is a direct or exclusive proxy for hurricane landfalls. Storm surge is influenced by both tropical and extratropical systems, and significant events can occur even when hurricanes stay offshore.
While we do offer possible mechanisms and some interpretation, we stop well short of making any causal claims about the trends we detect. It would be incorrect to interpret our findings as evidence of changes in hurricane behavior, and we explicitly state this in the paper. That said, some of the coverage and discussion on social media has framed it that way, which is unfortunate—though, as you know, not uncommon in our field.
There also seems to be some misunderstanding regarding our use of skew surge, particularly the suggestion that “the observed surge data has been replaced with the proxy, ‘skew surge.’”
To clarify: skew surge is not a proxy. It is a direct measure of storm surge, and has been used for decades by engineers and coastal scientists to quantify storm-driven anomalies in water levels. No modeled or synthetic data is used—our analysis is based entirely on observed skew surge values at tide gauges.
Skew surge isolates the meteorological component of storm surge by removing the astronomical tide and mean sea level from the total water level. This makes it more suitable for statistical distribution fitting, where including coincidental astronomical tides would obscure the actual meteorological signal. While the term may sound abstract, skew surge is in fact one of the most straightforward and physically meaningful ways to isolate storm-driven coastal water level anomalies. We recognize that this distinction may have caused some confusion, and we’ll aim to be clearer in future work.
Additionally, we acknowledge that the phrase “at-site estimate” may not have been adequately defined in the paper. By at-site, we simply mean that estimates are based only on data from the tide gauge itself. In contrast, BAYEX pools data from multiple regional sites to better characterize the local storm surge climatology—much like how NOAA pools rainfall data to produce extreme precipitation maps (e.g., the Atlas series). The difference is that our method doesn’t rely on predefined regional boundaries and instead incorporates parameter uncertainty directly.
Pooling data helps capture the tail behavior of extreme surge distributions—events that may not yet have occurred at a specific tide gauge but are plausible over long time horizons. In a forthcoming paper, we compare our results with FEMA’s surge estimates and find strong agreement. We simply take a different path—one grounded solely in observational data, with transparent uncertainty quantification.
I am happy to answer any further questions you or your readers may have.
Take care,
D.J.
As a scientist, I call BS. As a taxpayer (assuming at least some of this nonsense had federal funding), I say Go DOGE!