Weather Station Point Interpolation Analysis

Steven Aarons, Hunter College

INTRODUCTION

NWP and Traditional Weather Forecasting

The majority of weather forecasts are created using Numerical Weather Prediction (NWP) models. These models first assimilate multiple sources of data, including geostationary and polar-orbiting satellites, weather stations, ground-radar stations, radiosondes, and more (NOAA, 2023). This data is used to construct a model of the current conditions of the atmosphere and oceans, known as the initial state. Then, using physics-based equations, the NWP model computes the fluid dynamics of the atmosphere one timestep into the future (Shuman, 1989). These equations are non-linear and are therefore discretized across a 3D grid to derive useful results, meaning a partial derivative is solved at each grid box. This process is repeated until the desired forecast is created. NWP models require high computational power and use supercomputers to solve these equations.
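As a toy illustration of the grid-and-timestep idea (a one-variable sketch under simplifying assumptions, not an actual NWP scheme), a 1D linear advection equation can be discretized on a grid and stepped forward repeatedly:

```python
import numpy as np

# Toy 1D linear advection: du/dt = -c * du/dx, discretized with an
# upwind finite-difference scheme on a periodic grid. Illustrative only;
# real NWP models solve coupled non-linear equations on a 3D grid.
def step(u, c, dx, dt):
    # Upwind difference (assumes c > 0): du/dx ~ (u[i] - u[i-1]) / dx
    dudx = (u - np.roll(u, 1)) / dx
    return u - c * dt * dudx

u = np.zeros(100)
u[40:60] = 1.0            # initial state: a square pulse
for _ in range(50):       # repeat, one timestep at a time
    u = step(u, c=1.0, dx=1.0, dt=0.5)
```

Each call to `step` advances the state by one timestep, and, as in the text, the discretization error compounds with every repetition.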

Figure 1. Simplified representation of NWP data processing pipeline (NOAA, 2023).

The largest challenge in weather forecasting is maintaining high accuracy when forecasting far into the future. Multiple factors lead to decreased accuracy with increasing lead-time, or how far into the future the weather is forecasted. The physics-based equations can only account for atmospheric processes that occur at a large scale, covering one or more grid boxes. These equations are also typically solved explicitly, meaning each new state is computed directly from the known values at the previous timestep. Depending on the order of the time-integration scheme, the solution is approximated to a degree to reduce computational cost, and these approximations compound each time a new timestep is computed.

However, the main problem lies in atmospheric processes that are either too complex to simply model with an equation, or occur at scales smaller than the grid spacing, meaning they cannot be properly discretized. Some examples of these processes include: radiative transfer schemes, convection, turbulence, cumulus cloud formation, cloud microphysics, and more (Arakawa, 2004). Instead, these processes and their effects are estimated in what is known as parameterization. Unfortunately, many of the processes that contribute to precipitation occur at the sub-grid scale and therefore make predicting precipitation a challenge even with all the advancements in NWPs. These factors along with the computational constraint that comes with attempting to increase the horizontal resolution (or decrease the grid spacing) leads to forecasts that cannot accurately model the weather with a lead time of over 7-10 days on average.

“A seven-day forecast can accurately predict the weather about 80 percent of the time and a five-day forecast can accurately predict the weather approximately 90 percent of the time. However, a 10-day…forecast is only right about half the time” (NOAA SciJinks, 2023).

Figure 2. Forecast Skill vs. Forecast Range (lead-time) for different types of forecasts (Lukas, 2020).

Figure 2 shows the different types of forecast models and their skill (i.e., accuracy) versus their lead time. Weather forecasts, which are the focus of this research, have very high skill that decreases very quickly with lead-time. This differs from climate forecasts, which have lower skill but maintain it for much longer lead times.

Previous work involved analyzing parameterization schemes for the Weather Research and Forecasting Advanced Research Model (WRF-ARW) in a case study. This focused on a weather forecasting failure on March 2, 2018, when a major snowstorm unexpectedly brought inches of snow to the Hudson Valley and Northeastern United States. The National Weather Service blended warm and cold operational forecasts and created a forecast that predicted the precipitation in the region as a blend of snow and rain. However, the predicted snow depth in the Albany area differed from observations by approximately 8-12 inches. As a result of the storm, five people lost their lives and over a million customers lost power across the United States (Wochit, 2018).

Deep Learning for Weather Forecasting

Deep learning (DL) algorithms derive a relationship between variables in a large database to predict a selected variable. DL does not need to understand the physical relationships that create atmospheric conditions and instead derives its own equations between variables. This non-numerical processing can lower the computational cost, as opposed to computing physics-based partial differential equations, and can better estimate processes that would otherwise be parameterized by taking historical data into account. However, in the same vein, a DL model can output a forecast that does not adhere to the laws of physics, such as conservation of mass or energy (Lockwood et al., 2024).

As a part of the CUNY NSF REU in Remote Sensing, under the mentorship of Yanna Chen from the University at Albany, a dual convolutional neural network (CNN) model in the U-Net architecture was created to predict precipitation and compare the accuracy of those predictions with observational data. The first part of the dual model is a classification model which produces the probability of rain or no rain at each grid point (Badrinath, 2023). This probability is thresholded at 0.5, which is common with sigmoid activation functions in the last layer of a neural network, and a binary array of predicted rain and no-rain points is constructed. That binary mask is then applied to the output of the second part of the dual model, a continuous regression U-Net, which predicts the numerical amount of precipitation at each grid point. The mask zeroes out any amounts the regression U-Net predicts at grid points classified as "no rain."
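The thresholding and masking steps can be illustrated with toy arrays (hypothetical values, not the actual model outputs):

```python
import numpy as np

# Sketch of combining the two U-Net outputs: the classifier's
# probabilities are thresholded at 0.5 to build a binary rain / no-rain
# mask, which zeroes out the regression U-Net's amounts wherever
# "no rain" was predicted. Values below are made up for illustration.
prob = np.array([[0.9, 0.2], [0.6, 0.4]])    # classification U-Net output
amount = np.array([[3.1, 1.2], [0.8, 2.5]])  # regression U-Net output (mm)

mask = (prob >= 0.5).astype(float)           # binary rain / no-rain array
precip = amount * mask                       # masked precipitation field
```

Here `precip` keeps the regression amounts only where rain was classified, e.g. the 1.2 mm and 2.5 mm values are zeroed out.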

Figure 3. U-net Architecture

Searching for Ground Truth

Currently the model uses the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 reanalysis dataset for both the input features and the labels we wish to predict, such as total precipitation. In order to post-process NWP output to be more in line with observational data, the labels need to be of the same size and shape as the features, particularly for the U-Net architecture (see Figure 3, above).

Weather station data is collected at a particular location, making it point data. Weather forecasts cover the entire domain, or region, that they are trying to forecast, in a grid or array format. The value depicted in a grid box may be the value at the center point of the grid box or the average value for the entire grid box, depending on the NWP model and the grid model used. Regardless, the point data needs to match the array shape and size of the forecast we wish to post-process.

Interpolation is the process of estimating values at unsampled locations between given points. To determine the best method for converting weather station point data to raster data, we analyze a number of interpolation techniques through the Geostatistical Analyst extension in ArcGIS Pro. To gauge a model's predictive power we use cross-validation, a leave-one-out resampling method: the model is first fitted with all points to set its parameters; then each point is removed in turn, its value is predicted from the remaining points, and the error against the actual value is calculated. A visual representation of cross-validation is shown in Figure 4.

Figure 4. A red point is hidden, and the value is predicted from the remaining points. This process is repeated for all points (ArcGIS Pro).
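The leave-one-out loop can be sketched in a few lines, here using a simple inverse-distance-weighted interpolator as a stand-in for the Geostatistical Analyst models (coordinates and values are hypothetical):

```python
import numpy as np

# Leave-one-out cross-validation: hide one station at a time, predict
# its value from the remaining stations, and collect the errors.
def idw_predict(xy, z, x0, power=2):
    d = np.linalg.norm(xy - x0, axis=1)       # distances to hidden point
    w = 1.0 / d**power                        # inverse-distance weights
    return np.sum(w * z) / np.sum(w)

def loo_errors(xy, z):
    errs = []
    for i in range(len(z)):
        keep = np.arange(len(z)) != i         # hide the i-th station
        pred = idw_predict(xy[keep], z[keep], xy[i])
        errs.append(pred - z[i])              # predicted minus measured
    return np.array(errs)

xy = np.array([[0, 0], [0, 1], [1, 0], [1, 1.]])  # toy station locations
z = np.array([1.0, 2.0, 2.0, 3.0])                # toy measured values
rmse = np.sqrt(np.mean(loo_errors(xy, z) ** 2))   # summary error
```

Summarizing the per-point errors (e.g. as RMSE) is how the interpolation methods are compared later in the Results section.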

DATA & METHODOLOGY

MESONET MAP AND DEM

Figure 5. NYS Mesonet Weather Stations overlaid on the NYS Digital Elevation Model (NYS Mesonet).

NYS Mesonet

The NYS Mesonet Weather Station Network consists of 127 weather stations, spaced an average of 17 miles apart with at least one site in every county and borough (Mesonet). The locations of the weather stations are shown in the figure above. The data is available in 5-minute or hourly intervals; hourly data was selected for temporal alignment with NWP model output. The NYS Mesonet was installed in 2014, and while neural networks typically require several decades of data, other precipitation datasets, including ASOS, HPD, and COOP, which date back to the 1950s, were found to be full of null values. This is expected, as sensor quality has advanced over time; however, even more recent data, starting in 2009, was found to be mostly null. These other datasets also had the disadvantage of having fewer locations than the NYS Mesonet.

Elevation Data

Certain interpolation methods, like Empirical Bayesian Kriging (EBK) Regression Prediction, require an explanatory raster. For this, the ⅓ arc-second (approximately 10 m resolution) Digital Elevation Model (DEM) from The National Map was acquired (The National Map download). The NYS State Boundary shapefile from NYS GOV civil boundaries was also procured (NYS GOV Civil).

Methodology

The Mesonet data was acquired in NetCDF format. Because of an error reading the columns with the NetCDF Points to Feature Class tool, the data was converted to .csv format through the xarray Python package before importing into ArcGIS Pro. The data variables from the Mesonet file are shown in Table 1, below.

Table 1. NYS Mesonet Selected Variables
# Variables Units
1 Latitude, longitude, and elevation degrees; degrees; m
2 Maximum/minimum/average air temperature at 2 meters (6 feet) K
3 Maximum/minimum/average relative humidity at 2 meters (6 feet) percent
4 Maximum/minimum/average dewpoint at 2 meters (6 feet) K
5 Liquid-equivalent precipitation, maximum 1-hour accumulation mm
6 Average wind speed and direction, wind maximum, wind speed standard deviation, and wind direction standard deviation from the best available sensor at 10 meters (33 feet) m/s
7 Total incoming solar radiation at 2 meters (6 feet) MJ/m2
8 Maximum/minimum/average soil moisture at 5, 25, and 50 cm below the ground m3/m3
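The NetCDF-to-CSV conversion described above can be sketched with xarray; the variable and station names below are hypothetical stand-ins, and a small in-memory dataset takes the place of the downloaded Mesonet file:

```python
import numpy as np
import xarray as xr

# Tiny in-memory dataset standing in for the downloaded NetCDF file;
# the real files carry the fields listed in Table 1.
ds = xr.Dataset(
    {"precip_total": (("time", "station"), np.array([[0.0, 1.2]]))},
    coords={"time": [np.datetime64("2018-02-02T00:00")],
            "station": ["ALBA", "BRON"]},  # hypothetical station IDs
)

df = ds.to_dataframe().reset_index()          # flatten (time, station) to columns
df.to_csv("mesonet_201802.csv", index=False)  # importable by ArcGIS Pro
```

`to_dataframe` turns the gridded (time, station) dimensions into a flat table, one row per station-hour, which sidesteps the column-reading error in the Feature Class tool.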

The data spans February of 2018. First, null values were selected and deleted, leaving 119 weather station points per hour from the original 127. These null values may be weather stations that were not yet online as of 2018 or the result of sensor errors. A single hour, 02/02/2018 12:00 AM, whose distribution of rain to no-rain points was approximately equal to that of the month, was selected for the interpolation analysis.
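The filtering and hour-selection steps can be sketched in pandas (column names are hypothetical stand-ins for the converted Mesonet CSV fields):

```python
import pandas as pd

# Toy frame standing in for the converted Mesonet CSV.
df = pd.DataFrame({
    "station": ["A", "B", "C", "D"],
    "time": pd.to_datetime(["2018-02-02 00:00"] * 4),
    "precip": [0.0, 1.4, None, 0.2],   # None: offline station / sensor error
})

df = df.dropna(subset=["precip"])             # delete null precipitation rows
hour = df[df["time"] == "2018-02-02 00:00"]   # pull out the analysis hour
rain_frac = (hour["precip"] > 0).mean()       # rain / no-rain ratio that hour
```

Comparing `rain_frac` per hour against the month-long ratio is one way to pick an hour whose rain distribution matches the month, as described above.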

Co-Kriging interpolation allows up to four additional variables to refine the model, better estimating the autocorrelation of the selected data field (precipitation) and the cross-correlation between variables. To select the best variables to help predict precipitation, the Exploratory Regression tool was used. This tool evaluates all possible combinations of explanatory variables by creating OLS models to best explain the dependent variable. The summary of variable significance is shown below in Table 2. The percent positive or negative refers to the direction of the correlation.

Table 2. Exploratory Regression Summary
Variable % Significant % Negative % Positive
Maximum Temperature 2m 95.56 5.93 94.07
Minimum Dewpoint 89.55 48.51 51.49
Average Relative Humidity 89.33 7.33 92.67
Average Dewpoint 85.82 66.42 33.58
Maximum Dewpoint 85.19 84.44 15.56
Average Temperature 2m 83.58 29.85 70.15
Maximum Relative Humidity 81.46 47.68 52.32
Minimum Temperature 2m 79.26 58.52 41.48
Minimum Relative Humidity 77.33 8.67 91.33

The top three most significant variables were selected for Co-Kriging: maximum temperature at 2 m, minimum dewpoint, and average relative humidity.

To compare different interpolation models with default and optimized parameters, the Exploratory Interpolation tool was run.

Interpolation

Between interpolation models, there are many factors to consider. Interpolation methods can either be classified as deterministic or geostatistical. Deterministic methods such as Inverse Distance Weighted (IDW) and Spline are directly based on surrounding measured values or on mathematical formulas that determine smoothness of the output surface. Geostatistical methods, such as Kriging, are statistical models that incorporate spatial autocorrelation, or the statistical relationships between measured points.

The general formula for IDW and Kriging is shown in Equation 1, below:

$$\hat{Z}(s_0) = \sum_{i=1}^{N} \lambda_i \, Z(s_i)$$

Equation 1. Where Ẑ(s0) is the predicted value at the prediction location s0; Z(si) is the measured value at the ith location; λi is an unknown weight for the measured value at the ith location; and N is the number of measured values.

In IDW, the weight, λi, is determined solely by the distance from points to the prediction location.
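A minimal sketch of computing these IDW weights (hypothetical coordinates; p = 2 is a common default power, not necessarily the tool's setting):

```python
import numpy as np

# IDW weights for Equation 1: lambda_i is proportional to
# 1 / d(s_i, s_0)^p, normalized so the weights sum to one.
def idw_weights(points, s0, p=2):
    d = np.linalg.norm(points - s0, axis=1)   # distances to prediction location
    w = 1.0 / d**p
    return w / w.sum()                        # normalized lambda_i

pts = np.array([[0.0, 0.0], [0.0, 2.0], [3.0, 0.0]])  # measured locations
lam = idw_weights(pts, np.array([0.0, 1.0]))          # prediction location s0
z_hat = lam @ np.array([5.0, 7.0, 6.0])               # Z-hat(s0) = sum lambda_i Z(s_i)
```

Note how the two nearby points dominate the weighted sum while the distant third point contributes little.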

In Kriging, the weights depend on the distance to the prediction location, a fitted model to the measured points, and the spatial relationships between measured values near the prediction location. Kriging estimates spatial autocorrelation through fitting a semivariogram and computing covariance functions. The empirical semivariogram function is shown below:

$$\gamma(h) = \frac{1}{2\,|N(h)|} \sum_{(i,j) \in N(h)} \left( Z(s_i) - Z(s_j) \right)^2$$

Equation 2. At points separated by distance h, the semivariogram function calculates half the average of the squared differences between the values at those points; N(h) denotes the set of point pairs separated by h.
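The empirical semivariogram can be computed directly from the data; a minimal numpy sketch with hypothetical points and distance bins:

```python
import numpy as np

# Empirical semivariogram: for point pairs whose separation falls in a
# distance bin, gamma is half the mean squared difference of their values.
def semivariogram(xy, z, bins):
    d = np.linalg.norm(xy[:, None] - xy[None, :], axis=-1)  # pairwise distances
    sq = (z[:, None] - z[None, :]) ** 2                     # squared differences
    gamma = []
    for lo, hi in bins:
        m = (d > lo) & (d <= hi)           # pairs in this distance bin
        gamma.append(0.5 * sq[m].mean())   # half the average squared difference
    return np.array(gamma)

xy = np.array([[0, 0], [1, 0], [2, 0.]])   # toy point locations
z = np.array([0.0, 1.0, 3.0])              # toy measured values
g = semivariogram(xy, z, bins=[(0.5, 1.5), (1.5, 2.5)])
```

Kriging then fits a model curve (e.g. K-Bessel) to these binned values to estimate the covariance used in the weights.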

Empirical Bayesian Kriging differs from other kriging methods by taking into account a level of uncertainty or error introduced by estimating the underlying semivariogram. EBK automatically adjusts the parameters that define the semivariogram through subsetting and running multiple simulations.

EBK 3D takes into account the elevation data at each point. This allows for incorporation of the height difference between points, as well as the distance.

EBK Regression Prediction combines ordinary least squares regression with simple kriging using an explanatory raster, such as elevation, to attempt to produce more accurate results than either Kriging or regular EBK can provide. For all EBK interpolations, the K-Bessel semivariogram function is selected as it is the most robust but requires more computational power.

Radial Basis Functions include Spline functions and are exact interpolators that attempt to fit a smooth surface across known points. Their function depends solely on the distance from a center point.

Kernel interpolators use a kernel function to determine how the weights are influenced by distance. The exponential kernel function is shown below, where r is the radius centered at a point and h is the bandwidth:

Equation 3. The exponential kernel function, where r is the radius centered at a point and h is the bandwidth.

The 5th-order polynomial kernel function is shown below:

Equation 4. The 5th-order polynomial kernel function, where r is the radius centered at a point and h is the bandwidth.

The following interpolation methods and their parameters were run in Geostatistical Wizard to determine if accuracy from the Exploratory Interpolation tool could be improved:

EBK

EBK 3D, with point elevation data.

EBK Regression Prediction, with elevation raster.

IDW

Radial Basis Functions - Tension Spline

Kernel Interpolation - Exponential

Kernel Interpolation - 5th order polynomial

DEM tiles from The National Map were downloaded using the uGet download manager. These tiles were then combined using the Mosaic to New Raster tool, and the new raster was clipped to the NY State Boundary shapefile.

All data was projected to the UTM Zone 18N projection using the Project tool, prior to any analysis.

RESULTS

Figure 5. Exploratory Interpolation methods ranked by Root Mean Square Error (RMSE), with Mean Error shown.

The bar chart above, in Figure 5, shows the top-ranked interpolation methods from the Exploratory Interpolation tool. These methods were compared using a single criterion: highest prediction accuracy. The highest-ranking method is optimized Simple Kriging, with an RMSE of approximately 0.2995.

The following are the prediction surface maps produced through the selected interpolation methods in Geostatistical Wizard (GW). The accompanying scatterplots show the measured versus predicted values from the cross-validation section of GW. The grey line represents a perfect 1:1 fit between measured and predicted values (y = x). The blue line shows the model's best-fit line through the scatterplot points, obtained by regressing the predicted values against the measured values.

The bar chart in Figure 6 shows the output table of the Compare Geostatistical Layers tool with single-criterion weighting: highest prediction accuracy, as determined by RMSE, whose equation is shown below.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{z}_i - z_i \right)^2}$$

Figure 6. Bar chart of output from the Compare Geostatistical Layers tool with highest prediction accuracy single-criterion comparison.
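The RMSE criterion itself is straightforward to compute; a toy sketch with hypothetical measured and cross-validation-predicted values:

```python
import numpy as np

# RMSE between cross-validation predictions and measured values, the
# single criterion used to rank the layers (toy numbers for illustration).
measured = np.array([0.0, 1.0, 2.0, 4.0])
predicted = np.array([0.5, 1.0, 1.5, 5.0])
rmse = np.sqrt(np.mean((predicted - measured) ** 2))
```

Because the residuals are squared before averaging, the single 1.0 mm miss contributes four times as much as each 0.5 mm miss, illustrating the outlier sensitivity discussed below.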

Discussion & Future Work

Figure 6 shows that the lowest RMSE among the surface prediction maps created by GW belongs to Co-Kriging, at approximately 0.2905. This is lower than the lowest RMSE from the Exploratory Interpolation tool, though the difference is so slight that it may not be statistically significant. Likewise, the small differences between the layers in general may not necessarily indicate a better-performing model. It is also important to note that RMSE gives more weight to larger errors, as the residuals are squared, making it particularly sensitive to outliers; it may therefore penalize certain models based on their intrinsic properties.

To account for this limitation in comparing layers through a single criterion, RMSE, the Compare Geostatistical Layers tool is run again with a weighted average comparison method. The highest prediction accuracy, lowest bias, and lowest worst-case error were all weighted equally. The resulting table is shown below:

Table 3. Compare Geostatistical Layers Weighted-Average Rank Comparison
Layer Highest Accuracy Rank Lowest Bias Rank Lowest Worst-Case Error Rank Weighted Average Rank
EBK Regression Prediction 3 1 2 2
Empirical Bayesian Kriging 4 2 4 3.333333
CoKriging 1 9 1 3.666667
Inverse Distance Weighting 5 6 3 4.666667
Radial Basis Functions 6 4 5 5
Exp Interpolation - Highest Rank 2 5 9 5.333333
Empirical Bayesian Kriging 3D 7 3 8 6
Kernel Interpolation Exp 8 7 6 7
Kernel Interpolation 5th order 9 8 7 8
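With equal weights on the three criteria, the weighted average rank in the table reduces to the mean of the per-criterion ranks; a small sketch reproducing the top rows:

```python
import numpy as np

# (accuracy rank, bias rank, worst-case error rank) from the table above.
ranks = {
    "EBK Regression Prediction": (3, 1, 2),
    "Empirical Bayesian Kriging": (4, 2, 4),
    "CoKriging": (1, 9, 1),
}
weights = np.array([1, 1, 1]) / 3.0   # equal weighting of the three criteria
avg = {name: float(np.dot(weights, r)) for name, r in ranks.items()}
```

This reproduces the table's weighted averages (2, 3.333333, 3.666667) and shows why CoKriging's poor bias rank drops it below the two EBK variants despite its best accuracy rank.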

In this case, the top three ranked layers, from highest to lowest, are: EBK Regression Prediction, regular Empirical Bayesian Kriging, and CoKriging. Literature and online sources reviewed at the start of this research pointed to EBK Regression Prediction as best equipped for interpolating weather station data, as weather variables are highly dependent on elevation. EBK Regression Prediction is able to capture both large-scale trends, via regression, and local spatial patterns, via kriging, which explains why it ranked the highest.

EBK Regression Prediction is also able to capture a greater level of uncertainty, which is important given the distribution of precipitation values. In New York State, precipitation occurs relatively infrequently, resulting in an extremely right-skewed distribution of precipitation values. Deep learning benefits from a level of normality across data samples, and incorporating the elevation raster may lead to more consistent prediction maps from EBK Regression Prediction than from Co-Kriging. Because of its overall robustness, EBK Regression Prediction is chosen as the method for interpolating NYS Mesonet weather station precipitation values into the grid/raster format used as the ground-truth ("y") variable in the deep learning model.

Further research on interpolation may involve interpolating values through time as well as space. Although little literature is available on this approach, it seems logical that interpolation models, particularly those that take autocorrelation into account, would improve by fitting across multiple hours of data and iteratively refining. The Space Time Pattern Mining toolbox will be assessed for its applicability to this scenario.

Acknowledgements & References

Acknowledgements

The author would like to thank Professor Sun for his continued guidance throughout the course and input on this project.

This research is made possible by the New York State (NYS) Mesonet. Original funding for the NYS Mesonet (NYSM) buildup was provided by Federal Emergency Management Agency grant FEMA-4085-DR-NY. The continued operation and maintenance of the NYSM is supported by the National Mesonet Program, University at Albany, Federal and private grants, and others.

This research project was supported by NSF REU Grant AGS-2150432, NSF IUSE Grant ICER-2023174, and the Pinkerton Foundation (HIRES). The author would like to express gratitude to faculty advisor Dr. Blake and mentor Yanna Chen of the University at Albany for their guidance. The statements contained within this work are not the opinions of the funding agencies or the U.S. government but reflect the author's opinions.

References

Arakawa, A. (2004). The cumulus parameterization problem: Past, present, and future. Journal of Climate, 17, 2493–2525.

Badrinath, A., Delle Monache, L., Hayatbini, N., Chapman, W., Cannon, F., & Ralph, M. (2023). Improving precipitation forecasts with convolutional neural networks. Weather and Forecasting, 38(2), 291-306. https://doi.org/10.1175/WAF-D-22-0002.1

Brotzge, J. A., & Coauthors. (2020). A technical overview of the New York State Mesonet Standard Network. Journal of Atmospheric and Oceanic Technology, 37, 1827–1845. https://doi.org/10.1175/JTECH-D-19-0220.1

Esri. (2024). How Kriging works (3.3). ArcGIS Pro. https://pro.arcgis.com/en/pro-app/3.3/tool-reference/3d-analyst/how-kriging-works.htm

Esri. (2024). Performing cross-validation and validation (latest). ArcGIS Pro. https://pro.arcgis.com/en/pro-app/latest/help/analysis/geostatistical-analyst/performing-cross-validation-and-validation.htm

Lockwood, J. W., Loridan, T., Lin, N., Oppenheimer, M., & Hannah, N. (2024). A machine learning approach to model over-ocean tropical cyclone precipitation. Journal of Hydrometeorology, 25, 207–221. https://doi.org/10.1175/JHM-D-23-0065.1

Lukas, J. (2020). Figure 1: Schematic showing typical forecast skill relative to forecast range (or lead time), and main sources of predictability, for three types of weather and climate forecasts. In J. Lukas & K. Payton (Eds.), Forecast skill and predictability (modified from E. Gawthrop & T. Barnston). International Research Institute for Climate and Society. Retrieved from https://coloradoriverscience.org/Weather_and_climate_forecasts

National Oceanic and Atmospheric Administration. (2023, May 18). Upper air charts: Weather models. NOAA. Retrieved August 2, 2024, from https://www.noaa.gov/jetstream/upper-air-charts/weather-models

National Oceanic and Atmospheric Administration. (n.d.). Forecast reliability. SciJinks. Retrieved August 2, 2024, from https://scijinks.gov/forecast-reliability/

New York State GIS Clearinghouse. Civil boundaries. Retrieved December 22, 2024, from https://gis.ny.gov/civil-boundaries

Shuman, F. G. (1989). History of numerical weather prediction at the National Meteorological Center. Weather and Forecasting, 4, 286–296. https://doi.org/10.1175/1520-0434(1989)004<0286>2.0.CO;2

U.S. Geological Survey. (2023). The National Map: 1/3 arc-second digital elevation model (DEM) [Data set]. Retrieved December 17, 2024, from https://www.usgs.gov/the-national-map

Wochit. (2018, March 2). Weather bomb cyclone: What it is and how it affects you. CNN. Retrieved August 2, 2024, from https://www.cnn.com/2018/03/02/us/weather-bomb-cyclone/index.html