Regression coefficient estimation from remote sensing maps

Kerri Lu, Dan M. Kluger, Stephen Bates, Sherrie Wang

arXiv:2407.13659 · stat.AP · Published 2024-07-18 · Updated 2025-07-03

Regressions are commonly used in environmental science and economics to identify causal or associative relationships between variables. In these settings, remote sensing-derived map products increasingly serve as sources of variables, enabling estimation of effects such as the impact of conservation zones on deforestation. However, the quality of map products varies, and their errors are difficult to characterize because maps are outputs of complex machine learning algorithms that take a variety of remotely sensed variables as inputs. Thus, population-level estimators from such maps may be biased. In this paper, we apply prediction-powered inference (PPI) to estimate regression coefficients relating a response variable to covariates. PPI is a method that estimates parameters of interest by using a small amount of randomly sampled ground truth data to correct for bias in large-scale remote sensing map products. Applying PPI across multiple remote sensing use cases in regression coefficient estimation, we find that it yields estimates that (1) are more reliable than those obtained by treating the map product as if it were 100% accurate and (2) have lower uncertainty than those obtained using only the ground truth sample data and ignoring the map product. Empirically, we observe effective sample size increases of up to 17-fold using PPI compared to only using ground truth data. This is the first work to estimate remote sensing regression coefficients without assumptions on the structure of map product errors. Data and code are available at https://github.com/Earth-Intelligence-Lab/uncertainty-quantification.
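The bias-correction idea behind PPI can be illustrated with a minimal sketch on simulated data. This is not the authors' code. It assumes the simplest setting, in which only the response comes from a map product and the covariates are observed exactly, and it uses the basic PPI point estimate for ordinary least squares: the coefficients fitted on the large map-only sample minus a "rectifier" fitted on the small ground truth sample. The data-generating model, the sample sizes, and the names `simulate`, `ols`, and `rectifier` are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(size):
    """Hypothetical data: true model y = 1 + 2x + noise; the map product's
    error is correlated with x, so naive regression on the map is biased."""
    x = rng.normal(size=size)
    y_true = 1.0 + 2.0 * x + rng.normal(size=size)
    y_map = y_true + 0.8 * x + rng.normal(scale=0.3, size=size)
    X = np.column_stack([np.ones(size), x])  # intercept + covariate
    return X, y_true, y_map

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Large sample with map values only; small random sample with ground truth.
X_big, _, y_map_big = simulate(20_000)
X_lab, y_true_lab, y_map_lab = simulate(500)

# Naive estimate: treat the map product as if it were fully accurate.
theta_naive = ols(X_big, y_map_big)

# Rectifier: regression of the map error on the covariates, estimated
# from the ground truth sample (OLS is linear in the response).
rectifier = ols(X_lab, y_map_lab - y_true_lab)

# PPI point estimate: map-based estimate minus the estimated bias.
theta_ppi = theta_naive - rectifier

# Classical estimate: ground truth sample only, ignoring the map.
theta_classical = ols(X_lab, y_true_lab)

print("naive     :", theta_naive)
print("PPI       :", theta_ppi)
print("classical :", theta_classical)
```

In this simulation the naive slope absorbs the map's covariate-correlated error, while the corrected estimate recovers a slope close to the true value of 2. The sketch covers only the point estimate; confidence intervals and map-derived covariates require the fuller treatment described in the paper.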

Topics: Climate, Weather & Geophysics

Tags: remote-sensing

arXiv categories: stat.AP, econ.GN, eess.SP
