Satellite Greenspace Mapping and Scale Effects on Environmental Exposure Assessment


Available online 4 November 2023

from greenspace and population mapping

Shengbiao Wu^a,1^, Wenbo Yu^a,1^, Jiafu An^b^, Chen Lin^c^, Bin Chen^a,d,e,*^

^a^ Future Urbanity & Sustainable Environment (FUSE) Lab, Division of Landscape Architecture, Department of Architecture, Faculty of Architecture, The University of Hong Kong, Hong Kong Special Administrative Region of China

^b^ Department of Finance and Insurance, Faculty of Business, Lingnan University, Hong Kong Special Administrative Region of China

^c^ Faculty of Business and Economics, The University of Hong Kong, Hong Kong Special Administrative Region of China

^d^ Urban Systems Institute, The University of Hong Kong, Hong Kong Special Administrative Region of China

^e^ Musketeers Foundation Institute of Data Science, The University of Hong Kong, Hong Kong Special Administrative Region of China

ARTICLE INFO

Keywords: Satellite greenspace mapping; Population-weighted exposure; Greenspace exposure inequality; Landscape configuration; Scale effect

ABSTRACT

Satellite observations are increasingly used to characterize greenspace coverage, exposure, and equality assessment for environmental and health studies. Given the difference in spatial resolutions (namely, scale effect), different satellite datasets capture distinct levels of landscape details in urban green environments.

However, existing studies on measuring scale effects are limited to the greenness mapping in a few sampled cities regardless of the scale effects from population mapping and the associated controls from greenspace landscape configurations. To close this knowledge gap, we conducted a comprehensive inventory of the scale effects, using widely used satellite-based greenness (i.e., 10-m Sentinel-2, 30-m Landsat-8, and 500-m MODIS) and population (i.e., 30-m HRPD, 100-m WorldPop, and 1-km GPW) mapping datasets over 679 major cities (urban area >50 km²) in the United States. Results show that (1) compared with high-resolution Sentinel-2, Landsat-8 and MODIS overestimate greenspace coverage and human exposure but underestimate the inequality of human exposure to greenspace; (2) the differences in greenspace coverage and exposure across satellite sensors are linearly correlated with the greenspace provision magnitude; (3) landscape configuration explains the greenspace coverage differences across different satellite sensors. Aggregated and fragmented landscape metrics correlate positively and negatively with greenspace coverage differences, respectively; and (4) the spatial resolution of greenspace mapping shows a decreasing control while population data has tiny impacts on the inequality measurement of human exposure to greenspace. These findings answer how varying-scale satellite datasets cause a discrepancy in the measurement of greenspace coverage, human exposure, and inequality assessment. We advocate that researchers should select appropriate satellite-based greenness datasets by accounting for trade-offs between specific research benefits and costs to better position future greenspace-related environment and health outcome studies.

1. Introduction

Urban greenspace, including open, undeveloped land with natural vegetation such as parks, street trees, and urban forests, is a basic component of green infrastructure in cities and provides multiple biophysical, aesthetic, and socioeconomic benefits. On the one hand, greenspace in urban areas offers vital ecosystem services to the natural environment through different pathways, including improving air quality (García et al., 2019; James et al., 2015), reducing noise pollution (Dadvand et al., 2015; Vivanco-Hidalgo et al., 2019), and protecting against heatwaves and extreme weather events (Arshad et al., 2020; Khalaim et al., 2021). On the other hand, exposure to greenspace benefits residents' physical and mental health (García et al., 2021) by promoting physical activity (Pereira et al., 2013; Persson et al., 2018), increasing recreational opportunities (Boulton et al., 2018; Liu et al., 2020), reducing stress (Ribeiro et al., 2021; Sugiyama et al., 2016), and stimulating social cohesion (Kruize et al., 2019; Mouratidis and Poortinga, 2020). Given these recognized benefits, the United Nations has specified the need to provide universal access to green space for urban residents in the 11th Sustainable Development Goal on Sustainable Cities and Communities (Colglazier, 2015).

* Corresponding author at: Future Urbanity & Sustainable Environment (FUSE) Lab, Division of Landscape Architecture, Department of Architecture, Faculty of Architecture, The University of Hong Kong, Hong Kong Special Administrative Region of China.

E-mail address: [email protected] (B. Chen).

¹ These authors contributed equally to this work.

https://doi.org/10.1016/j.ufug.2023.128136

Received 15 June 2023; Received in revised form 30 September 2023; Accepted 1 November 2023

Accordingly, many quantitative studies have been conducted to capture greenspace supply and demand and to measure availability, accessibility, and visibility (Chen et al., 2022a; Huang and Xu, 2022), with the aim of gauging progress toward the sustainable development goals. Here we group existing studies into three major categories that characterize urban greenspace in terms of physical coverage, human exposure, and exposure equality. The first category quantifies physical greenspace coverage by capturing the distribution and amount of urban greenspace. Common practices have adopted many quantitative metrics (e.g., vegetation cover, canopy cover, park area, and tree density) to measure different forms of greenspace provision (Bauwelinck et al., 2020; Mueller et al., 2021; Zhang et al., 2021). Nevertheless, as rapid urbanization significantly alters the distribution and pattern of urban greenspace, people's access to green infrastructure is changing over time, which has prompted research interest in measuring how much residents actually experience urban greenspace (i.e., greenspace exposure). Greenspace exposure describes the probability that people are exposed to green environments, and is usually measured by subjective (e.g., self-reported frequency or duration of greenspace usage) or objective (e.g., quantity-based availability, distance-based accessibility, quality-based attractiveness, and streetscape-based visibility) metrics (Chen et al., 2022a; Song et al., 2021; Zhang et al., 2022). It can be regarded as an indicator of both human-greenspace interaction and the supply-demand relationship (Chen et al., 2022a). Furthermore, increasing attention has been paid to the equality of human exposure to greenspace, which investigates whether different sociodemographic groups have an equal chance to access greenspace resources (Gradinaru et al., 2023; Nghiem et al., 2021). Economic inequality metrics such as the Gini index are used to evaluate the dispersion of greenspace settings and the inequality of greenspace accessibility (Chen et al., 2022a; Nghiem et al., 2021; Song et al., 2021).

By virtue of large-scale coverage, near-real-time surveillance, and structured data, remote sensing, especially satellite observation, has become the mainstream approach for mapping and monitoring the Earth's terrestrial surface. Regarding the estimation metrics, the Normalized Difference Vegetation Index (NDVI), which captures the chlorophyll content of vegetation, is a well-known satellite-based measure of physical greenspace coverage (Cunha et al., 2021; Helbich et al., 2021). The percentage of tree or canopy cover derived from land cover classification data is an alternative, but its accuracy varies across classification products (Li et al., 2015; Zhou et al., 2022). Additionally, the vegetation fraction derived from spectral unmixing is increasingly used to measure greenspace coverage because it can exploit sub-pixel information (Chen et al., 2022a; Yin and Yang, 2017).

Multi-scale remote sensing imagery is used for greenspace mapping. Satellite data with medium or coarse spatial resolutions, like the Moderate Resolution Imaging Spectroradiometer (MODIS) with a 500-m resolution and Landsat with a 30-m resolution, record long-term surface changes over decades and have been widely used for greenspace mapping and analysis from sampled cities to the globe (Wellmann et al., 2020; Zhang et al., 2022). In recent years, high-resolution data, such as National Agriculture Imagery Program (NAIP) data with a 1-m resolution and Système Probatoire d'Observation de la Terre (SPOT-5) data with a 5-m resolution, have become popular in greenspace inventories because they can capture details of small-patch greenspace that are invisible in coarser-resolution data (Huang and Xu, 2020; Zhou et al., 2018). However, this type of data has limited public availability over large areas, and researchers thus have to resort to medium- or coarse-resolution remotely sensed imagery for cross-city or global-scale greenspace mapping (Czekajlo et al., 2020; Helbich, 2019). Furthermore, although medium-resolution satellites cannot capture explicit surface details, they have a broader view that captures a higher level of landscape configuration, which might compensate for the loss of greenspace detail. The net impacts of satellite resolution on greenspace mapping are attracting increasing attention in the urban research community.

Some previous efforts investigated the scale effect of satellite imagery on greenspace coverage, exposure, and related health benefits. Reid et al. (2018) revealed that the greenspace exposure derived from different satellite resolutions is similarly correlated with health outcomes. In contrast, using the lacunarity analysis approach based on fractal theory, Labib et al. (2020) showed that three common satellite greenspace metrics (i.e., NDVI, leaf area index, and land use/land cover) are scale sensitive and have varying association levels with human health in the Greater Manchester Metropolitan County. Zhou et al. (2018) compared nine major cities in China and concluded that only fine spatial resolution satellite imagery could observe highly fragmented and heterogeneous greenspace within the city. Jimenez et al. (2022) recently found that greenspace coverage and exposure level are overestimated in the Greater Boston Area when using coarser-resolution satellite NDVI. These explorations enhance our understanding of satellite-derived greenspace exposure estimation for an individual city or a few sampled cities, while leaving the direction, magnitude, and drivers of greenspace differences across cities largely unknown. Besides, the spatial distribution of population is another important component when measuring human exposure to greenspace and the associated inequality (Chen et al., 2022a, 2022b; Song et al., 2021). The spatial resolution of current population datasets ranges from meters to kilometers, such as the 30-m resolution Facebook High-Resolution Population Density (HRPD) map and the 1-km resolution Gridded Population of the World (GPW) data (CIESIN et al., 2018). Very few studies have compared these population datasets, so the uncertainty they introduce into greenspace exposure and inequality assessment remains unclear. Additionally, the spatial configuration of greenspace, in terms of greenspace provision and greenspace arrangement, has been reported to affect the level of human exposure to greenspace (Chen et al., 2022a; Li et al., 2013), but its explicit controls across different spatial resolutions remain unclear.

Fig. 1. Flowchart of the study design, which includes three major steps: (a) generating greenspace maps from different satellite data, (b) assessing multi-scale greenspace coverage estimations, and (c) exploring multi-scale greenspace and population mapping impacts on greenspace exposure and inequality assessment.

Set against the above background, this study quantifies the uncertainty of multi-scale satellite-derived greenspace and gridded population datasets in estimating greenspace exposure and equality across cities. We aim to address two major research questions: (1) How large are the differences in greenspace coverage measured by satellite datasets with different spatial resolutions, and what role does landscape configuration play in controlling these differences? (2) How large are the differences in greenspace exposure and inequality assessment when using multi-scale greenspace and population mappings, in terms of both individual and combined scale effects?

2. Materials and methodology

We investigated the differences in greenspace coverage, exposure, and inequality from multi-scale greenspace and population mappings (Fig. 1). To this end, we first generated fractional greenspace coverage from the maximum composited maps of NDVI and surface spectral reflectance (i.e., blue, green, red, and near infrared) from multi-resolution satellite datasets. We then assessed the city-level difference among multi-scale greenspace assessments and explored the relationship between greenspace difference and landscape configuration metrics. We finally investigated the multi-scale greenspace and population impacts on greenspace exposure and inequality assessment.

2.1. Study area

We chose 679 major cities in the United States as the study area. The target cities were extracted from the latest Global Urban Boundaries (GUB) dataset for 2018 (Li et al., 2020) by selecting urban areas greater than 50 km² (Fig. 2).

2.2. Datasets

We used three types of satellite-based remote sensing data from 2020 to obtain multi-scale greenspace coverage maps: Sentinel-2, Landsat-8, and MODIS surface reflectance datasets with spatial resolutions of 10 m, 30 m, and 500 m, respectively. Sentinel-2 is increasingly used because of its sun-synchronous orbit, high resolution, 5-day revisit under the same viewing angle, and multi-spectral imaging with 13 bands in the visible, near-infrared, and shortwave infrared parts of the spectrum (Drusch et al., 2012). Landsat is the longest-running mission of the three, with a sun-synchronous orbit, moderate resolution, and a 16-day revisit cycle; Landsat-8 provides reflectance data in nine spectral bands, including eight 30-m resolution spectral bands and one 15-m resolution panchromatic band (Wulder et al., 2019). MODIS provides Earth observations every 1–2 days with a 2330-km-wide viewing swath in 36 discrete spectral bands; despite its coarse spatial resolution, it helps tackle long-term, large-scale global dynamics problems (Justice et al., 2002). These three surface reflectance products are generated from top-of-atmosphere (TOA) radiance after radiometric calibration and atmospheric correction, and have been verified against ground measurements with high accuracy (Louis et al., 2019; Vermote et al., 2016; Wang et al., 2010).

Fig. 2. The study area includes 679 major cities with an urban area greater than 50 km² in the contiguous United States. The state border layer is extracted from the United States Census TIGER/Line Shapefiles in 2012.

Three gridded population distribution datasets were used to explore the scale effect of population mapping. Facebook High-Resolution Population Density (HRPD) maps offer estimates of population density distribution derived from satellite imagery at a resolution of 30 m (https://dataforgood.facebook.com). WorldPop (www.worldpop.org) provides the estimated number of people residing in each 100 m × 100 m grid cell based on a random forest model and a global database of administrative unit-based census data (Stevens et al., 2015). The Gridded Population of the World (GPW) collection models the distribution of human population in terms of counts and densities on a continuous global raster surface with a resolution of 1000 m (CIESIN, 2018). We used the 2020 data from all three population datasets.

2.3. Generation of greenspace coverage maps

Accurate greenspace coverage mapping is the first step in modeling human-greenspace interactions. We used an annual maximum greenspace compositing approach to map greenspace coverage, which has two advantages over the single-date greenspace baselines used in previous efforts (Jimenez et al., 2022; Labib et al., 2020). First, this approach can minimize the impact of cloud cover on greenspace estimation, especially for cities in or near the tropics that suffer from frequent cloudy weather (Li et al., 2022). Second, this approach can reduce the shadow effects that the 3-D structure of vegetation and buildings introduces into greenspace mapping from high-resolution satellite data. We used 10-m-resolution Sentinel-2 (https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR), 30-m-resolution Landsat-8 (https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_L2), and 500-m-resolution MODIS reflectance data (https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MOD09GA) from summer 2020 (i.e., June to September) to generate the multi-scale greenspace maps on the Google Earth Engine (GEE) platform (Gorelick et al., 2017). To preprocess the images, we performed cloud/shadow and water body screening. Specifically, we first conducted a pixel-based quality check for each satellite image to remove poor-quality data contaminated by cloud, cloud shadow, dilated cloud, and cirrus, using the default cloud mask and quality assessment band information. To eliminate the impacts of non-land covers, we removed water bodies (e.g., rivers and lakes) within city areas using the 90-m rasterized global surface water data generated from the OpenStreetMap data (Yamazaki et al., 2019).

Fig. 3. Spectral signatures of (a) vegetation, (b) bare soil, and (c) water body endmembers used in this study for the linear spectral unmixing (LSU) model. The normalized difference vegetation index (NDVI), normalized difference water index (NDWI), and the spectral reflectance of four bands (blue, green, red, and near infrared (NIR)) are used as spectral features.

Table 1

Landscape metrics for characterizing greenspace landscape configuration in this study.

| Landscape metric | Definition | Formula | Description |
|---|---|---|---|
| Aggregation index (AI) | Degree of patch aggregation | $AI = \dfrac{g_{ii}}{\max \to g_{ii}}$ | $g_{ii}$ = number of like adjacencies between pixels of patch type $i$ based on the double count method. $\max \to g_{ii}$ = maximum number of like adjacencies between pixels of patch type $i$ based on the single-count method. |
| Patch cohesion index (COHESION) | Physical connectedness of patch type in landscape | $COHESION = \left[1 - \dfrac{\sum_{i=1}^{m}\sum_{j=1}^{n} p_{ij}}{\sum_{i=1}^{m}\sum_{j=1}^{n} p_{ij}\sqrt{a_{ij}}}\right]\left[1 - \dfrac{1}{\sqrt{A}}\right]^{-1}$ | $m$ = number of patch types present in the landscape. $n$ = total number of patches in the landscape of patch type $i$. $p_{ij}$ = perimeter of patch $ij$ in terms of number of cell surfaces. $a_{ij}$ = area (m²) of patch $ij$. $A$ = total number of cells in the landscape. |
| Largest patch index (LPI) | Percentage of landscape comprised by the largest patch | $LPI = \dfrac{\max_{j=1}^{n}(a_{ij})}{A}$ | $a_{ij}$ = area (m²) of patch $ij$. |
| Mean perimeter-area ratio (MPAR) | Ratio of patch perimeter length to total landscape area | $PA = \dfrac{\sum_{j=1}^{n} p_{ij}/a_{ij}}{n_i}$ | $p_{ij}$ = perimeter of patch $ij$ in terms of number of cell surfaces. $a_{ij}$ = area (m²) of patch $ij$. $n_i$ = total number of patches in the landscape of patch type $i$. |
| Edge density (ED) | Ratio of edge lengths for all segments to total landscape area | $ED = \dfrac{E}{A}$ | $E$ = total length of edge in landscape in terms of number of cell surfaces; includes landscape boundary and background edge segments representing true edge. |
| Landscape shape index (LSI) | Shape complexity of the whole landscape | $LSI = \dfrac{E'}{2\sqrt{\pi A}}$ | $E'$ = total length of edge in landscape in terms of number of cell surfaces; includes entire landscape boundary and background edge segments regardless of whether they represent true edge. |
| Number of patches (NP) | Number of patches in the landscape | $NP = n_i$ | $n_i$ = total number of patches in the landscape of patch type $i$. |
| Patch density (PD) | Ratio of patch number to total landscape area | $PD = \dfrac{n_i}{A}$ | $n_i$ = total number of patches in the landscape of patch type $i$. |

For each satellite dataset, we calculated NDVI as $\mathrm{NDVI} = (R_{NIR} - R_{Red})/(R_{NIR} + R_{Red})$, where $R_{NIR}$ and $R_{Red}$ are the near-infrared and red reflectance, and generated the greenspace NDVI maps by compositing the maximum values of the NDVI time series from June to September. We also recorded the corresponding surface reflectance of the blue, green, red, and near-infrared spectral bands, and the normalized difference water index (NDWI), as spectral features. We produced the greenspace coverage maps separately for each satellite image collection using the linear spectral unmixing (LSU) model (Keshava, 2003). The LSU model, as a subpixel analytical method, can tackle mixed-pixel issues by decomposing the spectral signature of the target pixel into a set of pure endmembers and their corresponding fractional abundances. Assuming that the spectral feature of each pixel can be calculated as a linear combination of a group of spectrally pure constituents, the LSU model can be formulated as Eq. (1).

$$R_i = \sum_{k=1}^{n} f_{ik} \cdot C_{ik} + \varepsilon_i, \quad \text{subject to } \sum_{k=1}^{n} f_{ik} = 1,\ 0 \le f_{ik} \le 1 \tag{1}$$

where $R_i$ denotes the spectral signals (i.e., NDVI, NDWI, blue, green, red, and near-infrared reflectance) of the $i$th pixel, $f_{ik}$ represents the fraction of endmember $k$ in the $i$th pixel, $C_{ik}$ is the spectral signal of endmember $k$ in the $i$th pixel, $\varepsilon_i$ is the residual term of the $i$th pixel, and $n$ denotes the total number of endmembers.
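The NDVI calculation and maximum-value compositing described earlier in this subsection can be illustrated with a minimal sketch. It is not the authors' GEE workflow: the function names `ndvi` and `max_ndvi_composite`, and the data layout (each scene as a list of per-pixel `(NIR, Red)` pairs, with `None` for pixels masked by cloud or water screening), are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Observation = Optional[Tuple[float, float]]  # (NIR, Red) reflectance, or None if masked


def ndvi(nir: float, red: float) -> Optional[float]:
    """NDVI = (NIR - Red) / (NIR + Red); returns None when the denominator is zero."""
    denom = nir + red
    if denom == 0:
        return None
    return (nir - red) / denom


def max_ndvi_composite(scenes: Sequence[Sequence[Observation]]) -> List[Optional[float]]:
    """Per-pixel maximum NDVI over a time series of scenes.

    Masked observations (None) are skipped, so a pixel stays None only if
    it is masked in every scene. The layout is a hypothetical simplification.
    """
    n_pixels = len(scenes[0])
    composite: List[Optional[float]] = [None] * n_pixels
    for scene in scenes:
        for p, obs in enumerate(scene):
            if obs is None:
                continue
            value = ndvi(obs[0], obs[1])
            if value is None:
                continue
            if composite[p] is None or value > composite[p]:
                composite[p] = value
    return composite
```

Because a cloud-contaminated observation typically has a depressed NDVI, taking the per-pixel maximum tends to select the clearest, greenest date, which is the rationale the text gives for compositing.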

Fig. 4. Histograms of (a) Sentinel-2 derived greenspace coverage, (b) greenspace coverage difference of Landsat-8 over Sentinel-2, and (c) greenspace coverage difference of MODIS over Sentinel-2. L8 and S2 denote Landsat-8 and Sentinel-2, respectively.

Fig. 5. Two city examples of multi-scale satellite-derived greenspace coverages using the linear spectral unmixing (LSU) approach, including (a, e) 1-m National Agriculture Imagery Program (NAIP) image, (b, f) 10-m Sentinel-2, (c, g) 30-m Landsat-8, and (d, h) 500-m MODIS derived greenspace coverages. (a-d) are the results of Conway, Arkansas. (e-h) are the results of Victorville, California. The city-level greenspace coverage is shown in the sub-plot title. NAIP data is available from the Google Earth Engine platform. The urban boundary is extracted from the Global Urban Boundaries (GUB) datasets in 2018.

We selected vegetation, bare soil, and water body as three endmembers (n = 3) and adopted NDVI, NDWI, and the reflectance of four spectral bands (blue, green, red, and near-infrared) as endmember features. We picked pure pixels of each endmember by drawing regions of interest (ROIs) from the satellite image collections with distinct criteria: (1) the vegetation endmember should have an NDVI value larger than 0.8; (2) the bare soil endmember should have an NDVI value less than 0.2; and (3) the water body endmember should have an NDWI value larger than 0. We then collected the mean spectral signatures of the pure endmember pixels, which have very similar patterns across satellites (Fig. 3). With these spectra, we finally solved Eq. (1) and calculated the fraction of the vegetation endmember as sub-pixel greenspace coverage.
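One way to solve Eq. (1) for a single pixel is sketched below. The text does not say which solver was used, so projected gradient descent onto the probability simplex is a swapped-in technique chosen here because it enforces both constraints (fractions non-negative and summing to one) with only the standard library. The function names and any endmember values passed in are hypothetical.

```python
from typing import List, Sequence


def project_to_simplex(v: Sequence[float]) -> List[float]:
    """Euclidean projection of v onto {f : f_k >= 0, sum_k f_k = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate  # keep the threshold of the last index that passes
    return [max(x - theta, 0.0) for x in v]


def unmix_pixel(
    pixel: Sequence[float],
    endmembers: Sequence[Sequence[float]],
    iterations: int = 10000,
) -> List[float]:
    """Estimate endmember fractions f_k for one pixel under the Eq. (1) constraints.

    pixel: spectral features R_i (e.g., NDVI, NDWI, blue, green, red, NIR).
    endmembers: one spectrum C_k per endmember, same length as pixel.
    Minimizes 0.5 * ||sum_k f_k C_k - R_i||^2 by projected gradient descent
    (an illustrative solver, not necessarily the one used in the study).
    """
    n = len(endmembers)
    # 1 / (sum of squared entries) is a safe step size: it bounds the
    # reciprocal of the largest eigenvalue of the Gram matrix from below.
    step = 1.0 / sum(c * c for spectrum in endmembers for c in spectrum)
    fractions = [1.0 / n] * n
    for _ in range(iterations):
        modelled = [
            sum(f * spectrum[b] for f, spectrum in zip(fractions, endmembers))
            for b in range(len(pixel))
        ]
        residual = [m - r for m, r in zip(modelled, pixel)]
        gradient = [
            sum(res * spectrum[b] for b, res in enumerate(residual))
            for spectrum in endmembers
        ]
        fractions = project_to_simplex(
            [f - step * g for f, g in zip(fractions, gradient)]
        )
    return fractions
```

With the endmembers ordered as vegetation, bare soil, and water, the first returned fraction plays the role of the sub-pixel greenspace coverage. In practice a vectorized or closed-form constrained least-squares solver would be far faster over millions of pixels.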

2.4. Assessment of multi-scale greenspace coverage difference

To investigate the impact of satellite imagery with different spatial resolutions on city-level greenspace coverage assessment, we quantified the difference among multi-scale greenspace estimation maps. We regarded the finest-resolution Sentinel-2 greenspace coverage map as the baseline, and measured the greenspace difference by subtracting the Sentinel-2 greenspace map from the greenspace maps generated from the other two satellites (i.e., Landsat and MODIS). To further investigate the relationship between greenspace difference and the baseline greenspace, we adopted the bin-based analysis approach to enhance data stability and reduce overfitting (Cao et al., 2020). Specifically, we divided the baseline greenspace into 10 equal intervals and calculated the median values of greenspace difference and baseline greenspace within each interval as the dependent and independent variables, respectively. With these variable pairs, we conducted a linear regression analysis to identify the factors controlling the multi-scale greenspace coverage difference.
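The bin-based regression above can be sketched as follows. The sketch reads "10 equal intervals" as ten equal-count (decile) bins of cities sorted by baseline coverage, which is an interpretation; the function names `binned_medians` and `ols_fit` are hypothetical.

```python
from statistics import median
from typing import List, Sequence, Tuple


def binned_medians(
    baseline: Sequence[float], difference: Sequence[float], n_bins: int = 10
) -> Tuple[List[float], List[float]]:
    """Median baseline coverage and median coverage difference per bin.

    Cities are sorted by baseline coverage and split into n_bins groups of
    (nearly) equal size; this equal-count binning is an assumption.
    """
    pairs = sorted(zip(baseline, difference))
    size = len(pairs) / n_bins
    xs: List[float] = []
    ys: List[float] = []
    for b in range(n_bins):
        chunk = pairs[int(b * size): int((b + 1) * size)]
        if not chunk:
            continue
        xs.append(median(x for x, _ in chunk))
        ys.append(median(y for _, y in chunk))
    return xs, ys


def ols_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

Regressing on ten binned medians rather than on all cities dampens the influence of individual outliers, which matches the stated aim of enhancing data stability.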

Fig. 6. Association between greenspace difference and the Sentinel-2 baseline greenspace, including correlation between Sentinel-2 greenspace coverage and absolute coverage difference of (a) Landsat-8 over Sentinel-2, (b) MODIS over Sentinel-2, and correlation between Sentinel-2 greenspace coverage and relative coverage difference of (c) Landsat-8 over Sentinel-2, (d) MODIS over Sentinel-2. The red open circles are bin-averaged coverage differences across Sentinel-2 greenspace coverage bins with a 10th percentile interval. The dashed red line represents the linear regression of the binned average values, with regression statistics shown in red font. L8 and S2 denote Landsat-8 and Sentinel-2, respectively.

To answer the question of how landscape configuration impacts the difference in greenspace coverage assessments, we used a suite of landscape metrics (Table 1) to quantify greenspace landscape configuration and explored its relationship with the greenspace coverage estimated by different remote sensing datasets. Specifically, following previous studies (Huang and Xu, 2022; Spadoni et al., 2020), we used a threshold of 0.2 to classify the NDVI map into two types, greenspace (NDVI ≥ 0.2) and non-greenspace (NDVI < 0.2), and assigned them the categorical values one and zero, respectively. Based on this categorical map, we calculated landscape metrics using the open-source 'landscapemetrics' package on the R platform (Hesselbarth et al., 2019). As before, we used the bin-based analysis approach to reduce the effects of minor observation errors.
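To make the thresholding and patch-based metrics concrete, the following simplified sketch classifies an NDVI grid at 0.2 and derives three of the Table 1 metrics (NP, PD, and LPI) in cell units. The study itself used the R 'landscapemetrics' package; this Python stand-in, its function names, and the choice of 8-neighbour patch connectivity are assumptions for illustration only.

```python
from typing import Dict, List, Sequence


def classify_greenspace(ndvi_grid: Sequence[Sequence[float]], threshold: float = 0.2) -> List[List[int]]:
    """Binary map: 1 for greenspace (NDVI >= threshold), 0 otherwise."""
    return [[1 if v >= threshold else 0 for v in row] for row in ndvi_grid]


def patch_sizes(binary: Sequence[Sequence[int]]) -> List[int]:
    """Cell counts of greenspace patches, found by flood fill.

    Uses 8-neighbour connectivity (an assumption of this sketch).
    """
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    sizes: List[int] = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            stack = [(r, c)]
            seen[r][c] = True
            size = 0
            while stack:
                cr, cc = stack.pop()
                size += 1
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if 0 <= nr < rows and 0 <= nc < cols and binary[nr][nc] == 1 and not seen[nr][nc]:
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            sizes.append(size)
    return sizes


def simple_metrics(binary: Sequence[Sequence[int]]) -> Dict[str, float]:
    """NP, PD, and LPI with areas measured in cells, following the Table 1 formulas."""
    sizes = patch_sizes(binary)
    total_cells = len(binary) * len(binary[0])
    return {
        "NP": len(sizes),
        "PD": len(sizes) / total_cells,
        "LPI": max(sizes, default=0) / total_cells,
    }
```

Edge-based metrics such as ED and LSI would additionally require counting cell surfaces along patch boundaries, which is omitted here for brevity.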

2.5. Multi-scale impacts of greenspace and population mapping on greenspace exposure and inequality assessment

2.5.1. Greenspace exposure

The greenspace exposure metric models the spatial interactions between human and greenspace. We adopted the population-weighted exposure framework (Chen et al., 2022a; 2022b) to calculate the greenspace exposure of varying buffer sizes within the city using Eq. (2) as follows.

$$GE^{b} = \frac{\sum_{i=1}^{M} P_i \times G_i^{b}}{\sum_{i=1}^{M} P_i} \tag{2}$$

where $P_i$ denotes the population of the $i$th grid, $G_i^{b}$ represents the greenspace coverage of the $i$th grid with a buffer size of $b$, and $M$ is the total number of grids within the city.
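Eq. (2) is a population-weighted mean, which can be written in a few lines. The function name `population_weighted_exposure` and the flat list layout (one entry per grid cell) are assumptions made for this sketch.

```python
from typing import Sequence


def population_weighted_exposure(
    population: Sequence[float], greenspace: Sequence[float]
) -> float:
    """Population-weighted greenspace exposure following Eq. (2).

    population[i] is P_i and greenspace[i] is the buffered coverage G_i^b
    of the same grid cell.
    """
    if len(population) != len(greenspace):
        raise ValueError("population and greenspace must have the same length")
    total = sum(population)
    if total == 0:
        raise ValueError("total population is zero")
    return sum(p * g for p, g in zip(population, greenspace)) / total
```

For example, two cells with populations 10 and 30 and coverages 0.2 and 0.6 give (10 × 0.2 + 30 × 0.6) / 40 = 0.5, whereas the unweighted mean coverage would be 0.4: cells where more people live count for more.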

2.5.2. Greenspace exposure inequality

We employed the widely used Gini index to measure the inequality of greenspace exposure, which is calculated from the Lorenz curve of cumulative greenspace exposure among individuals (Chen et al., 2022a; Song et al., 2021):

$$Gini^{b} = 1 - \frac{\sum_{i=1}^{N}\sum_{j=1}^{i-1} g_j^{b} + \sum_{i=1}^{N}\sum_{j=1}^{i} g_j^{b}}{N \times \sum_{j=1}^{N} g_j^{b}} \tag{3}$$

where $N$ is the total population within the city and $g_j^{b}$ is the magnitude of the $j$th resident's greenspace exposure at the buffer size of $b$. The Gini index ranges from 0 (absolute equality) to 1 (absolute inequality), with a lower value indicating more even greenspace exposure and vice versa.
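A direct transcription of Eq. (3) is sketched below. It assumes, as is standard for a Lorenz curve, that residents are ordered by ascending exposure before the cumulative sums are taken; the text does not state the ordering explicitly. The function name `gini_index` is hypothetical.

```python
from typing import Sequence


def gini_index(exposures: Sequence[float]) -> float:
    """Gini index of greenspace exposure following Eq. (3).

    exposures holds one value g_j per resident. Values are sorted in
    ascending order first (an assumption consistent with the Lorenz curve).
    """
    values = sorted(exposures)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one resident and a positive total exposure")
    running = 0.0
    numerator = 0.0
    for g in values:
        previous = running  # sum of g_j for j = 1 .. i-1
        running += g        # sum of g_j for j = 1 .. i
        numerator += previous + running
    return 1.0 - numerator / (n * total)
```

Four residents with identical exposure give a Gini index of 0, while exposures of 0, 0, 0, and 1 give 1 − 1/4 = 0.75. With this discrete form the maximum for N residents is 1 − 1/N, which approaches 1 as N grows.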

2.5.3. Impact from multi-scale greenspace coverage mappings on greenspace exposure and inequality assessment

Greenspace coverage maps with different spatial resolutions can influence the measurement of greenspace exposure and the associated inequality assessment. To investigate the impacts of multi-scale satellite-derived greenspace coverages, we adopted the 30-m HRPD data, the finest-resolution population dataset, to characterize the population density distribution, and assessed the differences in greenspace exposure and inequality derived from the three satellite-derived greenspace coverage maps (i.e., 10-m Sentinel-2, 30-m Landsat-8, and 500-m MODIS). We upscaled the Sentinel-2 data and downscaled the MODIS data to 30-m resolution using nearest-neighbor resampling so that population density and greenspace coverage are comparable in the greenspace-population interaction modeling. Besides, buffer size is another critical parameter in greenspace characterization (Su et al., 2019). To explore the sensitivity to buffer size, we used two buffer settings, 100 m and 500 m, to represent the nearby greenspace environment.
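The buffered coverage $G_i^{b}$ used in Eq. (2) can be read as the mean greenspace coverage of all cells within distance $b$ of a cell. The sketch below implements that reading as a circular focal mean; the exact buffering procedure is not specified in the text, so the center-to-center distance rule, the edge handling, and the function name `buffered_coverage` are assumptions.

```python
import math
from typing import List, Sequence


def buffered_coverage(
    grid: Sequence[Sequence[float]], cell_size: float, buffer_radius: float
) -> List[List[float]]:
    """Mean coverage of the cells whose centers lie within buffer_radius of each cell center.

    Cells outside the grid are ignored, so cells near the edge average over
    fewer neighbours (a simplification of this sketch).
    """
    rows, cols = len(grid), len(grid[0])
    reach = int(buffer_radius // cell_size)
    offsets = [
        (dr, dc)
        for dr in range(-reach, reach + 1)
        for dc in range(-reach, reach + 1)
        if math.hypot(dr * cell_size, dc * cell_size) <= buffer_radius
    ]
    result: List[List[float]] = []
    for r in range(rows):
        out_row: List[float] = []
        for c in range(cols):
            values = [
                grid[r + dr][c + dc]
                for dr, dc in offsets
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
            out_row.append(sum(values) / len(values))
        result.append(out_row)
    return result
```

On a 30-m grid, a 100-m buffer reaches three cells in each direction and a 500-m buffer reaches sixteen, so the larger buffer smooths local variation much more strongly.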

2.5.4. Impact of multi-scale population mappings on greenspace exposure and inequality assessment

To further investigate how population distribution affects the assessment of greenspace exposure and inequality, we fixed the greenspace coverage and compared the multi-scale population density datasets. Specifically, we used the Sentinel-2 derived greenspace coverage, which has the finest resolution in our study, as the default dataset for greenspace quantification, and compared the assessments of greenspace exposure and inequality derived from the three population datasets (i.e., 30-m HRPD, 100-m WorldPop, and 1-km GPW). To ensure consistent spatial resolutions, we downscaled HRPD, WorldPop, and GPW to 10-m resolution. Similarly, we applied the two greenspace buffer sizes described above to verify the consistency of the findings.
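Nearest-neighbor resampling between grid resolutions, as used to align the greenspace and population layers, can be sketched as below. The function name `resample_nearest` and the assumption that both grids share the same origin are illustrative; the study performed its resampling with its own tooling.

```python
from typing import List, Sequence


def resample_nearest(
    grid: Sequence[Sequence[float]], src_cell: float, dst_cell: float
) -> List[List[float]]:
    """Resample a grid from src_cell to dst_cell resolution by nearest neighbour.

    Each target cell takes the value of the source cell that contains its
    center. Works for both finer (dst_cell < src_cell) and coarser targets,
    assuming both grids share the same upper-left origin.
    """
    rows, cols = len(grid), len(grid[0])
    scale = dst_cell / src_cell
    out_rows = max(1, round(rows / scale))
    out_cols = max(1, round(cols / scale))
    result: List[List[float]] = []
    for r in range(out_rows):
        src_r = min(rows - 1, int((r + 0.5) * scale))
        row: List[float] = []
        for c in range(out_cols):
            src_c = min(cols - 1, int((c + 0.5) * scale))
            row.append(grid[src_r][src_c])
        result.append(row)
    return result
```

Replicating values this way suits density-type layers. If a layer stores people per cell rather than density, the replicated counts would need to be divided by the number of sub-cells to conserve the city total; the text does not say how counts were handled, so that step is left out of the sketch.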

2.5.5. Combined impact of multi-scale greenspace coverage and population mappings on greenspace exposure and inequality assessment

To quantify the combined impacts of greenspace coverage and population maps on greenspace exposure and inequality assessment, we integrated three satellite-derived greenspace coverage maps (i.e., 10-m Sentinel-2, 30-m Landsat-8, and 500-m MODIS) and three population datasets (i.e., 30-m HRPD, 100-m WorldPop, and 1-km GPW), which results in nine population-greenspace groups. For each group, we calculated the mean and standard deviation metrics of greenspace exposure and Gini index across all sampled cities.
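The nine greenspace-population groups and their summary statistics can be enumerated as in the sketch below. The input layout (a dictionary keyed by dataset pair, holding one value per city) and the use of the sample standard deviation are assumptions; the text does not state which form of standard deviation was reported.

```python
from itertools import product
from statistics import mean, stdev
from typing import Dict, Sequence, Tuple

GREENSPACE_DATASETS = ("Sentinel-2", "Landsat-8", "MODIS")
POPULATION_DATASETS = ("HRPD", "WorldPop", "GPW")


def summarise_groups(
    city_values: Dict[Tuple[str, str], Sequence[float]]
) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Mean and sample standard deviation across cities for each of the nine groups.

    city_values maps (greenspace dataset, population dataset) to a list of
    per-city values, e.g. greenspace exposure or Gini index. Each list needs
    at least two cities for the sample standard deviation to be defined.
    """
    summary: Dict[Tuple[str, str], Tuple[float, float]] = {}
    for pair in product(GREENSPACE_DATASETS, POPULATION_DATASETS):
        values = city_values[pair]
        summary[pair] = (mean(values), stdev(values))
    return summary
```

Keeping the two dataset tuples separate makes it easy to read off the individual effects as well: fixing the population dataset and varying the greenspace dataset gives one row of the 3 × 3 comparison, and vice versa.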

3. Results

3.1. Comparison of greenspace coverage estimated from multi-resolution satellite datasets

With respect to the first research question, we examined the impact of multi-resolution satellite datasets on the quantification of greenspace coverage. We calculated the frequency distribution of greenspace coverage from Sentinel-2 and its difference from Landsat-8 and MODIS (Fig. 4). Results show that the greenspace coverage extracted from the baseline Sentinel-2 ranges from 0.1 to 0.9, with most values distributed in the range of 0.7–0.8 (Fig. 4a). Meanwhile, we observe a left-skewed distribution in the greenspace coverage difference of Landsat-8 over Sentinel-2, suggesting overestimates when using Landsat-8 imagery (Fig. 4b). MODIS yields even larger greenspace coverage than Landsat-8, so the difference of MODIS over Sentinel-2 in greenspace fraction is larger (Fig. 4c). This evidence indicates that a fine-grained, high-resolution data source (Sentinel-2 in our experiment) can detect more details of greenspace and thus enables more precise greenspace characterization, which is clearly shown by the results of two city examples (Fig. 5). The spatial distribution maps of city-level Sentinel-2-derived greenspace coverage and its differences from Landsat-8 and MODIS can be found in Fig. S1.

We further inspected the relationship between greenspace difference and the baseline greenspace (i.e., Sentinel-2 derived greenspace mappings) using a bin-based analysis approach. In particular, the baseline greenspace is divided into 10 equal (i.e., 10th percentile) intervals, and the median value of the fraction difference is calculated within each interval. As shown in Fig. 6, the median values are marked as red open circles. A strong positive relationship is observed between the baseline greenspace and the greenspace differences, in both absolute and relative magnitudes, across satellite sensors.

It is reasonable to assume that the fundamental difference between satellite sensors is that they capture different levels of landscape detail. Under this assumption, we revisited the first research question and explored the relationship between greenspace landscape configuration and greenspace coverage measured by satellite-based datasets with different spatial resolutions.

Fig. 7. Positive correlation between greenspace coverage difference and aggregation landscape metrics, including (a, e) aggregation index, (b, f) largest patch index, (c, g) mean perimeter-area ratio, and (d, h) patch cohesion index. Panels (a-d) are the results of greenspace coverage difference between Landsat-8 and Sentinel-2. Panels (e-h) are the results of greenspace coverage difference between MODIS and Sentinel-2. The red open circles are bin-averaged greenspace coverage differences across Sentinel-2 greenspace coverage bins with a 10th percentile interval. The dashed red line represents the linear regression of the binned average values, with regression statistics shown in red font. L8 and S2 denote Landsat-8 and Sentinel-2, respectively.

As shown in Fig. 7, greenspace coverage difference is positively correlated with the landscape metrics, including aggregation index, patch cohesion index, largest patch index, and mean perimeter-area ratio. These metrics are used to quantify landscape aggregation.

Hence, coarse-resolution satellites tend to overestimate greenspace in aggregated landscapes. Further analysis reveals negative associations between greenspace coverage difference and fragmentation metrics, including edge density, landscape shape index, number of patches, and patch density (Fig. 8).
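To make the landscape metrics concrete, the sketch below computes four of them from a binary greenspace raster using only NumPy. It is an illustration under stated assumptions, not the tool used in this study: patches are defined with 4-neighbour connectivity, edges on the raster boundary are ignored, and the function names and the 10-m cell size are hypothetical choices.

```python
import numpy as np
from collections import deque

def label_patches(green):
    """Label 4-connected greenspace patches with a breadth-first flood fill."""
    green = np.asarray(green, dtype=bool)
    rows, cols = green.shape
    labels = np.zeros(green.shape, dtype=int)
    n_patches = 0
    for r in range(rows):
        for c in range(cols):
            if green[r, c] and labels[r, c] == 0:
                n_patches += 1
                labels[r, c] = n_patches
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and green[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = n_patches
                            queue.append((ny, nx))
    return labels, n_patches

def landscape_metrics(green, cell_size=10.0):
    """Number of patches, patch density, edge density, and largest patch index."""
    green = np.asarray(green, dtype=bool)
    labels, n_patches = label_patches(green)
    landscape_area = green.size * cell_size ** 2  # square metres
    # Count cell faces separating green from non-green cells (interior faces only).
    faces = (np.sum(green[:, 1:] != green[:, :-1])
             + np.sum(green[1:, :] != green[:-1, :]))
    patch_sizes = np.bincount(labels.ravel())[1:]  # cells per patch
    largest = patch_sizes.max() if n_patches else 0
    return {
        "number_of_patches": n_patches,
        "patch_density": n_patches / landscape_area * 1e6,        # patches per km^2
        "edge_density": faces * cell_size / landscape_area * 1e4,  # metres per hectare
        "largest_patch_index": 100.0 * largest / green.size,       # percent of landscape
    }

# Two toy landscapes with the same greenspace coverage (16 of 64 cells).
aggregated = np.zeros((8, 8), dtype=bool)
aggregated[2:6, 2:6] = True
fragmented = np.zeros((8, 8), dtype=bool)
fragmented[::2, ::2] = True
m_agg = landscape_metrics(aggregated)
m_frag = landscape_metrics(fragmented)
```

The two toy landscapes share the same coverage but differ sharply in patch count, edge density, and largest patch index, which is the kind of contrast the aggregation and fragmentation metrics in Figs. 7 and 8 are designed to capture.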

3.2. Individual effects of greenspace and population mappings

3.2.1. Impact of greenspace coverage

To investigate the impact of multi-scale greenspace mappings on greenspace exposure and inequality assessment, we first calculated the greenspace exposure and greenspace exposure inequality using greenspace coverage maps derived from the three satellite datasets and the HRPD population dataset for two buffer sizes, 100 m and 500 m, according to Eqs. (2–3). As shown in Fig. 9 and Fig. S2, the Sentinel-2-derived greenspace exposure varies between 0.1 and 0.9, with most values ranging from 0.7 to 0.8. Both the Landsat-8- and MODIS-based estimates of greenspace exposure tend to be larger than the Sentinel-2-derived ones in most cities. The frequency distributions of greenspace exposure and the associated exposure differences using the 100-m buffer are similar to those using the 500-m buffer, suggesting that the findings are consistent across buffer sizes.
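As a rough illustration of the exposure calculation, the sketch below assumes that the population-weighted exposure is the population-weighted mean of buffer-averaged greenspace coverage, which is the common form of this framework. The square moving window standing in for the buffer, the edge clipping, and the function names are simplifying assumptions introduced here, not details confirmed by this section.

```python
import numpy as np

def buffer_mean(green, radius):
    """Mean greenspace within a square window of +/- radius cells, clipped at raster edges."""
    green = np.asarray(green, dtype=float)
    rows, cols = green.shape
    total = np.zeros_like(green)
    count = np.zeros_like(green)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if abs(dy) >= rows or abs(dx) >= cols:
                continue  # this shift falls entirely outside the raster
            # Destination cells (y, x) whose neighbour (y + dy, x + dx) lies inside the raster.
            dst = (slice(max(0, -dy), min(rows, rows - dy)),
                   slice(max(0, -dx), min(cols, cols - dx)))
            src = (slice(max(0, dy), min(rows, rows + dy)),
                   slice(max(0, dx), min(cols, cols + dx)))
            total[dst] += green[src]
            count[dst] += 1
    return total / count

def population_weighted_exposure(green, population, radius):
    """Population-weighted mean of buffer-averaged greenspace coverage."""
    nearby_green = buffer_mean(green, radius)
    population = np.asarray(population, dtype=float)
    return float((population * nearby_green).sum() / population.sum())

# Toy 2 x 2 city: all residents live in the single green cell.
green = np.array([[1.0, 0.0], [0.0, 0.0]])
population = np.array([[10.0, 0.0], [0.0, 0.0]])
exposure_no_buffer = population_weighted_exposure(green, population, radius=0)
exposure_buffer = population_weighted_exposure(green, population, radius=1)
```

In this toy case the exposure drops once the buffer includes the surrounding non-green cells, which shows how the buffer size enters the calculation; a circular buffer in map units would be a closer analogue of the 100-m and 500-m buffers.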

As before, we explored the relationship between greenspace exposure differences and the baseline Sentinel-2-derived greenspace coverage. As shown in Fig. 10, there exists a positive correlation between greenspace exposure difference and the Sentinel-2 greenspace coverage, and the patterns for different buffer sizes are similar (Fig. 10a-b and Fig. 10c-d).

For greenspace exposure inequality, we calculated the Gini index of greenspace exposure and plotted the spatial maps and histograms of the Gini index and Gini differences (Fig. 11 and Fig. S3). Greenspace exposure inequality shows a very different pattern from greenspace coverage or greenspace exposure. The frequency distribution of Sentinel-2-derived greenspace exposure inequality is right-skewed, with most values lower than 0.2, indicating that residents of most cities experience relatively equal greenspace exposure. With the larger buffer size, the distribution of Sentinel-2 greenspace inequality shifts clearly to the left. The greenspace inequality differences of Landsat-8 over Sentinel-2 (Fig. 11b and Fig. 11e) and MODIS over Sentinel-2 (Fig. 11c and Fig. 11f) can be up to 0.1. Landsat-8 and MODIS thus tend to underestimate greenspace exposure inequality.
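For readers unfamiliar with the Gini index, a minimal sketch of one common way to compute it is given below: the area between the Lorenz curve of exposure and the line of equality, with optional population weights. Whether the index in this study is weighted in exactly this way is an assumption, and the function name `gini` is illustrative.

```python
import numpy as np

def gini(values, weights=None):
    """Gini index from the Lorenz curve (trapezoidal rule); 0 means perfectly equal exposure."""
    v = np.asarray(values, dtype=float).ravel()
    w = np.ones_like(v) if weights is None else np.asarray(weights, dtype=float).ravel()
    keep = w > 0                      # cells without population carry no weight
    v, w = v[keep], w[keep]
    order = np.argsort(v)             # sort cells from least to most exposed
    v, w = v[order], w[order]
    # Cumulative shares of population and of exposure, both starting at the origin.
    # Note: undefined (division by zero) if total exposure is zero.
    cum_pop = np.concatenate(([0.0], np.cumsum(w) / w.sum()))
    cum_exp = np.concatenate(([0.0], np.cumsum(v * w) / (v * w).sum()))
    # Area under the Lorenz curve by the trapezoidal rule.
    area = np.sum((cum_pop[1:] - cum_pop[:-1]) * (cum_exp[1:] + cum_exp[:-1]) / 2.0)
    return float(1.0 - 2.0 * area)

equal_city = gini([0.5, 0.5, 0.5, 0.5])
unequal_city = gini([0.0, 0.0, 0.0, 1.0])
weighted_city = gini([0.0, 1.0], weights=[3.0, 1.0])
```

A coarse greenspace map that averages out local contrasts pulls cell-level exposure values toward each other, which lowers a Gini index computed this way.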

As shown in Fig. 12, the linear regression between the inequality difference of Landsat-8 over Sentinel-2 and the Sentinel-2-derived greenspace coverage is flat (Figs. 12a and 12c), which indicates that the inequality difference is relatively stable regardless of greenspace coverage. However, the MODIS-Sentinel-2 linear regression has a negative slope, suggesting that the inequality difference of MODIS over Sentinel-2 tends to decrease as greenspace coverage increases (Figs. 12b and 12d). Specifically, MODIS-derived estimates of greenspace exposure inequality are more seriously affected in cities with lower greenspace fractions. Similar findings are obtained with both the 100-m and 500-m buffer sizes.

Fig. 8. Negative correlation between greenspace coverage difference and fragmentation landscape metrics, including (a, e) edge density, (b, f) landscape shape index, (c, g) number of patches, and (d, h) patch density. Panels (a-d) are the results of greenspace coverage difference between Landsat-8 and Sentinel-2. Panels (e-h) are the results of greenspace coverage difference between MODIS and Sentinel-2. The red open circles are bin-averaged greenspace coverage differences across Sentinel-2 greenspace coverage bins with a 10th percentile interval. The dashed red line represents the linear regression of the binned average values, with regression statistics shown in red font. L8 and S2 denote Landsat-8 and Sentinel-2, respectively.

Fig. 9. Histograms of 100-m greenspace buffer size of (a) Sentinel-2 derived greenspace exposure, (b) greenspace exposure difference of Landsat-8 over Sentinel-2, (c) greenspace exposure difference of MODIS over Sentinel-2, and 500-m greenspace buffer size of (d) Sentinel-2 derived greenspace exposure, (e) greenspace exposure difference of Landsat-8 over Sentinel-2, (f) greenspace exposure difference of MODIS over Sentinel-2. L8 and S2 denote Landsat-8 and Sentinel-2, respectively.

3.2.2. Impact of population distribution

As shown in Fig. 13, the impact of multi-scale population mappings on greenspace exposure assessment is slight: the WorldPop- and GPW-derived greenspace exposure differs from the HRPD-derived one by at most 0.02. The experiments using 100-m and 500-m buffer sizes show the same pattern.

Given that population distribution datasets with varying spatial resolutions make only slight differences to the greenspace exposure measurement, we expect the impact of different gridded population datasets on exposure inequality to be limited as well. We therefore excluded the experiments on the impact of multi-scale population mappings on greenspace exposure inequality.

3.3. Combined effects of greenspace coverage and population distribution

To investigate the combined effects of multi-scale greenness and population mappings, we quantified the greenspace exposure and inequality by combining three greenspace datasets (i.e., Sentinel-2, Landsat-8, and MODIS) and three population datasets (i.e., HRPD, WorldPop, and GPW), using a 500-m buffer to define the nearby greenspace environment. Specifically, we calculated the mean and standard deviation across all selected United States cities.

Table 2 shows the statistical results of greenspace exposure using multi-scale greenspace and population mapping data. United States cities enjoy relatively high greenspace exposure, with an average value of 0.73. The mean and standard deviation of greenspace exposure increase as the spatial resolution of the satellite sensor coarsens, while the multi-scale population data has little impact on the greenspace exposure.

Table 3 summarizes the results of greenspace exposure inequality using multi-scale greenspace and population data. Population data of varying spatial resolutions make little difference to the greenspace exposure inequality, while multi-resolution satellite imagery strongly influences the measurement of greenspace inequality.

Fig. 10. Association between greenspace exposure difference and the baseline greenspace coverage, including correlation between Sentinel-2 greenspace coverage and greenspace exposure difference of (a) Landsat-8 over Sentinel-2 across 100-m buffer size, (b) MODIS over Sentinel-2 across 100-m buffer size, (c) Landsat-8 over Sentinel-2 across 500-m buffer size, and (d) MODIS over Sentinel-2 across 500-m buffer size. The red open circles are bin-averaged greenspace exposure differences across Sentinel-2 greenspace coverage bins with a 10th percentile interval. The dashed red line represents the linear regression of the binned average values, with regression statistics shown in red font. L8 and S2 denote Landsat-8 and Sentinel-2, respectively.

4. Discussion

Previous research found that the spatial resolution of satellite data significantly impacts urban greenspace mapping and human exposure quantification in a single city or a few cities (Browning and Locke, 2020; Jimenez et al., 2022; Labib et al., 2020; Reid et al., 2018; Zhou et al., 2018), but the extent to which such effects change across cities and the underlying mechanisms remain unclear. This study leverages the GEE cloud-computing platform (Gorelick et al., 2017) and the population-weighted environment exposure framework (Chen et al., 2018; Song et al., 2018) to investigate the scale effects of greenspace and population mappings on greenspace exposure and inequality assessments over 679 major United States cities.

4.1. Landscape configuration regulations on cross-scale satellite greenspace

Compared with Sentinel-2, Landsat-8 and MODIS underestimate greenspace coverage in low green-cover cities and overestimate it in high green-cover cities. Over sparse and fragmented greenspace cover, coarse-resolution satellites cannot resolve tiny patches of greenspace and thus underestimate greenspace coverage (Zhou et al., 2018). By contrast, over dense and aggregated greenspace cover, coarse-resolution satellites smooth out the negative impacts of complex 3-D vegetation structure (e.g., shadows cast by complex forest structures) and overestimate greenspace coverage (Zeng et al., 2023). The net impact of satellite spatial resolution on greenspace estimation is controlled by these two contrasting processes, which are closely related to the greenspace landscape of the city (Labib et al., 2020). In aggregated landscapes, the greenspace coverage difference between coarse- and high-resolution satellites increases as satellite resolution coarsens (Fig. 7), suggesting that the unresolved proportion of the non-greenspace component exceeds that of the greenspace component at coarser resolution. By contrast, the greenspace coverage difference decreases as satellite resolution coarsens in fragmented landscapes because coarse-resolution satellites hide small patches of urban greenspace. This mechanism explains previous contradictory findings about the impacts of spatial resolution on satellite-derived greenspace coverage (Jimenez et al., 2022; Zhou et al., 2018).
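The two contrasting processes can be reproduced with a toy experiment. The sketch below mimics a coarse sensor by averaging a fine binary greenspace map over square blocks and labelling a coarse pixel as green when at least half of it is green. This thresholded block mean is a deliberately crude stand-in introduced here for illustration; it is not how Landsat-8 or MODIS greenspace was derived in this study, and the function name and threshold are hypothetical.

```python
import numpy as np

def coarsened_coverage(fine_green, factor, threshold=0.5):
    """Greenspace coverage after block-averaging a fine binary map and re-thresholding."""
    fine = np.asarray(fine_green, dtype=float)
    rows, cols = fine.shape
    if rows % factor or cols % factor:
        raise ValueError("raster dimensions must be multiples of the aggregation factor")
    # Mean green fraction inside each factor x factor block.
    block_fraction = fine.reshape(rows // factor, factor,
                                  cols // factor, factor).mean(axis=(1, 3))
    # A coarse pixel counts as green when its green fraction reaches the threshold.
    return float((block_fraction >= threshold).mean())

# Aggregated landscape: continuous greenspace with small non-green gaps.
aggregated = np.ones((12, 12))
aggregated[1::3, 1::3] = 0.0
# Fragmented landscape: small isolated green patches in a non-green matrix.
fragmented = np.zeros((12, 12))
fragmented[1::3, 1::3] = 1.0

fine_agg, coarse_agg = float(aggregated.mean()), coarsened_coverage(aggregated, factor=3)
fine_frag, coarse_frag = float(fragmented.mean()), coarsened_coverage(fragmented, factor=3)
```

In the aggregated case the small gaps disappear at the coarse scale and coverage is overestimated, while in the fragmented case the small patches disappear and coverage is underestimated, mirroring the signs of the relationships in Figs. 7 and 8.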

Broadly, the regulation of cross-scale satellite greenspace by landscape configuration has two implications. First, greenspace landscape is a key determinant of satellite-observed greenspace. When interpreting greenspace differences across satellites, greenspace landscape might provide detailed cues regarding the spatial distribution of greenspace within a city. Second, because spatial resolution affects greenspace coverage extraction differently over aggregated and fragmented landscapes, the greenspace-related heat mitigation capability and health benefits derived from cross-scale satellites should be carefully interpreted (Li et al., 2013). Nevertheless, further investigation of the associations between cross-scale greenspace and environmental or health metrics is required.

4.2. Dominant impact of greenspace mapping on greenspace exposure and inequality assessment

The population-weighted exposure model yields a histogram of greenspace exposure very similar to that of greenspace coverage (Fig. 4 and Fig. 9), suggesting that greenspace distribution is the major control when considering interactions between greenspace and humans. Similar to greenspace coverage, greenspace exposure shows comparable patterns across different levels of satellite spatial resolution (Fig. 9) and in the associations between greenspace exposure difference and baseline greenspace coverage (Fig. 10). It can thus be further inferred that greenspace landscape configuration regulates cross-scale satellite-based greenspace exposure assessment. Additionally, the control of greenspace landscape on greenspace exposure is stronger than that on greenspace coverage (Fig. 6 and Fig. 10). The buffer size used to define the nearby green environment has only a slight impact on the greenspace exposure assessment across different satellites, consistent with a recent study showing that spatiotemporal patterns of greenspace exposure are similar for different buffer sizes (Chen et al., 2022a). Despite the varying overestimation and underestimation in individual cities, coarse-resolution satellites overestimate greenspace exposure at the national scale (Table 2). Compared with MODIS, Landsat-8 yields a greenspace exposure level closer to that of Sentinel-2. The relatively larger standard deviation of city-level greenspace exposure from MODIS suggests different magnitudes and directions of spatial resolution impacts across cities.

Fig. 11. Histograms of 100-m greenspace buffer size of (a) Sentinel-2 derived greenspace exposure inequality, (b) greenspace exposure inequality difference of Landsat-8 over Sentinel-2, (c) greenspace exposure inequality difference of MODIS over Sentinel-2, and 500-m greenspace buffer size of (d) Sentinel-2 derived greenspace exposure inequality, (e) greenspace exposure inequality difference of Landsat-8 over Sentinel-2, (f) greenspace exposure inequality difference of MODIS over Sentinel-2. L8 and S2 denote Landsat-8 and Sentinel-2, respectively.

With the widely used Gini index, results show that the inequality assessment of human exposure to greenspace can be affected by satellite resolution (Fig. 11), as greenspace provision and spatial configuration vary across satellites. The negative associations between Gini difference and the baseline greenspace coverage (Fig. 12) imply that improving greenspace provision effectively reduces the disparity in human greenspace exposure. As satellite resolution coarsens, the estimated Gini index decreases at the national scale because of the overestimated greenspace coverage or exposure in coarser-resolution satellites (Table 3). This finding suggests that using coarse-resolution satellite greenspace maps might artificially smooth the disparity in greenspace exposure.

4.3. Limited impact of population data on greenspace exposure and inequality assessment

Fig. 12. Association between greenspace exposure inequality difference and the baseline greenspace coverage, including correlation between Sentinel-2 greenspace coverage and greenspace exposure inequality difference of (a) Landsat-8 over Sentinel-2 across 100-m buffer size, (b) MODIS over Sentinel-2 across 100-m buffer size, (c) Landsat-8 over Sentinel-2 across 500-m buffer size, and (d) MODIS over Sentinel-2 across 500-m buffer size. The red open circles are bin-averaged greenspace exposure inequality differences across Sentinel-2 greenspace coverage bins with a 10th percentile interval. The dashed red line represents the linear regression of the binned average values, with regression statistics shown in red font. L8 and S2 denote Landsat-8 and Sentinel-2, respectively.

Contrary to expectations, greenspace exposure and the associated inequality are insensitive to the spatial resolution of the population data (Table 2 and Table 3). The population-weighted model framework and the similar spatial distributions across different population datasets can explain this unexpected result. Population data is used as weights in the population-weighted exposure framework to model the interactions between population and greenspace. The spatial distribution of population density is thus more important than its absolute magnitude in quantifying greenspace exposure. More importantly, the majority of gridded population datasets are estimated from the same source census or population counts using different top-down disaggregation strategies and physical constraints (Leyk et al., 2019; Thomson et al., 2022). Thus, different gridded population datasets have similar spatial patterns (Fig. 14). This explains why population data has a larger influence on area-based environmental exposure, such as flooding exposure (Smith et al., 2019), than on population-weighted greenspace exposure.
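The role of population as a weight can be made explicit with a short derivation. Assuming the population-weighted exposure takes the usual form of a population-weighted mean (the symbols $P_i$, $G_i^{d}$, $N$, and $c$ are notation introduced here for illustration and may differ from Eq. (2)), rescaling all population counts by a constant leaves the exposure unchanged:

```latex
GE^{d} = \frac{\sum_{i=1}^{N} P_i \, G_i^{d}}{\sum_{i=1}^{N} P_i},
\qquad
P_i' = c \, P_i \;\; (c > 0)
\;\;\Rightarrow\;\;
GE'^{\,d} = \frac{\sum_{i=1}^{N} c \, P_i \, G_i^{d}}{\sum_{i=1}^{N} c \, P_i} = GE^{d}.
```

Here $P_i$ is the population of grid cell $i$, $G_i^{d}$ is the greenspace coverage within buffer distance $d$ of that cell, and $N$ is the number of cells in the city. Under this form, two gridded population products that differ in total counts but share the same relative spatial pattern give identical exposure, which is in line with the small differences between population datasets in Tables 2 and 3.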

4.4. Optimal satellite resolution for greenspace study

The optimal satellite spatial resolution for a greenspace-related environmental exposure study depends on the specific research purpose. For instance, if the study aims to explore fine-scale greenspace benefits, such as neighborhood-level perceived and mental health, which are highly associated with the greenery of window views or small greenspace patches, high-resolution satellite data is preferable (Li and Sullivan, 2016). If the purpose is regional or global greenspace mapping, its environmental benefits, or long-term spatiotemporal dynamics (Xu et al., 2022), medium- or coarse-resolution satellite data remains a good choice.

Data availability is another factor to consider. Most high-resolution satellite data (e.g., 0.5-m WorldView-2 and 3-m PlanetScope imagery) is commercial or covers only limited spatial regions (e.g., sampled cities) over a short time period. Medium- or coarse-resolution satellite data is usually publicly accessible and has global coverage over relatively long time spans. Therefore, the selection of an appropriate satellite dataset for greenspace-related studies should fully account for trade-offs between specific research benefits and costs, including data price, data transfer tools, storage requirements, and usage platform.

4.5. Limitations and future research

This study has several limitations that should be addressed in future work. First, we assumed that the data uncertainty of the different satellite products is the same. Namely, the accuracy and uncertainty of cloud detection, atmospheric correction, and radiometric calibration among Sentinel-2, Landsat-8, and MODIS are assumed to be comparable (Su et al., 2019), so that the discrepancy in greenspace mapping across satellites can be interpreted from the perspective of scale effects. Second, vegetation seasonal phenology significantly impacts greenspace exposure and inequality assessment, especially for cities covered with deciduous vegetation (Chen et al., 2022a). The maximum-composited greenspace coverage map cannot account for this seasonal impact. Moreover, the 10-m resolution Sentinel-2 satellite might not capture small greenspace patches, which needs further investigation using higher spatiotemporal resolution satellite data (e.g., daily 3-m resolution PlanetScope) to map greenspace coverage. Third, people move around in their daily lives, and thus population distribution is dynamic. The gridded population density data used in this study is static and cannot measure the spatiotemporal variability of population. Population mobility data (e.g., Tencent location-based services data) could be used to assess dynamic human exposure to green environments (Song et al., 2021). Finally, the associations between greenspace exposure and health outcomes are reported to change with spatial scale (Browning and Locke, 2020), but have not been analyzed here. Besides the contrasts in green natural environments across geographic contexts, the complex structures in population distribution, involving individual mobility patterns and sociodemographic disparities, could affect the cross-scale interactions between greenspace and population (Bell et al., 2015; Liu et al., 2023). This critical issue needs further exploration by integrating multi-source remote sensing, geospatial big data, and greenspace-population interaction models.

Fig. 13. Histograms of 100-m greenspace buffer size of (a) HRPD derived greenspace exposure, (b) greenspace exposure difference of WorldPop over HRPD, (c) greenspace exposure difference of GPW over HRPD, and 500-m greenspace buffer size of (d) HRPD derived greenspace exposure, (e) greenspace exposure difference of WorldPop over HRPD, and (f) greenspace exposure difference of GPW over HRPD.

Table 2

Statistical summary of greenspace exposure (mean ± standard deviation across cities) using multi-scale greenspace and population mappings.

| Population dataset | Sentinel-2 | Landsat-8 | MODIS |
| --- | --- | --- | --- |
| HRPD | 0.692±0.112 | 0.721±0.129 | 0.766±0.159 |
| WorldPop | 0.691±0.111 | 0.722±0.127 | 0.764±0.158 |
| GPW | 0.693±0.112 | 0.729±0.123 | 0.762±0.159 |

Table 3

Statistical summary of greenspace exposure inequality measured by the Gini index (mean ± standard deviation across cities) using multi-scale greenspace and population mappings.

| Population dataset | Sentinel-2 | Landsat-8 | MODIS |
| --- | --- | --- | --- |
| HRPD | 0.076±0.029 | 0.076±0.058 | 0.065±0.038 |
| WorldPop | 0.073±0.026 | 0.072±0.052 | 0.064±0.037 |
| GPW | 0.073±0.021 | 0.065±0.024 | 0.063±0.038 |

5. Conclusions

Satellite remote sensing has become the most feasible approach for regional and large-scale greenspace mapping but suffers from the scale effects of greenness and population mappings. To quantify the differences in mapping multi-scale greenspace coverage, exposure, and inequality, we first compared the greenspace coverage derived from satellite imagery at three scales (i.e., 10-m Sentinel-2, 30-m Landsat-8, and 500-m MODIS), and then calculated greenspace exposure and inequality by coupling three population datasets of varying spatial resolutions (i.e., 30-m HRPD, 100-m WorldPop, and 1-km GPW). Results show that the greenspace differences across satellite sensors are closely correlated with greenspace magnitude, while the spatial resolution of population data has little impact on the estimation of greenspace exposure and inequality. We further explored the relationship between greenspace difference and landscape metrics to decipher the impact of landscape configuration on the difference. Coarse-resolution satellites tend to overestimate greenspace coverage in aggregated greenspace and underestimate it in patchy greenspace. Our study reveals how different-scale datasets result in discrepancies in the measurement of greenspace coverage, exposure, and inequality, and provides comprehensive guidance for selecting appropriate datasets to quantify greenspace-related issues.

CRediT authorship contribution statement

Shengbiao Wu: Data curation, Methodology, Software, Formal analysis, Writing – original draft, Writing – review & editing. Wenbo Yu: Writing – original draft, Writing – review & editing. Jiafu An: Writing – review & editing. Chen Lin: Supervision, Writing – review & editing. Bin Chen: Conceptualization, Supervision, Formal analysis, Writing – original draft, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

Fig. 14. Two city examples of multi-scale population density datasets, including (a, d) 30-m HRPD, (b, e) 100-m WorldPop, and (c, f) 100-m GPW. (a-c) are the results of Conway, Arkansas. (d-f) are the results of Victorville, California. Urban boundary is extracted from the Global Urban Boundaries (GUB) datasets in 2018.

This study is supported by the National Key Research and Development Program of China (2022YFB3903703), the Research Grants Council of Hong Kong Early Career Scheme (HKU27600222) and General Research Fund (HKU17601423), the National Natural Science Foundation of China Young Scientists Fund (42201373), NSFC/RGC Joint Research Scheme