Process flow for improving SMAP sea surface salinity (SSS) based on machine learning approaches
Figure 2.5. (a)-(i) Model-by-model results for calibration and validation, and (j) comparison between in situ measurements and the SMAP SSS product
Figure 2.7. Scatter plots and spatial distribution of in-situ data versus (a,c,e) SMAP SSS product and (b,d,f) HYCOM SSS product
Figure 3.4. (a) Spatial distribution of GBRT-based SSS.
Introduction
Importance of the Sea Surface Salinity
Remote Sensing of SSS
- L-band microwave sensor-derived SSS
- Limitations of L-band microwave sensor-derived SSS
Regarding other wind speed issues, higher wind speed roughens the ocean surface, which reduces the accuracy of the correction (Park et al., 2015). The bias and errors of the SSS products differ considerably from ocean to ocean (Qin et al., 2020).
Goals of this dissertation
Existing global L-band microwave sensor-derived SSS correction studies either used simple empirical equations (i.e., linear regression) (Kolodziejczyk et al., 2016; Qin et al., 2020) or replaced the input data used in the salinity retrieval algorithm with other satellite or model data (Dinnat et al., 2019; González-Gambau et al., 2017; Sharma et al., 2019). Although SMOS (available since 2010) is suitable for long-term analysis based on a large amount of data, SMAP provides more reliable data with respect to LSC error than SMOS, especially along coastal regions (Le Vine et al., 2019).
Improvement of SMAP Sea Surface Salinity in river-dominated oceans using machine
Introduction
Data
- Study area
- In-situ data
- SMAP satellite and ancillary data
- HYCOM SSS
It affects tropical cyclone intensity and biological productivity in the BoB region (Akhil et al., 2020). A detailed description of the CAP/JPL SSS retrieval algorithm is given in Fore et al.
Methodology
- Improvement of SMAP SSS
- Machine learning approaches
Second, it shows whether the contribution of each variable is positive or negative (Mangalathu et al., 2020). Second, SVR is a widely used machine learning technique to estimate environmental factors in the field of remote sensing (Mountrakis et al., 2011).
Results and Discussions
- Accuracy of SMAP SSS in the study area
- Improvement of SMAP SSS
- Variable importance for improvement of SMAP SSS
- Spatial and temporal distribution of SSS
- Novelty and limitation
Low-salinity water does not appear to have been learned well, owing to the relatively small proportion (3%) of low-salinity samples (≤ 30 psu) in the whole dataset. (a)-(i) The modeling results per model for calibration and validation, and (j) the comparison between in situ measurements and the SMAP SSS product. The accuracy increase was particularly significant in the EA region, where the accuracy of SMAP SSS was low in seawater with high salinity (Figure 2.7(a)-(d)). In the EA region, there were no available in-situ data of low salinity (i.e., <30 psu) consistent with satellite data (i.e., SMAP SSS).
For the GoM region, outliers that are overestimated or underestimated in the SMAP SSS product have been corrected (Figure 2.7(i), (k) and (l)). SWH and wind speed may have played a role in the SSS correction model in this study as major factors of SMAP SSS product uncertainty. Thus, the interaction between SMAP SSS and HYCOM SSS in the model was quite dependent on the target SSS values.
Low-salinity seawater is always present in the northeastern BoB due to the discharge of the Ganges and Irrawaddy rivers (Figure 2.12(b)). Note that seawater with low salinity was observed in the East China Sea in the summer of 2018. The SSS improved by the proposed models was more accurate than the existing SMAP SSS product (and also than the HYCOM SSS product), especially in the EA region (Figures 2.6, 2.7 and 2.8).
To further improve SMAP SSS in low-salinity seawater, global low-salinity seawater samples due to melting sea ice should be included in the training of machine learning models.
Conclusions
Introduction
Data
- In-situ data
- SMAP satellite and ancillary data
Spatial distribution of in-situ SSS at each 25 km grid cell from April 2015 to December 2020. This is based on the CAP retrieval algorithm developed with Aquarius as a target and now extended to SMAP (Fore et al., 2016). A detailed description of the CAP/JPL algorithm for SMAP is given in Fore et al.
L2B data, newly released as version 5 on December 11, 2020, is distributed as a 25 km slice with global coverage of approximately three days. A description of the auxiliary data used in the SMAP SSS retrieval algorithm and HYCOM SSS can be found in Chapters 2.2.3 and 2.2.4. The NOAA WaveWatch Ⅲ SWH used in the previous study (Jang et al., 2021) was not used in this study.
We used the Global Precipitation Measurement (GPM) mission version 6 Level 3 Integrated Multi-satellitE Retrievals for GPM (IMERG) Late daily product (GPM_3IMERGDL) at 10 km × 10 km spatial resolution and daily temporal resolution as the precipitation data. We used the land-sea mask provided by the Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) version 2 at 25 km × 25 km spatial resolution.
Methodology
- Correction of SMAP SSS
- Machine learning approaches
The KNN algorithm has also performed well in classification and regression problems in remote sensing (Meng et al., 2007; Alimjan et al., 2018). The main idea of KNN is to measure similarity by the distance between a new sample and the stored training samples in the feature space (Zhang et al., 2017). Each decision tree is constructed from multiple nodes that repeatedly divide the input data to predict the target value (Jang et al., 2017).
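The distance-based idea behind KNN can be illustrated with a minimal regression sketch. It is not the configuration used in this study: the function name `knn_predict`, the Euclidean distance, the unweighted average over neighbours, and the toy values are all assumptions made for illustration.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict a target value as the mean of the k nearest training targets.

    train_X: list of feature tuples, train_y: list of target values,
    query: feature tuple with the same length as each training tuple.
    Distance is Euclidean in the (unscaled) feature space, an assumption.
    """
    if len(train_X) != len(train_y):
        raise ValueError("train_X and train_y must have the same length")
    if k < 1 or k > len(train_X):
        raise ValueError("k must be between 1 and the number of training samples")
    # Pair every training target with its distance to the query point.
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    nearest = by_distance[:k]
    # Unweighted average of the k closest targets.
    return sum(y for _, y in nearest) / k


# Hypothetical one-feature example (values are not from the study).
features = [(0.0,), (1.0,), (2.0,), (10.0,)]
targets = [30.0, 32.0, 34.0, 20.0]
estimate = knn_predict(features, targets, (1.1,), k=3)
```

In practice the input features would usually be scaled first, since an unscaled distance is dominated by whichever variable has the largest numeric range.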
RF constructs numerous independent trees using two sources of randomness: a random subset of training samples for each tree and a random subset of input parameters at each split node (Altman and Krzywinski, 2017; He et al., 2021). Boosting constructs trees sequentially, giving more weight to the samples where the error is large, unlike bagging, where each tree is built independently (Park et al., 2021). Gradient boosting applies gradient descent to the loss function: each update changes the model in the direction of the negative gradient of the loss in order to minimize the cost function (Callens et al., 2020; He et al., 2021).
Each subsequent tree is trained by fitting it to the residual errors of the previous model, following the gradient of the loss (Chen et al., 2021). LGBT uses a leaf-wise growth strategy that keeps splitting the leaf with the largest loss, instead of the level-wise strategy used in GBRT and XGBoost (Su et al., 2021).
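The residual-fitting step described above can be sketched with one-split trees (stumps) on a single feature. This is a toy illustration under stated assumptions, not the GBRT implementation used in this study: squared-error loss, one input variable, and the helper names `fit_stump`, `fit_gbrt` and `gbrt_predict` are all hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, left_mean, right_mean)
    if best is None:
        # All inputs identical: no split possible, predict the mean residual.
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def stump_predict(stump, x):
    thr, left_mean, right_mean = stump
    return left_mean if x <= thr else right_mean


def fit_gbrt(xs, ys, n_trees=100, learning_rate=0.1):
    """Fit stumps sequentially, each one to the current residuals."""
    base = sum(ys) / len(ys)  # initial prediction: mean of the targets
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # For squared-error loss the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [
            p + learning_rate * stump_predict(stump, x)
            for p, x in zip(preds, xs)
        ]
    return base, learning_rate, stumps


def gbrt_predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


# Hypothetical data: a step from 30 to 35 (values are not from the study).
xs = [1, 2, 3, 4, 5, 6]
ys = [30.0, 30.0, 30.0, 35.0, 35.0, 35.0]
model = fit_gbrt(xs, ys)
```

Because each tree depends on the predictions of all earlier trees, the loop cannot be parallelized across trees the way bagging in RF can.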
Comparison of the SMAP SSS with in-situ data
The four tree-based machine learning models were implemented in Python with the "RandomForestRegressor", "GradientBoostingRegressor". On the other hand, the accuracy of SMAP SSS increases in equatorial regions (Tropical Ocean; TRO, Eastern Equatorial Pacific; EEP, NIO, and ARP) with relatively high SST. Since the accuracy of SMAP and HYCOM SSS varies depending on the environment, SMAP SSS can be improved more accurately by combining the advantages of the two products based on empirical learning using machine learning approaches.
Scatterplots and spatial distribution of in-situ data versus (a,c,e) SMAP SSS product and (b,d,f) HYCOM SSS product. Root mean square error (RMSE) and mean bias (MB) of SMAP and HYCOM SSS versus in-situ data in each ocean.
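The two accuracy metrics named above can be written out as a short sketch. The sign convention for MB (product minus in-situ, so a positive value means the product is saltier than the measurement) is an assumption, as are the function names and the example values.

```python
import math


def rmse(product, in_situ):
    """Root mean square error between a product and in-situ values."""
    if len(product) != len(in_situ) or not product:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(product, in_situ)]
    return math.sqrt(sum(squared) / len(squared))


def mean_bias(product, in_situ):
    """Mean of (product - in situ); positive means the product is too high."""
    if len(product) != len(in_situ) or not product:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p - o for p, o in zip(product, in_situ)) / len(product)


# Hypothetical matchups in psu (values are not from the study).
satellite = [35.0, 34.0]
measured = [34.0, 34.0]
example_rmse = rmse(satellite, measured)
example_mb = mean_bias(satellite, measured)
```

Reporting both is useful because a product can have a near-zero MB while its RMSE is large, when positive and negative errors cancel.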
Results and Discussions
- Correction of SMAP SSS
- Variable importance for correction of SMAP SSS
- Novelty and limitation
The GBRT-based SSS performed better than the SMAP SSS in all regions, but was slightly less accurate than the HYCOM SSS in the NOP and ARO regions. The low SMAP SSS value appears to have yielded a lower value for the GBRT-based SSS than the in-situ and HYCOM SSS data. HYCOM SSS, which is more accurate for the high-salinity samples that make up 98% of the total data, contributed more to the model than SMAP SSS.
Parameters related to SMAP (i.e., SMAP SSS and Tbs for V and H polarization) were considered strongly negative. SMAP SSS has different relationships when SST is low (polar regions) and high, while HYCOM SSS shows no significant relationship with SST. This means that the SMAP SSS correction model can be determined by other input parameters such as HYCOM SSS and OISST.
Using SMAP SSS and HYCOM SSS together, the GBRT-based model outperforms the two SSS products. We have identified cases where SMAP SSS and HYCOM SSS uncertainty was significant (Table 3.2) and how input parameters were applied to the model through feature importance (Figure 3.6) and SHAP values (Figure 3.7).
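How a SHAP-style attribution yields a signed contribution per input can be sketched with exact Shapley values on a toy model. This is only an illustration: the study's SHAP values were computed for the trained model, whereas the linear model, the single-baseline treatment of absent features, and the name `shapley_values` are assumptions, and the input values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi += weight * (value(members | {i}) - value(members))
        phis.append(phi)
    return phis


def toy_model(z):
    # Hypothetical linear stand-in for a trained correction model.
    return 0.3 * z[0] + 0.7 * z[1] - 0.1 * z[2]


sample = [30.0, 35.0, 10.0]    # hypothetical input values
reference = [34.0, 34.0, 5.0]  # hypothetical baseline (e.g. feature means)
contributions = shapley_values(toy_model, sample, reference)
```

The sign of each entry indicates whether that input pushed the prediction above or below the baseline prediction, and the entries sum to the difference between the two predictions.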
Conclusions
Although it is challenging to generalize relationships between input parameters that vary from region to region, the accuracy of GBRT-based SSS was higher than that of SMAP in all regions (also better than the HYCOM SSS; Figures 3.4, 3.5 and Table 3.4). However, in this study, in-situ data were not available for the polar regions (above 60°N and below 60°S) or for the Yellow Sea and the Okhotsk Sea, which have severe RFI and LSC (Figure 3.1), so the model was not trained for these regions. The model could be improved further by adding in-situ data.
Especially in the polar regions, where Tb contamination can occur due to sea ice (Fournier et al., 2019), sea ice data could additionally be used as an input. HYCOM SSS gives significantly low values in the Arctic region above 80°N, and improved HYCOM SSS is needed for a more accurate correction model.
Spatial and temporal interpolation of Sea Surface Salinity using deep learning
- Introduction
- Data
- In-situ data
- GBRT-based SSS
- Methodology
- Results and Discussion
- Interpolation of SSS
- Spatial distribution of the interpolated SSS
- Novelty and limitation
- Conclusions and Future work
The accuracy of the final interpolated SSS increased slightly over the primary interpolated SSS when the model was re-applied. Since in-situ measurements of the polar region above 60°N were not used in the correction model (Figure 3.1), the accuracy of the polar region may be low. When the model was applied to the primary interpolated SSS again, the MB and RMSE values of the final interpolated SSS increased.
However, the second application of the FCN model smoothed the SSS distribution in the final interpolated SSS. Daily maps of the along-track GBRT-based SSS, the SMAP 8-day SSS, and primary and final interpolated SSS in this study for July 14, 2018. Daily maps of the along-track GBRT-based SSS, the SMAP 8-day SSS, and primary and final interpolated SSS in this study for September 24, 2018 in the Amazon River plume region.
Daily maps of the along-track GBRT-based SSS, the SMAP 8-day SSS and primary and final interpolated SSS in this study for October 14, 2015 in the Bay of Bengal. An FCN-based interpolation model was proposed using GBRT-based SSS of the previous seven days.
Overall conclusions and Future research
Estimation of sea surface salinity in the northern Gulf of Mexico from satellite measurements of sea color. Comparison of spaceborne measurements of sea surface salinity and colored detrital matter in the Amazon plume. Evaluation and comparison of SMOS, Aquarius and SMAP sea surface salinity products in the Arctic Ocean.
Hourly estimation of sea surface salinity in the East China Sea using geostationary measurements of ocean color imagery. Assimilation of SMOS sea surface salinity in a regional ocean model for the South China Sea. Improvement of SMOS sea surface salinity in the western Mediterranean Sea using multivariate and multifractal analysis.
Estimation of sea surface salinity by remote sensing based on GOCI measurements in the southern Yellow Sea. Retrieval of remotely sensed sea surface salinity using MODIS data in the Bohai Sea of China.