A Cloud Data Filling Approach Inspired By Recommender Systems For Satellite-Based Coastal Land Use Classification

This article is based on the research paper 'A recommender system-inspired cloud data filling scheme for satellite-based coastal land use classification'. All credit for this research goes to the researchers 👏👏👏

Satellite-based sensing technologies, which have been under development for decades, now offer the potential for continuous, wide-ranging observation of the earth’s surface. The Landsat missions are an outstanding example: they have captured imagery of worldwide land conditions and dynamics every 16 days since July 1972 (the revisit time was 18 days before Landsat 4 launched in 1982). However, the acquired data has limited usefulness, particularly along the coast, where cloud cover is heavy. Cloud cover limits the spatial-temporal availability of earth observation, creates data gaps and noise, and makes earth surface analysis more difficult.

Existing cloud data filling strategies, which estimate what the clouds hide, still fall short in practice. First, machine learning-based algorithms carry a high computational load, since intensive training can consume substantial resources. Second, fast techniques such as matrix decomposition algorithms frequently have low accuracy, which limits their use on complex landscapes. Third, most previous research has concentrated on open ocean applications and continuous numerical variables, with little experience of cloud cover in coastal areas, where land and ocean meet to form complex landscapes described by categorical data such as landscape type.

The current study found that a novel application of a recommender system algorithm could fill the cloud-induced gaps for coastal areas. Recommender systems and the algorithms that power them have fueled the growth of online platforms like Alibaba, Flipkart, Netflix, and others.

Cloud-filling algorithms used in remote sensing typically work with continuous data from the open ocean, such as water temperature, color, and algae content, to infer what is concealed. However, “errors are magnified at the shore due to higher cloud cover, vegetation, and other variables,” according to a researcher, who added that recommender systems “might do a better job in this objective.”

To test this hypothesis, the research team constructed a cloud-filling model based on the work of Simon Funk, a software developer known for his entry in Netflix’s recommendation algorithm competition. The Funk-SVD method arranges viewers’ ratings in a matrix in which most entries are missing, because most people rate only a few titles. It then uses the known entries to predict the missing ones, that is, how a viewer would rate something they have not reviewed.
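The matrix idea described above can be sketched in a few lines. This is a minimal illustration of Funk-style matrix factorization trained by stochastic gradient descent on the known entries only, not the researchers’ code; the toy ratings, the function name `funk_svd`, and the hyperparameters (`rank`, `lr`, `reg`, `epochs`) are all assumptions chosen for the example.

```python
import numpy as np

def funk_svd(R, observed, rank=2, lr=0.02, reg=0.02, epochs=300, seed=0):
    """Learn factors P, Q so that P @ Q.T approximates R on observed entries."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((R.shape[0], rank))  # one row per user
    Q = 0.1 * rng.standard_normal((R.shape[1], rank))  # one row per item
    rows, cols = np.nonzero(observed)
    for _ in range(epochs):
        # Visit the known ratings in a fresh random order each epoch.
        for k in rng.permutation(len(rows)):
            i, j = rows[k], cols[k]
            err = R[i, j] - P[i] @ Q[j]
            p_old = P[i].copy()
            # Gradient step with L2 regularization on both factors.
            P[i] += lr * (err * Q[j] - reg * P[i])
            Q[j] += lr * (err * p_old - reg * Q[j])
    return P, Q

# Hypothetical 4 users x 5 titles rating table (values between 1 and 5).
u = np.array([1.0, 2.0, 1.5, 2.5])
v = np.array([1.0, 2.0, 1.5, 2.0, 1.0])
R = np.outer(u, v)

# Pretend four ratings were never submitted.
observed = np.ones(R.shape, dtype=bool)
observed[0, 1] = observed[1, 3] = observed[2, 0] = observed[3, 4] = False

P, Q = funk_svd(R, observed)
pred = P @ Q.T                      # dense prediction for every user/title pair
rmse_obs = np.sqrt(np.mean((pred[observed] - R[observed]) ** 2))
hidden_pred = pred[~observed]       # predictions for the unrated entries
print("RMSE on known ratings:", rmse_obs)
```

The key design point is that the loss only ever touches observed entries, so missing values never need a placeholder during training.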

Cloud filling works similarly: each map coordinate is represented by a pixel in an image, which can be either water or land, and cloud-covered pixels are treated as unrecorded data. Drawing on the other available observations, the researchers’ Funk-SVD adaptation makes educated guesses about what lies behind the clouds.
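To make the analogy concrete, here is a small sketch of how a stack of scenes might be turned into a matrix with missing entries. The encoding (water = 0, land = 1, cloud = NaN), the tiny 2x3 scenes, and the helper `to_labels` are assumptions for illustration; the per-pixel mean used to fill the gaps is only a stand-in for the factorization step, not the method from the paper.

```python
import numpy as np

WATER, LAND = 0, 1

# Hypothetical stack of 3 tiny scenes with shape (time, height, width).
scenes = np.array([
    [[0, 0, 1], [0, 1, 1]],
    [[0, 0, 1], [0, 1, 1]],
    [[0, 1, 1], [0, 1, 1]],
], dtype=float)

# Hypothetical cloud mask: True where a pixel was hidden by cloud.
cloud = np.zeros(scenes.shape, dtype=bool)
cloud[1, 0, 2] = True
cloud[2, 1, 0] = True

# One row per acquisition date, one column per pixel; clouds become missing.
n_dates = scenes.shape[0]
M = scenes.reshape(n_dates, -1).copy()
observed = ~cloud.reshape(n_dates, -1)
M[~observed] = np.nan

def to_labels(scores, threshold=0.5):
    """Map real-valued scores back to the two categories."""
    return np.where(scores >= threshold, LAND, WATER)

# Stand-in predictor: the average of each pixel over its cloud-free dates.
pixel_score = np.nanmean(M, axis=0)

# Keep observed values and fill only the cloud-covered entries.
filled = np.where(observed, M, to_labels(pixel_score))
print(filled)
```

A factorization model would replace `pixel_score` with its own real-valued reconstruction; the thresholding back to categories would stay the same under this encoding.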

The researchers trained their Funk-SVD cloud-filling model on a library of 258 images taken by Landsat missions over the Delaware Bay. Their technique outperformed the most widely used cloud-filling tool, DINEOF (Data-Interpolating Empirical Orthogonal Functions), and was comparable in accuracy to another popular machine-learning-powered tool, Datawig. While Datawig requires a lot of processing power and can take days, the research team’s method took only 30 seconds.


DINEOF recovered the lost data at low blocking rates (the share of data withheld from the algorithm), but its accuracy decreased quickly as the blocking rate increased. In contrast, Datawig could recover the data even at high blocking rates, although the recovered landscape was noisy. The novel implementation of the Funk-SVD approach performed well: the recovered landscape matches the ground truth (blocking rate of 0) and retains the natural landscape’s cohesion. The Funk-SVD method outperforms the deep learning method and significantly outperforms the standard DINEOF method. Fig. 4a compares the F1 scores of the different approaches using randomly selected ground truth data.

Furthermore, Funk-SVD could achieve good filling accuracy even at high blocking rates, whereas DINEOF’s accuracy depends on cloud cover patterns. The error bars depict the range across five repeated numerical experiments. Although the F1 scores of the deep learning and Funk-SVD methods are similar, the Funk-SVD technique required significantly less computing time (Fig. 4b): about 30 seconds, whereas the deep learning method requires considerable training that can take up to several days. The findings suggest that Funk-SVD is a new and dependable way of filling cloud data accurately and efficiently, with the potential to improve current practice.
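The blocking-rate experiments and F1 scores discussed above can be sketched as follows. This is a hedged illustration of the evaluation idea only: the helper names `block_random` and `f1_binary`, the choice of land as the positive class, and the toy labels are assumptions, and the paper may block pixels and aggregate scores differently.

```python
import numpy as np

def block_random(n_pixels, rate, rng):
    """Return a boolean mask hiding a fixed fraction `rate` of the pixels."""
    n_blocked = int(round(rate * n_pixels))
    mask = np.zeros(n_pixels, dtype=bool)
    mask[rng.choice(n_pixels, size=n_blocked, replace=False)] = True
    return mask

def f1_binary(y_true, y_pred, positive=1):
    """F1 score for one class: harmonic mean of precision and recall."""
    tp = np.sum((y_true == positive) & (y_pred == positive))
    fp = np.sum((y_true != positive) & (y_pred == positive))
    fn = np.sum((y_true == positive) & (y_pred != positive))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

rng = np.random.default_rng(0)
mask = block_random(200, rate=0.3, rng=rng)   # hide 30% of 200 pixels

# Toy ground truth and filled values on five blocked pixels (1 = land).
y_true = np.array([1, 1, 0, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1])
score = f1_binary(y_true, y_pred)
print("blocked pixels:", mask.sum(), "F1:", score)
```

In an experiment of this kind, the score would be computed only on the blocked pixels, since those are the ones the filling method had to guess.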


Error Analysis

The error propagation analysis shows that DINEOF cannot guarantee that the error shrinks over its iterations, so the result may not converge to the true value. The other two methods are alike in that they minimize the difference between the reconstruction and the original matrix, which restricts the growth of the error. The error propagation framework also offers a tool for developing new algorithms that improve mode decomposition-based methods.
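For readers unfamiliar with the iterative operations mentioned above, the loop below sketches the general idea behind DINEOF-style filling: guess the missing values, take a truncated SVD, overwrite the guesses with the reconstruction, and repeat. It is a simplified illustration, not the reference DINEOF implementation (which, among other things, works on anomalies and selects the number of modes by cross-validation); the function name `iterative_svd_fill`, the fixed `n_modes`, and the toy matrix are assumptions.

```python
import numpy as np

def iterative_svd_fill(X, missing, n_modes=1, n_iter=100):
    """Fill missing entries by repeated truncated-SVD reconstruction."""
    filled = X.copy()
    filled[missing] = X[~missing].mean()          # initial guess: observed mean
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        recon = (U[:, :n_modes] * s[:n_modes]) @ Vt[:n_modes]
        filled[missing] = recon[missing]          # observed entries stay fixed
    return filled

# Toy rank-1 data matrix with three entries hidden.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
b = np.array([1.0, 0.5, 2.0, 1.5, 1.0])
truth = np.outer(a, b)
missing = np.zeros(truth.shape, dtype=bool)
missing[0, 0] = missing[2, 3] = missing[5, 1] = True
X = truth.copy()
X[missing] = np.nan

filled = iterative_svd_fill(X, missing)
start_err = np.abs(X[~missing].mean() - truth[missing]).max()
final_err = np.abs(filled - truth)[missing].max()
print("max error before:", start_err, "after:", final_err)
```

Each pass feeds the previous guesses back into the decomposition, which is why errors in the guesses can carry over from one iteration to the next.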

Using temporal and spatial information to inform data imputation is another possible way to improve existing methods. In this numerical experiment, each image was rearranged into a vector, and the vectors were stacked into a matrix. This rearrangement may lose part of the original temporal-spatial relationship: horizontally neighboring pixels stay adjacent, but vertically neighboring pixels are separated. Some approaches may still recover this vertical link in the latent space, but deep learning methods that ignore pixel position may lose it. Furthermore, while the 16-day revisit interval may not capture short-term processes like tidal cycles, it can still record long-term processes like seasonal and annual variations in earth surface dynamics. Such dynamic data could be used in emerging mode decomposition approaches like Dynamic Mode Decomposition, which has proved helpful in coastal process analysis and has the potential to improve existing methods.
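The effect of the rearrangement described above is easy to see on a small example. The sketch assumes row-major flattening (NumPy’s default), which is consistent with horizontal neighbors staying adjacent; the image size and the helper `flat_index` are hypothetical.

```python
import numpy as np

H, W = 3, 4                                   # hypothetical image size
image = np.arange(H * W).reshape(H, W)        # each value equals its flat index
flat = image.reshape(-1)                      # row-major flattening into a vector

def flat_index(r, c, width=W):
    """Position of pixel (r, c) in the flattened vector."""
    return r * width + c

# Horizontal neighbors remain next to each other in the vector...
horizontal_gap = flat_index(1, 2) - flat_index(1, 1)
# ...while vertical neighbors end up a full image width apart.
vertical_gap = flat_index(2, 1) - flat_index(1, 1)

# Stacking several flattened frames gives a (dates x pixels) matrix.
frames = np.stack([image, image + 100, image + 200])
matrix = frames.reshape(frames.shape[0], -1)
print(horizontal_gap, vertical_gap, matrix.shape)
```

For a real scene the width is in the hundreds or thousands of pixels, so vertically adjacent pixels land far apart in the vector even though they are neighbors on the ground.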


The research group says the approach can be used for long-term Earth observation. For example, it could be used to map the rate of urbanization across large areas or to quantify crop productivity, and to do so more quickly and cheaply than traditional approaches. The study found that Funk-SVD was the most accurate of the three evaluated methods, with a computational speed comparable to DINEOF and significantly faster than Datawig. Datawig’s deep learning algorithm could achieve similar accuracy, but it takes about two orders of magnitude longer to train than Funk-SVD. The Funk-SVD algorithm can also recover sparse datasets, a common scenario in recommender systems.

Source: https://techxplore.com/news/2022-05-algorithms-power-amazon-netflix-satellite.html

Paper: https://www.sciencedirect.com/science/article/pii/S0303243422000964?via%3Dihub