What are hyperspectral images?
Hyperspectral images (HSIs) are optical remote sensing images with high spectral resolution. They have attracted much attention recently because of their unique properties and the large amount of information they contain. Recently developed deep learning (DL) methods have been applied successfully to HSI classification, achieving higher accuracy than traditional methods.
Earlier methods and their challenges
The earliest DL-based HSI classification methods were built on fully connected neural networks, such as stacked autoencoders (SAEs) and recursive autoencoders (RAEs). Because these networks could only handle one-dimensional vectors, they destroyed the spatial structure information of an HSI.
The main challenge in this field is constructing a robust, high-performance DL model with low computational cost from small, imbalanced datasets. The labeled pixels available for HSI classification are limited because they are difficult to collect and expensive to label. In addition, the distribution of categories in the labeled data is imbalanced.
Convolutional neural networks (CNNs) were introduced to preserve the spatial structure that fully connected networks destroy. However, the 2D CNN-based architectures that were proposed, including R-VCANet and Bayesian 2D CNN, were inefficient: an HSI has so many channels that the 2D convolution kernels become very deep, and the number of parameters increases remarkably.
Therefore, 3D deep learning models were proposed for HSI classification, with the aim of processing spectral and spatial information simultaneously while keeping the computational cost (measured in floating-point operations, FLOPs) low. In practice, however, a 3D CNN has to traverse the depth (spectral) dimension and lacks a global view of spectral information, so it incurs more computation even though it reduces the number of parameters. Although the DL models mentioned above perform well, they are not ideal because they have a high computational cost and require many training samples and network parameters.
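The trade-off described above can be checked with simple arithmetic. The following is a minimal sketch, not taken from any of the cited models: the patch size, band count, kernel size, and number of output feature maps are hypothetical, biases are ignored, and the 3D convolution is assumed to use stride 1 with "same" padding along the spectral axis.

```python
def conv2d_cost(bands, out_channels, k, height, width):
    """Parameters and multiply-accumulates (MACs) of a 2D convolution that
    treats every spectral band as an input channel (bias ignored)."""
    params = k * k * bands * out_channels
    macs = params * height * width  # the full kernel is applied at every pixel
    return params, macs


def conv3d_cost(bands, out_channels, k, height, width):
    """Same quantities for a 3D convolution with one input channel whose
    k x k x k kernel also slides along the spectral axis
    (assumed: stride 1, 'same' padding)."""
    params = k * k * k * out_channels
    macs = params * height * width * bands  # the kernel visits every band position
    return params, macs


# Hypothetical input: a 9 x 9 pixel patch with 200 bands, 24 output feature maps.
p2, m2 = conv2d_cost(bands=200, out_channels=24, k=3, height=9, width=9)
p3, m3 = conv3d_cost(bands=200, out_channels=24, k=3, height=9, width=9)
print(f"2D conv: {p2} parameters, {m2} MACs")
print(f"3D conv: {p3} parameters, {m3} MACs")
```

Under these assumptions the 3D kernel needs far fewer weights than the 2D kernel, but it performs more multiply-accumulates because it is applied at every position along the spectral axis.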
LiteDenseNet, an earlier lightweight network for HSI classification, uses group convolution and a 3D two-way dense structure, which reduces the number of parameters and the computational cost. However, group convolution may reduce accuracy because it cuts off the connections between channels in different groups. LiteDepthwiseNet is an extremely lightweight neural network proposed for HSI classification that addresses this limitation.
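To make the group convolution trade-off concrete, here is a small illustrative sketch. The function names and channel counts are hypothetical, biases are ignored, and the grouping follows the common convention in which consecutive channels are assigned to the same group.

```python
def grouped_conv3d_params(in_channels, out_channels, k, groups):
    """Number of weights in a 3D convolution split into `groups` (bias ignored)."""
    assert in_channels % groups == 0 and out_channels % groups == 0
    return (in_channels // groups) * out_channels * k ** 3


def inputs_seen_by(out_index, in_channels, out_channels, groups):
    """Indices of the input channels that feed one output channel."""
    per_group_in = in_channels // groups
    per_group_out = out_channels // groups
    group = out_index // per_group_out
    return list(range(group * per_group_in, (group + 1) * per_group_in))


# Hypothetical layer: 12 input channels, 12 output channels, 3 x 3 x 3 kernels.
dense = grouped_conv3d_params(12, 12, k=3, groups=1)
grouped = grouped_conv3d_params(12, 12, k=3, groups=3)
print(dense, grouped)

# With 3 groups, each output channel only sees a third of the input channels.
print(inputs_seen_by(0, 12, 12, groups=3))
print(inputs_seen_by(11, 12, 12, groups=3))
```

Splitting the layer into three groups cuts the weight count to a third, but no output channel receives information from input channels outside its own group, which is the loss of cross-channel connection described above.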
Its main advantages over traditional 3D CNN architectures are summarized as follows.
1) LiteDepthwiseNet involves a minimal number of parameters and FLOPs. In this new network architecture, a 3D depthwise convolution replaces the group convolution. The 3D depthwise convolution includes a pointwise convolution that connects all hyperspectral channels, so the network has a full-channel receptive field, which makes it more suitable for HSI classification.
2) Focal loss (FL) replaces the mainstream cross-entropy loss (CEL) as the loss function. This improves the model's performance because FL increases the attention paid to small-sample categories.
3) To enhance the model’s linearity and reduce the overfitting caused by a limited number of training samples, the middle activation and normalization layers in the original 3D depthwise convolution network are removed. This also reduces the number of parameters and the computational cost.
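Points 1) and 2) can be illustrated with a short sketch. It is not the LiteDepthwiseNet implementation: the channel counts are hypothetical, biases are ignored, and the focal loss uses the standard form FL(p) = -alpha * (1 - p)^gamma * log(p) with gamma = 2 and alpha = 1 assumed for illustration (the values used by the authors are not stated here).

```python
import math


def standard_conv3d_params(c_in, c_out, k):
    """Weights of an ordinary 3D convolution (bias ignored)."""
    return c_in * c_out * k ** 3


def depthwise_separable_conv3d_params(c_in, c_out, k):
    """Weights of a depthwise convolution followed by a pointwise convolution."""
    depthwise = c_in * k ** 3   # one k x k x k filter per input channel
    pointwise = c_in * c_out    # 1 x 1 x 1 convolution that mixes all channels
    return depthwise + pointwise


def cross_entropy(p_true):
    """Cross-entropy loss for one sample, given the probability of its true class."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample; gamma and alpha are assumed values."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


# Hypothetical layer: 24 input channels, 24 output channels, 3 x 3 x 3 kernels.
print(standard_conv3d_params(24, 24, 3))
print(depthwise_separable_conv3d_params(24, 24, 3))

# A well-classified sample (p = 0.9) versus a hard sample (p = 0.1).
for p in (0.9, 0.1):
    print(p, round(cross_entropy(p), 4), round(focal_loss(p), 4))
```

The pointwise step is what reconnects all channels, while the depthwise step keeps the weight count low. In the loss comparison, the focal loss shrinks the contribution of the well-classified sample much more than that of the hard sample, so hard samples, which often come from small-sample categories, dominate the gradient.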
A structural comparison between group convolution, 3D depthwise convolution and modified 3D depthwise convolution is as follows: