DEEP LEARNING FOR HYPERSPECTRAL IMAGE CLASSIFICATION

Ahmad, Muhammad

Hyperspectral Imaging (HSI) has been used extensively in many real-life applications because of the detailed spectral information contained in each pixel. However, the complex characteristics of HSI data, i.e., the nonlinear relation between the captured spectral information and the corresponding object, make accurate classification challenging for traditional methods. In the last few years, Deep Learning (DL) has been established as a powerful feature extractor that effectively addresses the nonlinear problems that appear in a number of computer vision tasks. This has prompted the deployment of DL for HSI Classification (HSIC), where it has shown good performance. With these issues in mind, this thesis first presents a systematic overview of DL for HSIC and compares state-of-the-art strategies on the topic. It summarizes the main challenges of traditional machine learning for HSIC and then introduces the advantages of DL in addressing them. The state-of-the-art DL frameworks are broken down into those based on spectral features, spatial features, and joint spatial-spectral features, in order to systematically analyze their achievements and future directions. This thesis also investigates the behavior and performance, in terms of computational cost and classification accuracy, of the most widely used classification algorithms under different experimental setups. In a nutshell, the following specific contributions are made in this thesis:
1. A fast and compact 3D CNN that uses joint spatial-spectral feature maps to improve the performance of HSIC.
2. 3D CNNs are computationally expensive, and a 2D CNN alone cannot efficiently extract discriminative spectral-spatial features. To overcome these challenges, a compact hybrid CNN model is presented that distributes spatial-spectral feature extraction across 3D and 2D layers.
3. CNNs are known to be effective in exploiting joint spatial-spectral information, but at the expense of lower generalization performance and learning speed, due to the hard labels and the non-uniform distribution over labels. Several regularization techniques, such as dropout and L1 and L2 regularization, have been used to overcome these issues. However, models sometimes learn to predict samples with extreme confidence, which harms generalization. Therefore, this thesis proposes enhancing the generalization performance of a hybrid CNN for HSIC using soft labels that are a weighted average of the hard labels and a uniform distribution over the ground-truth labels. The proposed method helps to prevent the CNN from becoming over-confident.
4. DL usually requires a large number of labeled training samples, which is rarely realistic in practice. Thus, a fully automatic spatial-spectral approach is proposed for selecting the most informative and heterogeneous training samples, using a novel Spectral Angle Mapper (SAM) based objective function to compute attribute profiles in a computationally efficient fashion.
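
Two of the quantities named in contributions 3 and 4 have standard textbook forms, which can be sketched as follows. This is a minimal illustration and not the implementation used in the thesis. The function names `smooth_labels` and `spectral_angle` and the weight `epsilon=0.1` are assumptions made for this example. The second function computes only the conventional spectral angle between two spectra; the novel SAM-based objective function itself is not specified in the abstract and is not reproduced here.

```python
import math


def smooth_labels(class_index, num_classes, epsilon=0.1):
    """Return a soft label: a weighted average of the one-hot (hard) label
    and a uniform distribution over the classes.

    epsilon is the weight given to the uniform part (assumed value).
    """
    if not 0 <= class_index < num_classes:
        raise ValueError("class_index must be in [0, num_classes)")
    uniform = 1.0 / num_classes
    return [
        (1.0 - epsilon) * (1.0 if k == class_index else 0.0) + epsilon * uniform
        for k in range(num_classes)
    ]


def spectral_angle(x, y):
    """Return the angle in radians between two spectra of equal length.

    This is the conventional spectral angle; a smaller angle means the two
    spectra have a more similar shape, regardless of overall brightness.
    """
    if len(x) != len(y):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("spectra must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cosine)


if __name__ == "__main__":
    # Hard label "class 2 of 4" becomes a soft label that still sums to 1.
    print(smooth_labels(2, 4, epsilon=0.1))
    # A spectrum and a brighter copy of itself have an angle close to zero.
    print(spectral_angle([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

With four classes and `epsilon=0.1`, the true class receives a target of about 0.925 and each other class about 0.025, so the network is never trained toward a probability of exactly 1. The spectral angle is insensitive to a uniform scaling of the spectrum, which is why it is commonly used to compare pixels under varying illumination.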