Loading...
Thumbnail Image
Publication

Object Detection and Classification based on Hierarchical Semantic Features and Deep Neural Networks

Ma, Wenchi
Citations
Altmetric:
Abstract
The abilities of feature learning, semantic understanding, cognitive reasoning, and model generalization are the consistent pursuit for current deep learning-based computer vision tasks. A variety of network structures and algorithms have been proposed to learn effective features, extract contextual and semantic information, deduct the relationships between objects and scenes, and achieve robust and generalized model.Nevertheless, these challenges are still not well addressed. One issue lies in the inefficient feature learning and propagation, static single-dimension semantic memorizing, leading to the difficulty of handling challenging situations, such as small objects, occlusion, illumination, etc. The other issue is the robustness and generalization, especially when the data source has diversified feature distribution. The study aims to explore classification and detection models based on hierarchical semantic features ("transverse semantic" and "longitudinal semantic"), network architectures, and regularization algorithm, so that the above issues could be improved or solved. (1) A detector model is proposed to make full use of "transverse semantic", the semantic information in space scene, which emphasizes on the effectiveness of deep features produced in high-level layers for better detection of small and occluded objects. (2) We also explore the anchor-based detector algorithm and propose the location-aware reasoning (LAAR), where both the location and classification confidences is considered for the bounding box quality criterion, so that the bestqualified boxes can be picked up in Non-Maximum Suppression (NMS). (3) A semantic clustering-based deduction learning is proposed, which explores the "longitudinal semantic", realizing the high-level clustering in the semantic space, enabling the model to deduce the relations among various classes so as better classification performance is expected. (4) We propose the near-orthogonality regularization by introducing an implicit self-regularization to push the mean and variance of filter angles in a network towards 90â—¦ and 0â—¦ simultaneously, revealing it helps stabilize the training process, speed up convergence and improve robustness. (5) Inspired by the research that self attention networks possess a strong inductive bias which leads to the loss of feature expression power, the transformer architecture with mitigatory attention mechanism is proposed and applied with the state-of-the-art detectors, verifying the superiority of detection enhancement.
Description
Date
2021-12-31
Journal Title
Journal ISSN
Volume Title
Publisher
University of Kansas
Collections
Research Projects
Organizational Units
Journal Issue
Keywords
Computer science,
Citation
DOI
Embedded videos