Show simple item record

dc.contributor.advisor  Wang, Guanghui
dc.contributor.author  Wu, Yuanwei
dc.date.accessioned  2020-05-19T15:12:25Z
dc.date.available  2020-05-19T15:12:25Z
dc.date.issued  2019-12-31
dc.date.submitted  2019
dc.identifier.other  http://dissertations.umi.com/ku:16905
dc.identifier.uri  http://hdl.handle.net/1808/30365
dc.description.abstract  Deep learning (DL) has dramatically improved state-of-the-art performance in broad applications of computer vision, such as image recognition, object detection, segmentation, and point cloud analysis. However, the reasons for this empirical success remain theoretically elusive. In this dissertation, to understand DL and improve its efficiency, robustness, and interpretability, we theoretically investigate optimization algorithms for training deep models and empirically explore deep learning based point cloud analysis and image classification.

1) Optimization for training deep models: Neural network training is one of the most difficult optimization problems in DL. Understanding global optimality in DL has recently attracted increasing attention; however, conventional DL solvers were not designed to seek such global optimality. We propose a novel approximation algorithm, BPGrad, for optimizing deep models globally via branch and pruning. BPGrad rests on the assumption of Lipschitz continuity in DL and can therefore adaptively determine the step size for the current gradient given the history of previous updates, such that, in theory, no smaller step can reach the global optimum. We also develop an efficient BPGrad-based solver, which outperforms conventional DL solvers such as Adagrad, Adadelta, RMSProp, and Adam on object recognition, detection, and segmentation tasks.

2) Deep learning based point cloud analysis and image classification: The network architecture is of central importance for many visual recognition tasks. In this dissertation, we focus on the emerging fields of point cloud analysis and image classification.
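The Lipschitz-based step-size rule described in part 1 above can be illustrated in a toy setting. This is only a sketch of the idea, not the dissertation's BPGrad algorithm: the objective, the Lipschitz constant `L_CONST`, and the lower bound `F_LOWER` are all hypothetical choices made for illustration.

```python
import numpy as np

# Toy 1-D objective (hypothetical, not from the dissertation):
# f(x) = (x - 3)^2 + 1, with global minimum value 1 at x = 3.
def f(x):
    return (x - 3.0) ** 2 + 1.0

def grad(x):
    return 2.0 * (x - 3.0)

L_CONST = 10.0  # assumed Lipschitz constant of f on the interval [-2, 8]
F_LOWER = 1.0   # assumed lower bound on the global minimum value

def lipschitz_step(x):
    # Lipschitz continuity gives |f(x) - f(y)| <= L_CONST * |x - y|, so no
    # point within distance (f(x) - F_LOWER) / L_CONST of x can attain
    # F_LOWER. Stepping exactly that far along the negative gradient
    # direction can therefore never jump past a global minimizer.
    eta = (f(x) - F_LOWER) / L_CONST
    return x - eta * np.sign(grad(x))

x = -2.0
for _ in range(100):
    x = lipschitz_step(x)
# x approaches the minimizer 3 from below, without ever overshooting it
```

Because the step length shrinks with the gap f(x) - F_LOWER, the iterate can approach but never cross a point where the lower bound could be attained; this is the sense in which "no smaller steps can achieve the global optimality".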
2.1) Point cloud analysis: Traditional 6D pose estimation approaches cannot address the setting where neither a CAD model of the object nor ground-truth 6D poses of its instances are available during training. We propose a novel unsupervised approach that jointly learns the 3D object model and estimates the 6D poses of multiple instances of the same object in a single end-to-end deep neural network, with applications to depth-based instance segmentation. The inputs are depth images, and the learned object model is represented by a 3D point cloud. Specifically, the network produces a 3D object model and a list of rigid transformations applied to this model to generate instances, which, when rendered, must match the observed point cloud by minimizing the Chamfer distance. To render the set of instance point clouds with occlusions, the network automatically removes points occluded in a given camera view. Extensive experiments evaluate the technique on several object models and varying numbers of instances in 3D point clouds. Compared with popular instance segmentation baselines, our model not only achieves competitive performance but also learns a 3D object model represented as a 3D point cloud.

2.2) Low-quality image classification: We propose a simple yet effective unsupervised deep feature transfer network to address the performance degradation of state-of-the-art classification algorithms on low-quality images. No fine-tuning is required. We use a pre-trained deep model to extract features for both high-resolution (HR) and low-resolution (LR) images and feed them into a multilayer feature transfer network for knowledge transfer. An SVM classifier is then learned directly on the transferred low-resolution features. The network can be embedded into state-of-the-art network models as a plug-in feature enhancement module.
It preserves the data structure in feature space for HR images and transfers the discriminative features from a well-structured source domain (HR feature space) to a less-structured target domain (LR feature space). Extensive experiments show that the proposed transfer network achieves significant improvements over the baseline method.
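The Chamfer distance that drives the rendering loss in 2.1 above can be sketched as follows. This is a minimal numpy implementation of the standard symmetric (squared) Chamfer distance, with hypothetical toy point clouds; the dissertation's network computes this between rendered instances and the observed depth point cloud.

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point clouds p (n, 3) and q (m, 3):
    for each point, the squared distance to its nearest neighbor in the other
    cloud, averaged over each cloud, then summed over both directions."""
    sq = ((p[:, None, :] - q[None, :, :]) ** 2).sum(axis=-1)  # (n, m) pairwise
    return sq.min(axis=1).mean() + sq.min(axis=0).mean()

# Toy data (hypothetical): a tiny "model" cloud and an observed copy shifted
# by 0.1 along each axis, standing in for a rendered instance vs. the input.
model = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
observed = model + 0.1

zero = chamfer_distance(model, model)      # identical clouds -> 0.0
shift = chamfer_distance(model, observed)  # 0.03 in each direction -> 0.06
```

The distance is zero exactly when every point has a coincident nearest neighbor in the other cloud, and it is differentiable almost everywhere in the point coordinates, which is why it is a common training loss for point-cloud generation.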
dc.format.extent  124 pages
dc.language.iso  en
dc.publisher  University of Kansas
dc.rights  Copyright held by the author.
dc.subject  Artificial intelligence
dc.subject  Computer science
dc.subject  Computer Vision
dc.subject  Deep Learning
dc.subject  Image Classification
dc.subject  Optimization Algorithm
dc.subject  Point Cloud Analysis
dc.title  Optimization for Training Deep Models and Deep Learning Based Point Cloud Analysis and Image Classification
dc.type  Dissertation
dc.contributor.cmtemember  Kim, Taejoon
dc.contributor.cmtemember  Luo, Bo
dc.contributor.cmtemember  Yun, Heechul
dc.contributor.cmtemember  Chao, Haiyang
dc.thesis.degreeDiscipline  Electrical Engineering & Computer Science
dc.thesis.degreeLevel  Ph.D.
dc.identifier.orcid  https://orcid.org/0000-0002-0385-8016
dc.rights.accessrights  openAccess

