Show simple item record

dc.contributor.advisor  Wang, Guanghui
dc.contributor.author  Wu, Yuanwei
dc.date.accessioned  2020-05-19T15:12:25Z
dc.date.available  2020-05-19T15:12:25Z
dc.date.issued  2019-12-31
dc.date.submitted  2019
dc.identifier.other  http://dissertations.umi.com/ku:16905
dc.identifier.uri  http://hdl.handle.net/1808/30365
dc.description.abstract  Deep learning (DL) has dramatically improved state-of-the-art performance in broad applications of computer vision, such as image recognition, object detection, segmentation, and point cloud analysis. However, the reasons for this empirical success remain theoretically elusive. In this dissertation, to understand DL and improve its efficiency, robustness, and interpretability, we theoretically investigate optimization algorithms for training deep models and empirically explore deep learning based point cloud analysis and image classification.

1) Optimization for training deep models: Neural network training is one of the most difficult optimization problems in DL. Understanding global optimality in DL has recently attracted increasing attention; however, conventional DL solvers were not designed to seek such global optimality. We propose a novel approximation algorithm, BPGrad, for optimizing deep models globally via branch and pruning. BPGrad rests on the assumption of Lipschitz continuity in DL and can therefore adaptively determine the step size for the current gradient given the history of previous updates, such that, in theory, no smaller step can reach the global optimum. We also develop an efficient BPGrad-based solver, which outperforms conventional DL solvers such as Adagrad, Adadelta, RMSProp, and Adam on object recognition, detection, and segmentation tasks.

2) Deep learning based point cloud analysis and image classification: The network architecture is of central importance for many visual recognition tasks. In this dissertation, we focus on the emerging fields of point cloud analysis and image classification.
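The Lipschitz-based step-size rule described in part 1 above can be illustrated in a toy setting. This is only a sketch of the idea, not the dissertation's BPGrad algorithm: the objective, the Lipschitz constant `L_CONST`, and the lower bound `F_LOWER` are all hypothetical choices made for illustration.

```python
import numpy as np

# Toy 1-D objective (hypothetical, not from the dissertation):
# f(x) = (x - 3)^2 + 1, with global minimum value 1 at x = 3.
def f(x):
    return (x - 3.0) ** 2 + 1.0

def grad(x):
    return 2.0 * (x - 3.0)

L_CONST = 10.0  # assumed Lipschitz constant of f on the interval [-2, 8]
F_LOWER = 1.0   # assumed lower bound on the global minimum value

def lipschitz_step(x):
    # Lipschitz continuity gives |f(x) - f(y)| <= L_CONST * |x - y|, so no
    # point within distance (f(x) - F_LOWER) / L_CONST of x can attain
    # F_LOWER. Stepping exactly that far along the negative gradient
    # direction can therefore never jump past a global minimizer.
    eta = (f(x) - F_LOWER) / L_CONST
    return x - eta * np.sign(grad(x))

x = -2.0
for _ in range(100):
    x = lipschitz_step(x)
# x approaches the minimizer 3 from below, without ever overshooting it
```

Because the step length shrinks with the gap f(x) - F_LOWER, the iterate can approach but never cross a point where the lower bound could be attained; this is the sense in which "no smaller steps can achieve the global optimality".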
2.1) Point cloud analysis: Traditional 6D pose estimation approaches cannot address the setting where neither a CAD model of the object nor ground-truth 6D poses of its instances are available during training. We propose a novel unsupervised approach that jointly learns the 3D object model and estimates the 6D poses of multiple instances of the same object in a single end-to-end deep neural network, with applications to depth-based instance segmentation. The inputs are depth images, and the learned object model is represented by a 3D point cloud. Specifically, the network produces a 3D object model and a list of rigid transformations applied to this model to generate instances, which, when rendered, must match the observed point cloud by minimizing the Chamfer distance. To render the set of instance point clouds with occlusions, the network automatically removes points occluded in a given camera view. Extensive experiments evaluate the technique on several object models and varying numbers of instances in 3D point clouds. Compared with popular instance segmentation baselines, our model not only achieves competitive performance but also learns a 3D object model represented as a 3D point cloud.

2.2) Low-quality image classification: We propose a simple yet effective unsupervised deep feature transfer network to address the performance degradation of state-of-the-art classification algorithms on low-quality images. No fine-tuning is required. We use a pre-trained deep model to extract features for both high-resolution (HR) and low-resolution (LR) images and feed them into a multilayer feature transfer network for knowledge transfer. An SVM classifier is then learned directly on the transferred low-resolution features. The network can be embedded into state-of-the-art network models as a plug-in feature enhancement module.
It preserves the data structure in feature space for HR images and transfers the discriminative features from a well-structured source domain (HR feature space) to a less-structured target domain (LR feature space). Extensive experiments show that the proposed transfer network achieves significant improvements over the baseline method.
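The Chamfer distance that drives the rendering loss in 2.1 above can be sketched as follows. This is a minimal numpy implementation of the standard symmetric (squared) Chamfer distance, with hypothetical toy point clouds; the dissertation's network computes this between rendered instances and the observed depth point cloud.

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point clouds p (n, 3) and q (m, 3):
    for each point, the squared distance to its nearest neighbor in the other
    cloud, averaged over each cloud, then summed over both directions."""
    sq = ((p[:, None, :] - q[None, :, :]) ** 2).sum(axis=-1)  # (n, m) pairwise
    return sq.min(axis=1).mean() + sq.min(axis=0).mean()

# Toy data (hypothetical): a tiny "model" cloud and an observed copy shifted
# by 0.1 along each axis, standing in for a rendered instance vs. the input.
model = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
observed = model + 0.1

zero = chamfer_distance(model, model)      # identical clouds -> 0.0
shift = chamfer_distance(model, observed)  # 0.03 in each direction -> 0.06
```

The distance is zero exactly when every point has a coincident nearest neighbor in the other cloud, and it is differentiable almost everywhere in the point coordinates, which is why it is a common training loss for point-cloud generation.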
dc.format.extent  124 pages
dc.language.iso  en
dc.publisher  University of Kansas
dc.rights  Copyright held by the author.
dc.subject  Artificial intelligence
dc.subject  Computer science
dc.subject  Computer Vision
dc.subject  Deep Learning
dc.subject  Image Classification
dc.subject  Optimization Algorithm
dc.subject  Point Cloud Analysis
dc.title  Optimization for Training Deep Models and Deep Learning Based Point Cloud Analysis and Image Classification
dc.type  Dissertation
dc.contributor.cmtemember  Kim, Taejoon
dc.contributor.cmtemember  Luo, Bo
dc.contributor.cmtemember  Yun, Heechul
dc.contributor.cmtemember  Chao, Haiyang
dc.thesis.degreeDiscipline  Electrical Engineering & Computer Science
dc.thesis.degreeLevel  Ph.D.
dc.identifier.orcid  https://orcid.org/0000-0002-0385-8016
dc.rights.accessrights  openAccess

