Acceleration Techniques for Industrial Large Eddy Simulation with High-Order Methods on CPU-GPU Clusters

Jourdan de Araujo Jorge Filho, Eduardo

Issue Date

2021-05-31

Author

Jourdan de Araujo Jorge Filho, Eduardo

Publisher

University of Kansas

Format

147 pages

Type

Dissertation

Degree Level

Ph.D.

Discipline

Aerospace Engineering

Abstract

One of the key findings of NASA's 2030 CFD Vision document is that the use of CFD in the aerospace design process is severely limited by the inability to accurately and reliably predict turbulent flows with significant regions of separation. Scale-resolving simulations such as large eddy simulation (LES) are increasingly applied to more complex problems, such as flow over high-lift configurations and through aircraft engines. The overall objective of the present work is to reduce the computational cost of industrial LES. The high-order flux reconstruction (FR) method is used as the spatial discretization scheme. First, two acceleration techniques are investigated: the p-multigrid algorithm and low Mach number preconditioning. The Weiss and Smith low Mach number preconditioner is used together with the p-multigrid method, and the third-order explicit Runge-Kutta (RK3) scheme is used as the smoother to reduce memory requirements. Mach number preconditioning significantly increased the efficiency of the p-multigrid method. For unsteady simulations, the preconditioner improved the efficiency of the p-multigrid method at larger physical time steps. In most steady cases, the preconditioned p-multigrid approach is comparable to or faster than the implicit LU-SGS algorithm and requires less memory, especially for p 2 schemes. The FR method is then implemented efficiently for modern GPU clusters, and the speedup is investigated for different polynomial orders and cell types. Approaches to improve the parallel efficiency of multi-GPU simulations are also studied. The simulation node-hour cost on the Summit supercomputer is reduced by a factor of 50 for hexahedral cells and by up to 200 for tetrahedral cells. Two low-memory implicit time integration methods are implemented on GPUs: the matrix-free GMRES solver and a novel local GMRES-SGS method. Parametric studies are performed to evaluate their performance on LES benchmark cases. On the High-Lift Common Research Model case for the 2021 4th AIAA High-Lift Prediction Workshop, the two GPU implicit time integration methods provide additional speedups of 14 and 68, respectively, over the GPU explicit time simulation.
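
The abstract does not say how the matrix-free GMRES solver is constructed. As a general illustration of why matrix-free Krylov solvers need little memory, the sketch below approximates a Jacobian-vector product with a forward difference of the residual, so the Jacobian matrix is never stored. This is a common textbook approach and is not necessarily the one used in the dissertation. The two-unknown `residual` function, the step size `eps`, and all function names are hypothetical stand-ins chosen for illustration.

```python
import math


def residual(u):
    # Hypothetical nonlinear residual with two unknowns, standing in for
    # the much larger residual of a discretized flow solver.
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]


def jacobian_vector_fd(res, u, v, eps=1e-7):
    """Approximate J(u) @ v as (R(u + h v) - R(u)) / h without forming J."""
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    h = eps / norm_v  # scale the step so the perturbation has size eps
    r0 = res(u)
    r1 = res([ui + h * vi for ui, vi in zip(u, v)])
    return [(a - b) / h for a, b in zip(r1, r0)]


def jacobian_vector_exact(u, v):
    # Analytic Jacobian-vector product for the toy residual, used only to
    # check the finite-difference approximation.
    x, y = u
    return [2.0 * x * v[0] + v[1], v[0] + 2.0 * y * v[1]]


u = [1.0, 2.0]
v = [0.5, -1.0]
approx = jacobian_vector_fd(residual, u, v)
exact = jacobian_vector_exact(u, v)
max_error = max(abs(a - b) for a, b in zip(approx, exact))
```

A Krylov solver such as GMRES only needs products of this kind, so each iteration costs one extra residual evaluation plus storage for a few vectors, rather than storage for the full Jacobian.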

URI

http://hdl.handle.net/1808/32562

