dc.contributor.author: Garg, Saurabh
dc.contributor.author: Hamarneh, Ghassan
dc.contributor.author: Jongman, Allard
dc.contributor.author: Sereno, Joan A.
dc.contributor.author: Wang, Yue
dc.identifier.citation: Garg, S., Hamarneh, G., Jongman, A., Sereno, J. A., & Wang, Y. (2020). ADFAC: Automatic detection of facial articulatory features. MethodsX, 7, 101006.
dc.description: This work is licensed under a Creative Commons Attribution 4.0 International License. [en_US]
dc.description.abstract: Using computer-vision and image processing techniques, we aim to identify specific visual cues as induced by facial movements made during monosyllabic speech production. The method is named ADFAC: Automatic Detection of Facial Articulatory Cues. Four facial points of interest were detected automatically to represent head, eyebrow and lip movements: nose tip (proxy for head movement), medial point of left eyebrow, and midpoints of the upper and lower lips. The detected points were then automatically tracked in the subsequent video frames. Critical features such as the distance, velocity, and acceleration describing local facial movements with respect to the resting face of each speaker were extracted from the positional profiles of each tracked point. In this work, a variant of random forest is proposed to determine which facial features are significant in classifying speech sound categories. The method takes in both video and audio as input and extracts features from any video with a plain or simple background. The method is implemented in MATLAB and scripts are made available on GitHub for easy access.

• Using innovative computer-vision and image processing techniques to automatically detect and track keypoints on the face during speech production in videos, thus allowing more natural articulation than previous sensor-based approaches.

• Measuring multi-dimensional and dynamic facial movements by extracting time-related, distance-related and kinematics-related features in speech production.

• Adopting the novel random forest classification approach to determine and rank the significance of facial features toward accurate speech sound categorization.
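
The abstract describes extracting distance, velocity, and acceleration from the positional profile of each tracked point relative to the speaker's resting face, but this record does not give the formulas. The following is a minimal sketch of one plausible reading, written in Python for illustration rather than in the authors' MATLAB. The function name `kinematic_features`, the use of Euclidean distance, the forward-difference scheme, and the sample coordinates are all assumptions; the published scripts may compute these features differently.

```python
import math


def kinematic_features(track, rest, fps):
    """Distance, velocity and acceleration of one tracked facial point.

    track: list of (x, y) pixel positions, one per video frame
    rest:  (x, y) position of the same point on the resting face
    fps:   video frame rate in frames per second

    Hypothetical helper for illustration; not the authors' implementation.
    """
    # Euclidean displacement from the resting position in each frame (pixels)
    distance = [math.hypot(x - rest[0], y - rest[1]) for x, y in track]
    # Forward differences scaled by the frame rate give pixels/s ...
    velocity = [(b - a) * fps for a, b in zip(distance, distance[1:])]
    # ... and differencing once more gives pixels/s^2
    acceleration = [(b - a) * fps for a, b in zip(velocity, velocity[1:])]
    return distance, velocity, acceleration


# Made-up track of a lower-lip midpoint over four frames at 30 fps;
# the first frame is taken as the resting position (an assumption).
track = [(100.0, 200.0), (100.0, 203.0), (100.0, 208.0), (100.0, 203.0)]
dist, vel, acc = kinematic_features(track, rest=track[0], fps=30.0)
print(dist, vel, acc)
```

Each differencing step shortens the series by one sample, so a track of N frames yields N distances, N-1 velocities, and N-2 accelerations.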
dc.description.sponsorship: Social Sciences and Humanities Research Council of Canada (SSHRC Insight Grant 435–2012–1641) [en_US]
dc.description.sponsorship: Natural Sciences and Engineering Research Council of Canada (NSERC Discovery Grant 2017–05978) [en_US]
dc.rights: © 2020 The Authors. Published by Elsevier B.V. [en_US]
dc.subject: Visual cues [en_US]
dc.subject: Facial movements [en_US]
dc.subject: Discriminative analysis [en_US]
dc.subject: Computer vision [en_US]
dc.subject: Image processing [en_US]
dc.title: ADFAC: Automatic detection of facial articulatory features [en_US]
kusw.kuauthor: Jongman, Allard
kusw.kuauthor: Sereno, Joan A.
kusw.kudepartment: KU Phonetics and Psycholinguistics Lab [en_US]
kusw.oaversion: Scholarly/refereed, publisher version [en_US]
kusw.oapolicy: This item meets KU Open Access policy criteria. [en_US]

Except where otherwise noted, this item's license is described as: © 2020 The Authors. Published by Elsevier B.V.