A team at the Korea Advanced Institute of Science and Technology (KAIST) has developed a technique for facial expression detection by merging near-infrared (NIR) light-field camera imaging techniques with AI technology.
Unlike a conventional camera, the light-field camera contains micro-lens arrays in front of the image sensor, which makes the camera small enough to fit into a smartphone while allowing it to acquire the spatial and directional information of the light with a single shot.
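The capture described above can be illustrated with a short sketch. All geometry here (micro-lens grid size, angular samples per lens) is hypothetical and chosen for illustration, not taken from the KAIST design: the raw sensor frame behind a micro-lens array can be rearranged into a 4D light field indexed by angular position (u, v) and spatial position (s, t), so a single shot yields many sub-aperture views.

```python
import numpy as np

# Hypothetical sensor/micro-lens geometry (illustrative, not KAIST's specs):
# each micro-lens covers a LENS_U x LENS_V patch of pixels, so one raw
# frame encodes both spatial (s, t) and angular (u, v) information.
LENS_U, LENS_V = 5, 5      # angular samples per micro-lens
GRID_S, GRID_T = 60, 80    # number of micro-lenses (spatial samples)

def decode_lightfield(raw):
    """Rearrange a raw sensor image into a 4D light field L[u, v, s, t]."""
    lf = raw.reshape(GRID_S, LENS_U, GRID_T, LENS_V)
    return lf.transpose(1, 3, 0, 2)  # -> (u, v, s, t)

def extract_view(lf, u, v):
    """One sub-aperture view: the scene as seen from angular position (u, v)."""
    return lf[u, v]

# Simulated raw capture
raw = np.random.rand(GRID_S * LENS_U, GRID_T * LENS_V)
lf = decode_lightfield(raw)
center = extract_view(lf, LENS_U // 2, LENS_V // 2)
print(lf.shape, center.shape)  # (5, 5, 60, 80) (60, 80)
```

Each of the 25 sub-aperture views in this toy layout sees the scene from a slightly different angle, which is what enables refocusing and depth estimation from one exposure.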
The technique can reconstruct images in a variety of ways, including multiviews, refocusing, and 3D image acquisition. However, shadows cast by external light sources in the environment, along with optical crosstalk between the micro-lenses, have prevented existing light-field cameras from providing accurate image contrast and 3D reconstruction.
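Of the reconstruction modes mentioned above, refocusing is the simplest to sketch. The shift-and-sum approach below is a standard textbook method, not necessarily the one used in this work: each sub-aperture view is shifted in proportion to its angular offset and the results are averaged, with the shift factor selecting the focal plane. Integer-pixel shifts keep the example minimal.

```python
import numpy as np

def refocus(lf, alpha):
    """Shift-and-sum refocusing over a light field lf of shape (u, v, s, t).

    Each sub-aperture view is shifted in proportion to its angular offset
    from the central view; alpha selects the synthetic focal plane.
    Integer-pixel shifts via np.roll keep this sketch simple.
    """
    U, V, S, T = lf.shape
    out = np.zeros((S, T))
    for u in range(U):
        for v in range(V):
            du = int(round(alpha * (u - U // 2)))
            dv = int(round(alpha * (v - V // 2)))
            out += np.roll(lf[u, v], (du, dv), axis=(0, 1))
    return out / (U * V)

# Toy light field: 5 x 5 angular views, each 60 x 80 pixels
lf = np.random.rand(5, 5, 60, 80)
img = refocus(lf, alpha=1.0)
print(img.shape)  # (60, 80)
```

Sweeping `alpha` produces a focal stack; scene points whose parallax matches the chosen shift add up coherently and appear sharp, while others blur out.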
The joint research teams of Ki-Hun Jeong and Doheon Lee from the KAIST Department of Bio and Brain Engineering applied a vertical-cavity surface-emitting laser (VCSEL) in the NIR range to stabilize the accuracy of 3D image reconstruction, which previously depended on environmental light. With an external light source illuminating a face at 0°, 30°, and 60° angles, the light-field camera reduced image reconstruction errors by 54%.
Additionally, by inserting a light-absorbing layer for visible and NIR wavelengths between the micro-lens arrays, the team minimized optical crosstalk while increasing the image contrast by 2.1 times.
Facial expression reading based on multilayer perceptron classification from 3D depth maps and 2D images obtained by an NIR-based light-field camera. Courtesy of KAIST.
The researchers overcame the limitations of existing light-field cameras and were able to develop an NIR-based light-field camera (NIR-LFC) optimized for 3D image reconstruction of facial expressions. Using the NIR-LFC, the team acquired high-quality 3D reconstructions of facial expressions conveying various emotions, regardless of the lighting conditions of the surrounding environment.
The team then used machine learning to distinguish facial expressions in the acquired 3D images with an average accuracy of 85%, a statistically significant improvement over classification based on 2D images alone. By calculating the interdependency of the distance information that varies with facial expression in the 3D images, the team was also able to identify the information a light-field camera uses to distinguish human expressions.
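The classification pipeline described above can be sketched in outline. Everything here is hypothetical, including the number of landmarks, the feature choice, and the (untrained) network weights; the point is only the shape of the approach, since the paper's own architecture and features are not detailed in this article: pairwise 3D distances between facial landmarks, which vary with expression, are fed to a multilayer perceptron that outputs a probability per expression class.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical landmark setup (illustrative only): facial expressions
# change the 3D distances between landmarks, so pairwise distances
# derived from a depth map make a natural feature vector.
N_LANDMARKS = 10
landmarks = rng.random((N_LANDMARKS, 3))  # (x, y, depth) per landmark

def pairwise_distances(points):
    """Flattened upper triangle of the landmark distance matrix."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    iu = np.triu_indices(len(points), k=1)
    return dist[iu]

# Multilayer perceptron with one hidden layer. Weights are random here;
# in practice they would be learned from labeled expression data.
FEATURES = N_LANDMARKS * (N_LANDMARKS - 1) // 2   # 45 pairwise distances
HIDDEN, CLASSES = 16, 5                           # e.g. 5 expression classes

W1 = rng.standard_normal((FEATURES, HIDDEN))
W2 = rng.standard_normal((HIDDEN, CLASSES))

def mlp_classify(x):
    h = np.maximum(x @ W1, 0.0)          # ReLU hidden layer
    logits = h @ W2
    p = np.exp(logits - logits.max())
    return p / p.sum()                   # softmax over expression classes

probs = mlp_classify(pairwise_distances(landmarks))
print(probs.shape)  # (5,); entries sum to 1
```

Using distances rather than raw coordinates makes the features invariant to head translation, which is one plausible reason depth-derived geometry outperforms flat 2D images for this task.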
Mobile health care, field diagnosis, social cognition, and human-machine interactions are among the numerous potential applications that Jeong identified for the technique.
The research was funded by the Ministry of Science and ICT (South Korea) and the Ministry of Trade, Industry, and Energy (South Korea).
The research was published in Advanced Intelligent Systems (www.doi.org/10.1002/aisy.202100182).