Frontal Face Synthesis from Multiple Low-resolution Images

INV-20067
 
Background
Face-based generative tasks (e.g., face rotation, hallucination, and attribute editing) have gained traction in research communities with advances in deep learning. However, the practical importance of identity preservation is frequently overlooked, which poses significant challenges for face images with large poses and low quality. Some researchers have recently made progress in synthesizing frontal faces under large pose variations, but these works assume high-quality input images. As a result, existing methods suffer from identity information loss when learning the highly non-linear transformation that maps low-resolution (LR) side-views to high-resolution (HR) frontal-views.
Either low input quality or a large pose discrepancy between views makes frontalization a challenging problem, and it is even more difficult to synthesize an accurate frontal face from a single LR image under an extreme pose. A model is therefore needed that accepts either one or multiple inputs and improves with each additional sample.
 
Technology Overview
Northeastern University researchers have invented a novel super-resolution (SR) integrated generative adversarial network (SRGAN), which learns face frontalization and super-resolution collaboratively to synthesize high-quality, identity-preserved frontal faces. The model is built on a generator network consisting of a deep encoder and an SR-integrated decoder. Features extracted by the deep encoder are passed to the decoder for reconstruction. The decoder is specially designed to first super-resolve the side-view images and then use the resulting information to reconstruct HR frontal faces. During training, a three-level loss (i.e., pixel, patch, and global) provides fine-to-coarse supervision for learning a precise non-linear transformation between LR side-view and HR frontal-view faces. Moreover, SRGAN accepts multiple LR profile faces as input by adding an orthogonal constraint in the generator that penalizes redundant latent representations and diversifies the learned feature space. Using these techniques, the invention generates high-quality frontal faces.
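The sketch below (in PyTorch) illustrates one way the described generator and three-level loss could be organized: a deep encoder, an SR-integrated decoder that first super-resolves side-view features and then reconstructs an HR frontal face, and a loss combining pixel, patch, and global terms. The layer counts, channel widths, 4x upscaling factor, hinge-style adversarial terms, and loss weights are illustrative assumptions rather than the researchers' exact configuration.

# Minimal sketch of the generator and three-level loss (assumed architecture).
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Deep encoder: maps an LR side-view face to a latent feature map (spatial size / 4)."""
    def __init__(self, in_ch=3, feat=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, feat, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )

    def forward(self, x):
        return self.net(x)

class SRDecoder(nn.Module):
    """SR-integrated decoder: first super-resolves side-view features to HR space,
    then reconstructs an HR frontal face (overall 4x the LR input size)."""
    def __init__(self, feat=256, out_ch=3):
        super().__init__()
        # Super-resolution stage: upsample latent features 16x (undo the /4, then 4x SR).
        self.sr = nn.Sequential(
            nn.ConvTranspose2d(feat, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        # Frontalization stage: map HR side-view features to a frontal face.
        self.frontal = nn.Sequential(
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, padding=1), nn.Tanh(),
        )

    def forward(self, z):
        return self.frontal(self.sr(z))

def three_level_loss(fake, real, patch_disc, global_disc, w=(1.0, 0.1, 0.1)):
    """Fine-to-coarse supervision: pixel (fine), patch (middle), and global (coarse) terms."""
    pixel_term = F.l1_loss(fake, real)         # pixel-level reconstruction
    patch_term = -patch_disc(fake).mean()      # patch-level adversarial term
    global_term = -global_disc(fake).mean()    # image-level adversarial term
    return w[0] * pixel_term + w[1] * patch_term + w[2] * global_term

# Usage (illustrative):
#   G = nn.Sequential(Encoder(), SRDecoder())
#   hr_frontal = G(lr_side_view)   # (N, 3, 4H, 4W) output from an (N, 3, H, W) input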
 
Benefits
- Generates more realistic faces with stronger identity preservation and better image quality (sharper, clearer details)
- Integrates a super-resolution module into the network to provide fine details of side-views in high-resolution space, helping the model reconstruct high-frequency facial information (e.g., periocular, nose, and mouth regions)
- Introduces a three-level loss (i.e., pixel-, patch-, and global-based) to learn more precise non-linear transformations from low-resolution side-views to high-resolution frontal-views
- Accepts multiple faces as input to further boost the quality of generated faces (see the sketch after this list)
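As a hedged illustration of the orthogonal constraint behind the multiple-input benefit above, the sketch below penalizes the off-diagonal entries of the Gram matrix of the input views' latent codes, so that each added sample contributes non-redundant features. The function name, the Gram-matrix formulation, and the weight lambda_orth are assumptions for illustration only, not the invention's exact loss.

# Sketch of an orthogonality penalty on the latent codes of multiple LR inputs
# (assumed formulation; the invention's exact constraint may differ).
import torch
import torch.nn.functional as F

def orthogonality_penalty(latents):
    """latents: (k, d) tensor of flattened encoder features, one row per input view.
    Penalizes off-diagonal entries of the Gram matrix so that the k latent
    representations stay decorrelated (non-redundant)."""
    z = F.normalize(latents, dim=1)                                # unit-norm rows
    gram = z @ z.t()                                               # (k, k) cosine similarities
    off_diag = gram - torch.eye(gram.size(0), device=gram.device)  # zero the diagonal
    return (off_diag ** 2).sum()

# Usage (illustrative): stack the flattened encoder features of the k LR profile views
# into a (k, d) tensor Z and add `lambda_orth * orthogonality_penalty(Z)` to the
# generator loss during training; lambda_orth is an assumed weighting hyperparameter.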
 
Applications
- Can be applied to generate frontal faces of criminal suspects
- Can be applied in entertainment apps, such as editing the pose of faces in photos
- Can be utilized to improve face recognition in surveillance systems
 
Opportunity
- License
- Partnering
- Research collaboration
 
Patent Information: