
VBMH - Virtual Background Matting for Multi-Human Video Conferencing with Privacy Preservation

Video Matting · Attention Mechanisms · Computer Vision · Deep Learning

In collaboration with Dr. Xin Jiang, this project aims to develop an advanced deep learning-based system for video background matting, focusing on scenarios with multiple human subjects in video conferencing settings. The primary goal is to protect participant privacy by isolating the main speaker and removing other individuals from the output video.

To address this challenge, the system incorporates attention mechanisms and leverages pre-sampled images of the conference participants as identity references. Existing approaches struggle with this task: simple feature-fusion techniques are inadequate for the fine-grained recognition required. The project therefore explores more sophisticated, transformer-style attention to distinguish the main speaker with higher accuracy.
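To make the idea concrete, the sketch below shows one plausible way such reference-guided attention could look: per-frame features attend to embeddings of the pre-sampled speaker images, so regions matching the enrolled speaker receive higher weight. This is a minimal illustration only; the module and tensor names are assumptions, not the project's actual architecture.

```python
# Hypothetical sketch of reference-guided cross-attention (PyTorch).
# ReferenceCrossAttention and the tensor shapes are illustrative assumptions.
import torch
import torch.nn as nn


class ReferenceCrossAttention(nn.Module):
    """Attend from per-frame features to pre-sampled speaker reference embeddings,
    so that regions matching the enrolled main speaker are emphasized."""

    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, frame_feats: torch.Tensor, ref_embeds: torch.Tensor) -> torch.Tensor:
        # frame_feats: (B, H*W, C) flattened spatial features of the current frame
        # ref_embeds:  (B, K, C)   embeddings of K pre-sampled images of the main speaker
        attended, _ = self.attn(query=frame_feats, key=ref_embeds, value=ref_embeds)
        # Residual connection keeps the original frame content while adding identity cues.
        return self.norm(frame_feats + attended)


if __name__ == "__main__":
    B, H, W, C, K = 2, 32, 32, 256, 4
    module = ReferenceCrossAttention(dim=C)
    frame_feats = torch.randn(B, H * W, C)
    ref_embeds = torch.randn(B, K, C)
    out = module(frame_feats, ref_embeds)
    print(out.shape)  # torch.Size([2, 1024, 256])
```

In this kind of design, the cross-attention output would feed the matting decoder, biasing the predicted alpha matte toward the enrolled speaker rather than every person in the frame.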

Multi-person scenes place stronger demands on attention design, since the model must reliably differentiate the main speaker from the other individuals in the frame. By refining these attention mechanisms, the system aims to deliver a more reliable and accurate video background matting solution for multi-human video conferencing with privacy preservation.
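Once a matte is predicted for the main speaker only, the privacy-preserving output can be produced with standard alpha compositing: speaker pixels are kept, everything else (including other people) is replaced by the virtual background. The following is a minimal sketch under that assumption; the function and array names are illustrative.

```python
# Hypothetical sketch of privacy-preserving compositing with a speaker-only alpha matte.
import numpy as np


def composite_speaker(frame: np.ndarray, alpha: np.ndarray, background: np.ndarray) -> np.ndarray:
    """Standard alpha compositing: out = alpha * frame + (1 - alpha) * background.

    frame, background: (H, W, 3) float images in [0, 1]
    alpha:             (H, W) matte covering the main speaker only, so other
                       individuals fall into the background region and are hidden.
    """
    alpha = alpha[..., None]  # broadcast the matte over the color channels
    return alpha * frame + (1.0 - alpha) * background


if __name__ == "__main__":
    H, W = 720, 1280
    frame = np.random.rand(H, W, 3)
    background = np.zeros((H, W, 3))   # e.g. a chosen virtual background image
    alpha = np.zeros((H, W))
    alpha[200:500, 400:800] = 1.0      # pretend this region is the main speaker
    out = composite_speaker(frame, alpha, background)
    print(out.shape)                   # (720, 1280, 3)
```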

Dataset Preparation and Processing

© 2024 Daisy DAI