Multi-Sensory/Acoustic Data Processing

Acoustic Signal Processing

Speech, sounds, and audio events carry information that is important in many contexts and applications. Exploiting this information, however, requires correct acquisition and spatial processing of the propagating acoustic wavefield. The microphone array is a flexible tool that enables the directional capture of sounds (to improve signal quality in noisy environments), the localization of acoustic sources of interest, and the generation of acoustical images (to map acoustic events inside a given space in real time).

PAVIS is active in array signal processing, with special emphasis on the spatial processing of acoustic signals. Our current research encompasses array synthesis, beamforming methods, and acoustical imaging. In particular, we focus on the synthesis of optimized sparse and aperiodic arrays, in connection with robust superdirective beamforming for wideband signals implemented in the frequency domain. These array signal processing strategies are the basis for the design of innovative acoustic imaging systems that operate in the near field and generate 3-D maps in real time. Our team carries out research on all of these topics through a combination of theoretical investigation and experimental testing.
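
To make the idea of frequency-domain beamforming concrete, the following is a minimal sketch of a plain delay-and-sum beamformer for a single frequency bin. It is not the robust superdirective design described above, and it assumes a far-field plane wave rather than the near-field case. The uniform linear array, the 4 cm spacing, the 2 kHz tone, the speed of sound, and all function names are illustrative assumptions.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_vector(mic_positions_m, freq_hz, angle_rad, c=SPEED_OF_SOUND):
    """Far-field phase shifts for microphones placed along a line (hypothetical helper)."""
    return [
        cmath.exp(-2j * math.pi * freq_hz * x * math.sin(angle_rad) / c)
        for x in mic_positions_m
    ]


def delay_and_sum_power(snapshot, mic_positions_m, freq_hz, look_angle_rad):
    """Output power of a delay-and-sum beamformer steered to look_angle_rad."""
    weights = steering_vector(mic_positions_m, freq_hz, look_angle_rad)
    # Conjugate weights undo the propagation phase before averaging the channels.
    output = sum(w.conjugate() * s for w, s in zip(weights, snapshot)) / len(snapshot)
    return abs(output) ** 2


# Assumed setup: 8 microphones spaced 4 cm apart, one 2 kHz source at 20 degrees.
mic_positions_m = [0.04 * i for i in range(8)]
freq_hz = 2000.0
snapshot = steering_vector(mic_positions_m, freq_hz, math.radians(20))

# Scan candidate directions in 1-degree steps and keep the strongest one.
scan_deg = range(-90, 91)
powers = {
    d: delay_and_sum_power(snapshot, mic_positions_m, freq_hz, math.radians(d))
    for d in scan_deg
}
best_angle_deg = max(powers, key=powers.get)
peak_power = powers[best_angle_deg]
```

Scanning the look direction and taking the maximum output power is the simplest form of source localization with an array; a wideband system would repeat the same operation for each frequency bin.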

In addition to signal capture and spatial processing, we also carry out research on the high-level processing and understanding of acoustic signals. Current topics include audio foreground extraction, audio event classification, and joint audio-video calibration and processing.

Multi-Sensor Data Fusion

Multi-sensor data fusion is an emerging research field whose aim is to combine information from multiple and diverse sources (e.g. different sensors such as thermal and visible-spectrum cameras, laser, range sensors, microphones, RFID, etc.) to achieve inferences that cannot be obtained from a single sensor or source, or whose quality exceeds that of an inference drawn from any single source. To cite a few examples: person identification can be improved through a combination of audio (voice) and video (silhouette) cues, and object tracking in adverse weather conditions can benefit from the fusion of thermal and visible camera images.
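
The claim that a fused inference can exceed the quality of any single source can be illustrated with a minimal sketch of inverse-variance weighting, a textbook fusion rule. It assumes that each sensor reports an unbiased estimate of the same quantity with independent noise of known variance. The function name and the numeric values are hypothetical and are not taken from any PAVIS system.

```python
def fuse_estimates(estimates):
    """Fuse (value, variance) pairs by inverse-variance weighting.

    Assumes independent, unbiased estimates of the same quantity.
    """
    weights = [1.0 / variance for _, variance in estimates]
    total_weight = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, estimates)) / total_weight
    fused_variance = 1.0 / total_weight
    return fused_value, fused_variance


# Hypothetical position estimates (metres) of one person from two sensors.
audio_estimate = (2.0, 4.0)  # microphone array: value 2.0, variance 4.0
video_estimate = (3.0, 1.0)  # camera: value 3.0, variance 1.0

fused_value, fused_variance = fuse_estimates([audio_estimate, video_estimate])
```

Under these assumptions the fused variance is always smaller than the smallest input variance, which is the sense in which the combined estimate is better than either sensor alone.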

Multi-sensor data fusion is inherently a multi-disciplinary subject that draws on areas such as statistical estimation, signal processing, computer vision, and machine learning.

PAVIS develops multi-sensor data fusion techniques mainly for automated surveillance applications. In this context, we address tasks such as person detection, tracking and re-identification, behavior analysis, and high-level scene understanding, with the aim of investigating the improvements a multi-sensor setup can offer through a combination of theoretical analysis and experimental testing.

Because of the variety of sensor devices involved, this work interacts strongly with other research areas of PAVIS, such as Acoustic Signal Processing and Visual Geometry and Modeling.