Person Re-Identification

An example of a 2D+3D person re-identification system.

Person re-identification (in short, re-id) consists in recognizing the same individual across several non-overlapping camera views, at different locations and times. The re-identification task is fundamental for a range of surveillance applications, especially when dealing with large and structured environments such as museums, shopping malls, and airports. The task is challenging because recognition must be robust to changes in perspective view, human pose, lighting, and occlusions.

This topic mainly focuses on exploring new descriptors and methods to deal with the re-identification task. As cameras often do not provide sufficient resolution for facial or iris recognition, classical solutions normally rely on appearance information, i.e., clothing and accessories. These appearance-based methods rely primarily on designing and building a person signature by extracting features from the whole body or from specific parts of it. Further, learning techniques can also be utilised to increase re-id accuracy.

We address this task broadly, developing purely appearance-based methods (e.g., SDALF) as well as metric and transfer learning approaches, for modelling the entire re-id process or for estimating the brightness transfer function (BTF) among cameras, respectively.

Moreover, thanks to novel camera technologies such as RGB-D cameras (Microsoft Kinect® or Asus Xtion Pro®), which acquire depth together with RGB data, re-id can also be approached using 3D soft-biometrics and other geometrical information extracted from this kind of data.

Our main works on this topic are listed below.






Unsupervised Adaptive Re-identification in Open World Dynamic Camera Networks

Most re-id approaches neglect the dynamic and open-world nature of the re-identification problem, where a new camera may be temporarily inserted into an existing system to gather additional information. To address this novel and very practical problem, we propose an unsupervised adaptation scheme for re-identification models in a dynamic camera network. First, we formulate a domain-perceptive re-identification method based on the geodesic flow kernel that can effectively find the best source camera (already installed) to adapt to a newly introduced target camera, without requiring a very expensive training phase. Second, we introduce a transitive inference algorithm for re-identification that exploits the information from the best source camera to improve accuracy across other camera pairs in a network of multiple cameras. Extensive experiments on four benchmark datasets demonstrate that the proposed model significantly outperforms state-of-the-art unsupervised learning based alternatives whilst being extremely efficient to compute.
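As a rough illustration of the best-source selection step, one could compare per-camera feature distributions through their PCA subspaces and pick the installed camera closest to the new one in terms of principal angles (the geodesic flow kernel is built on exactly such subspaces; this stand-alone principal-angle criterion and all function names are a simplified sketch of ours, not the paper's code):

```python
import numpy as np

def pca_subspace(X, dim):
    # Columns of the returned matrix span the top-`dim` PCA subspace of X.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:dim].T  # shape: (n_features, dim)

def subspace_distance(U1, U2):
    # Sum of squared sines of the principal angles between two subspaces:
    # a standard proxy for distance on the Grassmann manifold.
    cosines = np.linalg.svd(U1.T @ U2, compute_uv=False)
    cosines = np.clip(cosines, -1.0, 1.0)
    return float(np.sum(1.0 - cosines ** 2))

def best_source_camera(source_feats, target_feats, dim=10):
    # Pick the installed camera whose feature subspace is closest to the
    # subspace of the newly introduced target camera.
    Ut = pca_subspace(target_feats, dim)
    dists = {cam: subspace_distance(pca_subspace(X, dim), Ut)
             for cam, X in source_feats.items()}
    return min(dists, key=dists.get)
```

The selected source camera's model can then be adapted to the target, and, as in the transitive inference idea, reused to improve matching across the remaining camera pairs.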


Unsupervised adaptation scheme for re-identification models when introducing a new camera (C3) into an existing camera network (C1 & C2).

 

Reference:

  • Amran Bhuiyan, Rameswar Panda, Amit K. Roy Chowdhury, Vittorio Murino
    "Unsupervised Adaptive Re-identification in Open World Dynamic Camera Networks"
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017





Distance Penalization for Person Re-identification

We take advantage of the many feature descriptors available in the literature to devise a fusion approach: given, for each descriptor, the distances of the probe from all the gallery images, we re-weight the descriptors on the basis of their confidence, build a dictionary from the resulting ranking, and apply sparse coding to obtain the final ranking. More specifically, the processing pipeline is composed of two stages. First, a metric learning paradigm is applied to a set of distinct feature extractors to produce an ensemble of estimated distance measures, which are then penalized according to their confidence in estimating the correct matches and averaged to draw a final decision. Second, the gallery persons closest according to the fused distance measures are selected and used to span a dictionary for reconstructing the probe image queried to the system. Evaluated on benchmark datasets, the proposed framework advances the state of the art by notable margins. In particular, Rank-1 gains of about 12%, 1%, 6%, and 12% were scored on VIPeR, CAVIAR4REID, iLIDS, and 3DPeS, respectively.
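A minimal sketch of the first (fusion) stage, assuming each descriptor yields a vector of probe-to-gallery distances; the confidence measure used here (the margin between the two best normalized distances) is an illustrative choice, not the paper's exact penalization:

```python
import numpy as np

def fuse_distances(distance_vectors):
    """Fuse per-descriptor probe-to-gallery distances into one ranking.

    `distance_vectors`: list of 1-D arrays, one per feature descriptor.
    Illustrative confidence: the margin between the two smallest
    normalized distances (a sharper best match = a more trustworthy
    descriptor, which is therefore penalized less).
    """
    fused = np.zeros_like(distance_vectors[0], dtype=float)
    total_w = 0.0
    for d in distance_vectors:
        d = (d - d.min()) / (d.max() - d.min() + 1e-12)  # min-max normalize
        two_best = np.sort(d)[:2]
        confidence = two_best[1] - two_best[0]           # top-2 margin
        fused += confidence * d
        total_w += confidence
    fused /= total_w + 1e-12
    return np.argsort(fused)  # gallery indices, best match first
```

In the full pipeline, the top-ranked gallery images would then seed the dictionary used by the sparse-coding stage.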


Processing pipeline of our distance penalization approach

 

Reference:

  • Behzad Mirmahboub, Mohamed Lamine Mekhalfi, Vittorio Murino
    "Distance Penalization for Person Re-identification"
    IEEE Winter Conference on Applications of Computer Vision (WACV), 2017





Person re-identification using sparse representation with manifold constraints

Nowadays, surveillance cameras with high frame rates can capture several consecutive frames of each person. In multi-shot scenarios, images provide richer information about the target person than in single-shot conditions. However, they also introduce a high degree of information redundancy, which may degrade the performance of re-id systems. In this paper, we propose a novel framework that combines sparse coding and manifold constraints to extract discriminative information from multi-shot images of a pedestrian for person re-id across a set of non-overlapping surveillance cameras. The evaluation on two standard multi-shot datasets shows very competitive accuracy of our framework against the state of the art.
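One way to sketch the combination of sparse coding with a manifold constraint is an l1-regularized reconstruction with an extra graph-Laplacian term that ties the coefficients of visually similar gallery images, solved by plain ISTA. This is a simplified, illustrative objective under our own assumptions, not the paper's exact formulation:

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def laplacian_sparse_code(y, D, L, lam1=0.1, lam2=0.1, n_iter=500):
    """ISTA for  min_a ||y - D a||^2 + lam1*||a||_1 + lam2 * a^T L a.

    y: probe feature vector; D: dictionary whose columns are gallery
    features; L: graph Laplacian encouraging coefficients of similar
    gallery images to agree (the 'manifold' term in this sketch).
    """
    a = np.zeros(D.shape[1])
    # Step size from a Lipschitz bound on the smooth part's gradient.
    lip = 2.0 * (np.linalg.norm(D, 2) ** 2 + lam2 * np.linalg.norm(L, 2))
    step = 1.0 / lip
    for _ in range(n_iter):
        grad = -2.0 * D.T @ (y - D @ a) + 2.0 * lam2 * (L @ a)
        a = soft_threshold(a - step * grad, step * lam1)
    return a
```

The identity whose gallery images receive the largest coefficients (or the lowest reconstruction residual) would be returned as the match.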


Addition of manifold constraint to summarize several similar images into one point


Schematic representation for estimating the manifold point.

 

Reference:

  • B. Mirmahboub, H. Kiani, A. Bhuiyan, A. Perina, B. Zhang, A. Del Bue, V. Murino
    "Person Re-Identification Using Sparse Representation With Manifold Constraints"
    International Conference on Image Processing (ICIP), 2016 [PDF]





Exploiting multiple detections to learn robust brightness transfer functions in re-identification systems

One of the most relevant problems in re-identification systems is that the appearance of the same individual varies across cameras due to illumination and viewpoint changes. This paper proposes the use of Cumulative Weighted Brightness Transfer Functions (CWBTF) to model these appearance variations. It is a multiple-frame-based learning approach, which leverages consecutive detections of each individual to transfer the appearance, rather than learning a brightness transfer function from pairs of images. We tested our approach on standard multi-camera surveillance datasets, showing consistent and significant improvements over existing methods on three different datasets without any additional cost. Our approach is general and can be applied to any appearance-based method.
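The underlying (unweighted) brightness transfer function can be sketched with the classic cumulative-histogram construction below; the paper's CWBTF additionally weights the histograms using multiple detections of each individual:

```python
import numpy as np

def brightness_transfer_function(src_pixels, dst_pixels, levels=256):
    """Estimate a BTF from camera A to camera B by matching cumulative
    brightness histograms (the standard inverse-CDF construction; this
    is the plain single-pair variant, not the paper's weighted CWBTF)."""
    h_src, _ = np.histogram(src_pixels, bins=levels, range=(0, levels))
    h_dst, _ = np.histogram(dst_pixels, bins=levels, range=(0, levels))
    cdf_src = np.cumsum(h_src) / max(h_src.sum(), 1)
    cdf_dst = np.cumsum(h_dst) / max(h_dst.sum(), 1)
    # For each source level, find the target level with the closest
    # cumulative probability.
    btf = np.searchsorted(cdf_dst, cdf_src).clip(0, levels - 1)
    return btf  # lookup table: btf[source_brightness] -> target brightness
```

An image from camera A can then be mapped into camera B's brightness space with `btf[image]` before extracting appearance features.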


a) Example of a typical indoor camera network system and associated captured images.
b) Overview of CWBTF approach.

 

Reference:

  • A. Bhuiyan, A. Perina, V. Murino
    "Exploiting multiple detections to learn robust brightness transfer functions in re-identification systems"
IEEE International Conference on Image Processing (ICIP), 2015 [PDF]





Person re-identification by discriminatively selecting parts and features

This paper presents a novel appearance-based method for person re-identification. The core idea is to rank and select different body parts on the basis of the discriminating power of their characteristic features. In our approach, we first segment the pedestrian images into meaningful parts, then we extract features from such parts as well as from the whole body and finally, we perform a salience analysis based on regression coefficients. Given a set of individuals, our method is able to estimate the different importance (or salience) of each body part automatically. To prove the effectiveness of our approach, we considered two standard datasets and we demonstrated through an exhaustive experimental phase how our method improves significantly upon existing approaches, especially in multi-shot scenarios.
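The part-ranking idea can be sketched with a plain ridge regression standing in for the paper's regression model: fit a linear model separating one pedestrian from the others, sum the absolute coefficients per body part, and keep the top P parts. The feature layout and part names below are illustrative assumptions:

```python
import numpy as np

def select_salient_parts(X_parts, y, top_p=2, lam=1.0):
    """Rank body parts by discriminative power, in the spirit of the
    paper's regression-coefficient salience analysis.

    X_parts: dict part_name -> (n_images, d) feature block
    y:       +1 for the target pedestrian, -1 for the others
    """
    names = list(X_parts)
    X = np.hstack([X_parts[n] for n in names])
    # Ridge regression: w = (X^T X + lam I)^{-1} X^T y
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    # Salience of a part = total absolute weight of its coefficients.
    scores, start = {}, 0
    for n in names:
        d = X_parts[n].shape[1]
        scores[n] = float(np.abs(w[start:start + d]).sum())
        start += d
    return sorted(scores, key=scores.get, reverse=True)[:top_p]
```

Only the retained parts would then contribute to matching that pedestrian, mirroring the top-P selection shown in the figure.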


a) Lineups of pedestrians. b) Stel Component Analysis (SCA) segmentations for each pedestrian. c) Original images overlaid with the segmentations.


Multiple (gallery) images of the same pedestrian. The regression coefficients, summed across the features, highlight the most important parts for identifying that pedestrian. Coefficients are then ranked and only the top P are retained (P = 2 in this case).

 

Reference:

  • A. Bhuiyan, A. Perina, V. Murino
    "Person re-identification by discriminatively selecting parts and features"
    European Conference on Computer Vision, pp. 147-161, 2014 [PDF]





Semi-supervised Multi-feature Learning for Person Re-identification

Learning approaches for re-id are usually based on simple features and are trained on camera pairs to discriminate between individuals. In this paper, we present a method that joins these two ideas: given an arbitrary state-of-the-art set of features, no matter their number, dimensionality or descriptor, the proposed multi-class learning approach learns how to fuse them, ensuring that the features agree on the classification result. The approach consists of a semi-supervised multi-feature learning strategy that requires as little as a single labeled image per person as training data. To validate our approach, we present results on different datasets, using several heterogeneous features, setting a higher level of performance on the person re-identification problem, even in very poor settings.
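A toy version of the multi-feature combination, assuming one RBF kernel per heterogeneous feature and a plain unweighted average standing in for the learned fusion (the classifier itself is omitted; here the fused kernel directly scores probe-gallery similarity):

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Pairwise squared distances, then the Gaussian (RBF) kernel.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def fused_kernel(probe_features, gallery_features, gamma=0.5):
    """Average one RBF kernel per feature type.

    `probe_features` / `gallery_features`: lists of arrays, one array per
    heterogeneous feature (same list length on both sides). A plain mean
    replaces the paper's learned multi-feature combination.
    """
    Ks = [rbf_kernel(Fp, Fg, gamma)
          for Fp, Fg in zip(probe_features, gallery_features)]
    return sum(Ks) / len(Ks)
```

A kernel machine (or simply the row-wise argmax of the fused kernel) then assigns each probe to the most similar gallery identity.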


Overview of the proposed method. The features of each individual are extracted from labeled and unlabeled images. Then, kernels are computed for each feature and finally the multi-feature classifier is trained.

 

Reference:

  • Dario Figueira, Loris Bazzani, Hà Quang Minh, Marco Cristani, Alexandre Bernardino, Vittorio Murino
    "Semi-supervised Multi-feature Learning for Person Re-identification"
    IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2013





Person Re-identification with a PTZ Camera: an introductory study

We present an introductory study that paves the way for a new kind of person re-identification, exploiting a single Pan-Tilt-Zoom (PTZ) camera. PTZ devices allow zooming in on body regions, acquiring discriminative visual patterns that enrich the appearance description of an individual. This intuition has been translated into a statistical direct re-identification scheme, which collects two images for each probe subject: the first captures the whole body of the probe individual; the second can be a zoomed body part (head, torso or legs) or another whole-body image, and is the outcome of an action-selection mechanism driven by feature-selection principles. To allow repeatability, two novel multi-resolution benchmarks have been created for validation. On these data, we demonstrate that our approach selects effective actions by focusing on the body part that best discriminates each subject. Moreover, we show that the proposed compound of two images outperforms standard multi-shot descriptions composed of many more pictures.
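The action-selection idea can be caricatured as a variance criterion over per-part gallery features: zoom in on the part whose features vary most across identities, since that part is most likely to discriminate the probe. This "max-var"-style sketch is our own simplification; the actual mechanism also exploits the outcome of the first re-id round:

```python
import numpy as np

def select_zoom_action(gallery_part_features):
    """Pick the body part to zoom on next.

    gallery_part_features: dict part_name -> (n_identities, d) array of
    features for that part across the gallery. The part with the largest
    total feature variance across identities is the most informative.
    """
    variances = {part: float(F.var(axis=0).sum())
                 for part, F in gallery_part_features.items()}
    return max(variances, key=variances.get)
```

The PTZ camera would then execute the chosen action (e.g., zoom on the head) and the second, zoomed image would be combined with the whole-body image for the final re-identification attempt.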


The proposed method first detects the whole body of the probe, and a first round of re-id is performed. The max-var action selection algorithm exploits this information to select the most informative part and performs a second re-identification attempt that combines the two images.

 

Reference:

  • Pietro Salvagnini, Loris Bazzani, Marco Cristani, Vittorio Murino
    "Person Re-identification with a PTZ Camera: an introductory study"
    IEEE International Conference on Image Processing (ICIP), 2013





Re-identification with RGB-D Sensors

Person re-identification is mostly addressed by exploiting appearance cues coming from 2D images, under the hypothesis that individuals do not change their clothes. In this paper, we relax this constraint by presenting and exploiting a set of 3D soft-biometric cues, invariant to appearance variations, gathered using RGB-D technology. The joint use of these characteristics provides encouraging performance on a benchmark of 79 people captured on different days and with different clothing. This promotes a novel research direction in re-identification, supported also by the fact that a new generation of affordable RGB-D cameras has recently entered the worldwide market.
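A minimal sketch of matching with skeleton-based soft biometrics, assuming per-person 3D joint positions from an RGB-D sensor. The joint names and the particular set of distances below are illustrative assumptions, not the paper's exact cues:

```python
import numpy as np

def soft_biometric_signature(joints):
    """Distances between skeleton joints as clothing-invariant cues.

    `joints`: dict joint_name -> 3D position. The selected distances
    (illustrative) roughly capture height, shoulder width and leg lengths.
    """
    def dist(a, b):
        return float(np.linalg.norm(np.asarray(joints[a]) - np.asarray(joints[b])))
    return np.array([
        dist('head', 'foot_left'),                 # ~ body height
        dist('shoulder_left', 'shoulder_right'),   # shoulder width
        dist('hip_left', 'knee_left'),             # upper-leg length
        dist('knee_left', 'foot_left'),            # lower-leg length
    ])

def reid_match(probe_joints, gallery):
    """Nearest gallery identity in soft-biometric space (Euclidean)."""
    sig = soft_biometric_signature(probe_joints)
    dists = {pid: float(np.linalg.norm(sig - soft_biometric_signature(j)))
             for pid, j in gallery.items()}
    return min(dists, key=dists.get)
```

Because these cues depend on body geometry rather than clothing, the match survives appearance changes across recording days.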


Processing pipeline of the RGB-D sensor-based re-identification.


a) Illustration of the different groups in the recorded data, rows from top to bottom: "Walking", "Walking2", "Backwards" and "Collaborative". b) Skeleton with joint IDs. c) Features (distances) employed for building the soft-biometric cues (in black), and some of the soft-biometric features (in green). d) Statistics of the "Walking" dataset: for each feature, the associated histogram is shown (in parentheses, its mean value and standard deviation in cm).

 

Reference:

  • Igor Barros Barbosa, Marco Cristani, Alessio Del Bue, Loris Bazzani, Vittorio Murino
    "Re-identification with RGB-D Sensors"
    International Workshop on Re-Identification Re-Id 2012 – A Satellite Workshop of the European Conference on Computer Vision (ECCV), 2012





Custom Pictorial Structures for Re-identification

We propose a novel methodology for re-identification, based on Pictorial Structures (PS). Whenever face or other biometric information is missing, humans recognize an individual by selectively focusing on the body parts, looking for part-to-part correspondences. We want to take inspiration from this strategy in a re-identification context, using PS to achieve this objective. For single image re-identification, we adopt PS to localize the parts, extract and match their descriptors. When multiple images of a single individual are available, we propose a new algorithm to customize the fit of PS on that specific person, leading to what we call a Custom Pictorial Structure (CPS). CPS learns the appearance of an individual, improving the localization of its parts, thus obtaining more reliable visual characteristics for re-identification. It is based on the statistical learning of pixel attributes collected through spatio-temporal reasoning. The use of PS and CPS leads to state-of-the-art results on all the available public benchmarks, and opens a fresh new direction for research on re-identification.


a) Single-shot PS and multi-shot CPS at a given iteration. b) Initial PS fitting. c) The parts are aligned and per-pixel statistics are collected employing spatio-temporal reasoning. d) The ad-hoc part detectors are estimated, and their means are shown: at every iteration up to L, the fitting becomes more accurate due to the improved part detectors.

 

Reference:

  • D. S. Cheng, M. Cristani, M. Stoppa, L. Bazzani, V. Murino
    "Custom pictorial structures for re-identification"
    22nd British Machine Vision Conference (BMVC), 2011





Person Re-Identification by Symmetry-Driven Accumulation of Local Features

We present an appearance-based method for person re-identification. It consists in the extraction of features that model three complementary aspects of the human appearance: the overall chromatic content, the spatial arrangement of colors into stable regions, and the presence of recurrent local motifs with high entropy. All this information is derived from different body parts and weighted appropriately by exploiting symmetry and asymmetry perceptual principles. In this way, robustness against very low resolution, occlusions, and pose, viewpoint and illumination changes is achieved. The approach applies to situations where the number of candidates varies continuously, considering single images or bunches of frames for each individual. It has been tested on several public benchmark datasets (VIPeR, iLIDS, ETHZ), achieving new state-of-the-art performance.
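Two of the ingredients can be sketched in simplified form: locating a vertical symmetry axis by brute-force search, and computing a color histogram whose pixels are weighted by proximity to that axis. SDALF works in HSV with a chromatic-plus-spatial asymmetry criterion; the single-channel, plain-L2 variant below is only an illustration:

```python
import numpy as np

def symmetry_axis(img):
    """x position of the vertical axis minimizing left/right asymmetry.

    `img`: (h, w, 3) float array. A plain L2 mirror-difference stands in
    for SDALF's chromatic + spatial criterion.
    """
    h, w, _ = img.shape
    best_x, best_cost = w // 2, np.inf
    for x in range(w // 4, 3 * w // 4):
        k = min(x, w - x)                  # widest strip fitting both sides
        left = img[:, x - k:x]
        right = img[:, x:x + k][:, ::-1]   # mirror the right strip
        cost = float(((left - right) ** 2).mean())
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x

def weighted_histogram(img, axis_x, sigma=10.0, bins=16):
    """Color histogram with pixels near the symmetry axis counted more,
    mimicking SDALF's symmetry-driven weighting (one channel, 0-255)."""
    h, w, _ = img.shape
    xs = np.arange(w)
    weight = np.exp(-((xs - axis_x) ** 2) / (2 * sigma ** 2))
    weights = np.broadcast_to(weight, (h, w)).ravel()
    hist = np.zeros(bins)
    vals = (img[..., 0].ravel() * bins // 256).clip(0, bins - 1)
    np.add.at(hist, vals.astype(int), weights)
    return hist / hist.sum()
```

Matching then compares such weighted histograms (together with the stable color regions and local motifs) between the probe and each gallery signature.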


Sketch of the proposed descriptor. (a,a’) Given an image or a set of images, (b,b’) SDALF localizes meaningful body parts. Then, complementary aspects of the human body appearance are extracted, that is: (c,c’) weighted HSV histogram, represented here by its (weighted) back-projection (brighter pixels mean a more important color), (d,d’) maximally stable color regions, and (e,e’) recurrent highly structured patches. The objective is to correctly match SDALF descriptors of the same person in different views (a vs. a’).

 

Reference:

  • M. Farenzena, L. Bazzani, A. Perina, V. Murino, M. Cristani
    "Person re-identification by symmetry-driven accumulation of local features"
    23rd IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010






Additional references:

  • B. Mirmahboub, M. L. Mekhalfi, V. Murino
    "Distance penalization and fusion for person re-identification"
    IEEE Winter Conference on Applications of Computer Vision (WACV), 2017

  • A. Bhuiyan, B. Mirmahboub, A. Perina, V. Murino
    "Person re-identification using robust brightness transfer functions based on multiple detections"
    18th International Conference on Image Analysis and Processing, Genova, Italy, 7-11 September 2015 [PDF]

  • L. Bazzani, M. Cristani, A. Perina, V. Murino
    "Multiple-shot person re-identification by chromatic and epitomic analyses"
    Pattern Recognition Letters, 33(7):898-903, May 2012