Person Reidentification Using Spatiotemporal Appearance

Introduction

  • biometric [16] [22]
  • assume: no change of clothing
  • invariant sig :
    • color histogram
    • clothing color
    • brightness transfer function (compare after calibration)
  • local signature: spatial, some variation
    • interest point op
    • model fitting
  • two aspects
    • correspondences
    • gen signature & match

Overview

two approaches: interest point+model fitting

variant appearance <= large number of responses (inc likelihood of true correspondences)

model based <= decomposable triangulated graph => model shape

  • localize different body parts => facilitate

Information: color (histogram: hue, saturation & normalization), structure (spatial segmentation & description)

Signature Generation

Two parts:

  • histogram of hue & saturation, HUE: \[H=\arctan \frac{\log (R) -\log (G)}{\log (R) +\log (G) -2\log (B)}\]

Trad: \[H=\arccos \frac{0.5[(R-G)+(R-B)]}{\sqrt{(R-G)^2+(R-B)(G-B)}}\]

  • structural qualities
    • foreground interior edgels
    • Distance: \(D(h_i,h_j)=1-\frac{2\sum^n_{k=1}\min(h_i(k),h_j(k))}{\sum^n_{k=1}\left(h_i(k)+h_j(k)\right)}\)
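
The two ingredients of the color signature can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, the log-based hue is taken in its arctangent form (an assumption of this sketch), and `atan2` is used instead of a plain division so that a zero denominator does not raise.

```python
import math


def log_hue(r, g, b):
    """Log-based hue of one pixel; r, g, b must be strictly positive.

    Assumption: arctangent form of the log-ratio hue. atan2 equals
    arctan(num / den) whenever den > 0 and avoids division by zero.
    """
    num = math.log(r) - math.log(g)
    den = math.log(r) + math.log(g) - 2.0 * math.log(b)
    return math.atan2(num, den)


def histogram_distance(h_i, h_j):
    """D(h_i, h_j) = 1 - 2 * sum_k min(h_i[k], h_j[k]) / sum_k (h_i[k] + h_j[k])."""
    if len(h_i) != len(h_j):
        raise ValueError("histograms must have the same number of bins")
    overlap = sum(min(a, b) for a, b in zip(h_i, h_j))
    total = sum(a + b for a, b in zip(h_i, h_j))
    if total == 0:
        raise ValueError("both histograms are empty")
    return 1.0 - 2.0 * overlap / total
```

Scaling all three channels by a constant (brightness) or raising them to a common power (gamma) multiplies numerator and denominator of the log ratio by the same positive factor, so `log_hue` returns the same angle. `histogram_distance` is 0 for identical histograms and 1 for histograms with no overlapping bins.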

Spatiotemporal Segmentation

  • group pixels of the same type of fabric
  • Overseg => \(G=(V,E)\) => each vertex v represents a contiguous region
    • Sobel op (see note 1) => foreground => gaussian filter
    • watershed segmentation algor [21]
    • edge => spatial (watershed), temporal (same region at time t+1):
      • estimate of motion field
      • \(F_{N,t}(x,y) = \sum^N_{k=0}H(I_t(x,y)-I_{t+k}(x,y))\)
      • \(I_t\) => intensity at time t
      • \(H(z) = \begin{cases} 1, & |z|<\delta \\ 0, & \mbox{otherwise} \end{cases}\)
      • same region at t+1 => the overlapping region with the highest integral of the frequency image
      • Weight on edge: the cost of grouping two regions together
        • \(w^{t,t}_{i,i'}=|M(i,t) - M(i',t)|\)
        • \(w^{t,t+1}_{i,i'}=\frac{1}{3}|M(i,t) - M(i',t+1)|\)
        • M => median intensity value
  • search for a minimum spanning tree
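
The frequency image and the edge weights above can be sketched as follows. This is a rough illustration under stated assumptions: frames are plain nested lists indexed as `frames[t][y][x]`, regions are passed as flat lists of intensities, and the names `frequency_image` and `edge_weight` are hypothetical.

```python
from statistics import median


def frequency_image(frames, t, n, delta):
    """F_{N,t}(x, y): number of k in 0..N with |I_t(x,y) - I_{t+k}(x,y)| < delta."""
    if t + n >= len(frames):
        raise ValueError("need frames t .. t+n")
    ref = frames[t]
    rows, cols = len(ref), len(ref[0])
    freq = [[0] * cols for _ in range(rows)]
    for k in range(n + 1):
        other = frames[t + k]
        for y in range(rows):
            for x in range(cols):
                # H(z) = 1 when |z| < delta, 0 otherwise
                if abs(ref[y][x] - other[y][x]) < delta:
                    freq[y][x] += 1
    return freq


def edge_weight(region_a, region_b, temporal=False):
    """|M(a) - M(b)| for a spatial edge, one third of that for a temporal edge.

    M is the median intensity of the region's pixels.
    """
    w = abs(median(region_a) - median(region_b))
    return w / 3.0 if temporal else w
```

For three 1x2 frames `[[10, 10]]`, `[[10, 50]]`, `[[11, 90]]` with `delta = 5`, the first pixel stays within the tolerance in all three frames and the second only matches itself at `k = 0`.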

Graph Partitioning

  • ten consecutive frames
  • Group by: distance less than internal variation
    • Internal Variation: \(I(C) = \max_{e^{t,t'}_{i,i'}\in E^C} w^{t,t'}_{i,i'}\)
    • Inter-cluster distance: \(D(C_m,C_n)=\min w^{t,t'}_{i,i'}\) s.t. \(v^t_i \in V^{C_m},\ v^{t'}_{i'} \in V^{C_n},\ e^{t,t'}_{i,i'}\in E\)
    • Merging: \(D(C_m,C_n)\leq MI(C_m,C_n)\) where \(MI(C_m,C_n)=\min(I(C_m)+\kappa / |C_m|,\ I(C_n)+\kappa / |C_n|)\)
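
One way to apply the merging rule is to visit edges in ascending weight order with a union-find structure, so that the first edge seen between two clusters is their minimum connecting edge. The sketch below assumes that processing order and tracks \(I(C)\) as the largest edge weight that has been merged into a cluster; the function name `partition` and the `(weight, u, v)` edge format are hypothetical.

```python
def partition(num_vertices, edges, kappa):
    """Cluster vertices 0..num_vertices-1; edges is a list of (weight, u, v)."""
    parent = list(range(num_vertices))
    size = [1] * num_vertices
    internal = [0.0] * num_vertices  # I(C), stored at each cluster's root

    def find(v):
        # Root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for w, u, v in sorted(edges):
        a, b = find(u), find(v)
        if a == b:
            continue
        # MI(C_a, C_b) = min(I(C_a) + kappa/|C_a|, I(C_b) + kappa/|C_b|)
        mi = min(internal[a] + kappa / size[a], internal[b] + kappa / size[b])
        if w <= mi:
            parent[b] = a
            size[a] += size[b]
            internal[a] = max(internal[a], internal[b], w)
    return [find(v) for v in range(num_vertices)]
```

A larger `kappa` favors larger clusters: with two tight pairs joined by one heavy edge, a small `kappa` keeps the pairs apart and a large one merges everything.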

Foreground-Background Separation

  • compute the maximum frequency image: \(MF_{N,t}(i,j)=\max(F_{N,t}(i,j),F_{-N,t}(i,j))\)
  • thresholding & morphological filtering
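
The two steps above might look like this on small nested-list images. Several things here are assumptions of the sketch: the forward and backward frequency images are passed in already computed, the threshold keeps pixels at or above `tau` (the notes do not say which side is foreground), and the morphological filter is a 3x3 binary opening with out-of-image pixels treated as 0.

```python
def max_frequency_image(f_forward, f_backward):
    """MF(i, j) = max(F_N(i, j), F_{-N}(i, j)), taken pixel by pixel."""
    return [[max(a, b) for a, b in zip(row_f, row_b)]
            for row_f, row_b in zip(f_forward, f_backward)]


def threshold(image, tau):
    """Binary mask: 1 where the value is at least tau, else 0."""
    return [[1 if v >= tau else 0 for v in row] for row in image]


def _neighbourhood(mask, y, x):
    """Values in the 3x3 window around (y, x); outside the image counts as 0."""
    rows, cols = len(mask), len(mask[0])
    return [mask[j][i] if 0 <= j < rows and 0 <= i < cols else 0
            for j in (y - 1, y, y + 1) for i in (x - 1, x, x + 1)]


def erode(mask):
    return [[min(_neighbourhood(mask, y, x)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    return [[max(_neighbourhood(mask, y, x)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(mask))
```

Opening a mask that contains a solid 3x3 block plus one isolated pixel keeps the block and drops the isolated pixel.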

Interest-Point Matching

  • Hessian-affine invariant interest-point operator (see note 2)
  • Compare using distance measure and thresholding
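
A generic version of "distance measure plus threshold" is nearest-neighbor matching with a cutoff. The notes do not specify the descriptor or the distance, so everything below is a placeholder: descriptors are plain numeric vectors, the distance is Euclidean, and `match_interest_points` and `max_distance` are hypothetical names.

```python
import math


def match_interest_points(desc_a, desc_b, max_distance):
    """Match each descriptor in desc_a to its nearest neighbor in desc_b.

    A pair (i, j, d) is kept only when the distance d is at most max_distance.
    """
    matches = []
    for i, da in enumerate(desc_a):
        best_j, best_d = None, float("inf")
        for j, db in enumerate(desc_b):
            d = math.dist(da, db)  # Euclidean distance (Python 3.8+)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None and best_d <= max_distance:
            matches.append((i, best_j, best_d))
    return matches
```

Points whose nearest neighbor is farther than `max_distance` are left unmatched, which is how the threshold discards unreliable correspondences.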

Dynamic-Programming Model Fitting

Model-based match respective body parts

  • alleviate the problem of changing relative location
  • use model to segment body parts

Energy function (minimize):

\[E(g,I) = \sum_i E_i(g_i,I) = \sum_i \left[E^{data}_i(g_i,I) + E^{shape}_i(g_i)\right]\]

  • Shape costs defined by polar decomp of its affine transformation
    • affine matrix A, polar decomp:
    • \(A=\begin{bmatrix} \cos \psi & -\sin\psi \\ \sin \psi & \cos \psi \end{bmatrix} \begin{bmatrix} s_x & s_h \\ s_h & s_y \end{bmatrix} = R(\psi ) S\)
    • S => scale-shear matrix; R => closest possible rotation matrix to A (Frobenius matrix norm)
    • \(E^{shape} = \log(\frac{\lambda_1}{\lambda_2})^2 + \log(1+ s_h)^2\)
    • \(\lambda_1,\lambda_2\) => eigenvalues of S
    • 1st term: height-width ratio, 2nd term: shear
  • data cost attracts model to salient image features
    • Only for boundary edges
    • less sensitive to spurious edges than to missed ones
    • Canny edge detection, \(E^{edge} = \frac{1}{n} \sum^{n}_{i=1}D(x_i,y_i)\)
    • Foreground mask:
    • \(E^{fg} = 1 - |\frac{N^{fg}_1}{N_1} - \frac{N^{fg}_2}{N_2}|\)
    • \(N_k\) => total number of pixels on side k of the window, \(N^{fg}_k\) => number of foreground pixels on that side
  • dynamic-programming: 0-1 knapsack
  • computed and compared using histogram eq
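
The shape and foreground costs above can be sketched for a single 2x2 affine matrix. This is an illustration under assumptions, not the paper's code: the rotation angle comes from the closed form `atan2(a21 - a12, a11 + a22)` for the 2x2 polar decomposition, the shape cost is read as a sum of squared logarithms, the eigenvalues are taken from the symmetric factor S, and all function names are hypothetical.

```python
import math


def polar_decompose_2x2(a):
    """Split A = R(psi) S with R a rotation and S symmetric; returns (psi, S)."""
    (a11, a12), (a21, a22) = a
    psi = math.atan2(a21 - a12, a11 + a22)
    c, s = math.cos(psi), math.sin(psi)
    # S = R(psi)^T A
    s_mat = [[c * a11 + s * a21, c * a12 + s * a22],
             [-s * a11 + c * a21, -s * a12 + c * a22]]
    return psi, s_mat


def shape_energy(a):
    """(log(lambda_1 / lambda_2))^2 + (log(1 + s_h))^2 for the factor S of A."""
    _, s_mat = polar_decompose_2x2(a)
    s_x, s_h, s_y = s_mat[0][0], s_mat[0][1], s_mat[1][1]
    trace = s_x + s_y
    det = s_x * s_y - s_h * s_h
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    lam1, lam2 = (trace + disc) / 2.0, (trace - disc) / 2.0
    if lam2 <= 0 or 1.0 + s_h <= 0:
        raise ValueError("degenerate or reflecting transformation")
    return math.log(lam1 / lam2) ** 2 + math.log(1.0 + s_h) ** 2


def foreground_energy(n1_fg, n1, n2_fg, n2):
    """E^fg = 1 - |N1_fg / N1 - N2_fg / N2| for the two sides of a window."""
    return 1.0 - abs(n1_fg / n1 - n2_fg / n2)
```

A pure rotation has S equal to the identity, so its shape cost is zero: rotating a body part is free, while stretching or shearing it is penalized. `foreground_energy` reaches its minimum of 0 when one side of the window is entirely foreground and the other entirely background, which is the situation at a silhouette boundary.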

  1. Sobel op:\[G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix} * A,G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} * A\]

  2. Wikipedia