
3-D Human Pose Tracking

Representing articulated objects as graphical models has gained much popularity in recent years; often the root node of the graph describes the global position and orientation of the object. In this work, we present a method to robustly track 3-D human pose by permitting larger uncertainty to be modeled over the root node than existing techniques allow. Significantly, this is achieved without increasing the uncertainty of the remaining parts of the model. The benefit is that a greater volume of the posterior can be supported, making the approach less vulnerable to tracking failure. Given a hypothesis of the root node state, a novel method is presented to estimate the posterior over the remaining parts of the body conditioned on this value. All probability distributions are approximated using a single Gaussian, allowing inference to be carried out in closed form. A set of deterministically selected sample points is used, which allows the posterior for each part to be updated with only seven image likelihood evaluations, making the method extremely efficient. Multiple root node states are supported and propagated using standard sampling techniques. We believe this to be the first work devoted to efficient tracking of human pose whilst modeling large uncertainty in the root node. The proposed method is more robust to tracking failures than existing approaches.

Proposed Method

1. To perform efficient tracking, the body is decomposed into its constituent parts, which allows it to be represented as a probabilistic graph. The nodes are partitioned into the root node, representing the global position and orientation of the body, and the remaining nodes, representing the orientation of each part.

The graphical structure used to represent the body, comprising the head (H), torso (Tor), left upper arm (LUA), left lower arm (LLA), left upper leg (LUL), left lower leg (LLL) and the opposing part for each limb.
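The decomposition in step 1 can be pictured as a small tree data structure. The sketch below is illustrative only: the left-side abbreviations come from the caption and the right-side names (RUA, RLA, RUL, RLL) are assumed from "the opposing part for each limb". The parent-child links and the name `ROOT` are assumptions based on a typical kinematic tree; the text does not state them.

```python
# Hypothetical kinematic tree: each part maps to its assumed parent.
# Only the part names follow the caption; the edges are an assumption.
PARENT = {
    "Tor": "ROOT",  # torso attached to the root node (global position/orientation)
    "H": "Tor",
    "LUA": "Tor", "LLA": "LUA",
    "RUA": "Tor", "RLA": "RUA",
    "LUL": "Tor", "LLL": "LUL",
    "RUL": "Tor", "RLL": "RUL",
}


def path_to_root(part):
    """Return the chain of nodes from `part` up to the root node."""
    chain = [part]
    while chain[-1] != "ROOT":
        chain.append(PARENT[chain[-1]])
    return chain


def children(node):
    """Return the parts directly connected below `node`, sorted by name."""
    return sorted(p for p, parent in PARENT.items() if parent == node)
```

Under these assumed edges, `path_to_root("LLA")` walks from the left lower arm through the left upper arm and torso to the root, which is the chain along which a root-node hypothesis would condition that limb.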

2. The state of each node, excluding the root node, is represented as a quaternion rotation that describes the orientation of the part in the frame of reference of the body, where the base of the torso is the origin, the z-axis is vertical and the y-axis is directed across the shoulders. A distribution over quaternions is then approximated.
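As a reminder of how a quaternion encodes a part's orientation, the following minimal sketch rotates a vector expressed in the body frame. The function names are hypothetical and the (w, x, y, z) component ordering is an assumption; the text does not specify a convention.

```python
import math


def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about `axis`."""
    norm = math.sqrt(sum(a * a for a in axis))
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)


def quat_mul(p, q):
    """Hamilton product of two quaternions."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def rotate(q, v):
    """Rotate 3-vector `v` by unit quaternion `q` via q * v * conj(q)."""
    conj = (q[0], -q[1], -q[2], -q[3])
    _, x, y, z = quat_mul(quat_mul(q, (0.0, v[0], v[1], v[2])), conj)
    return (x, y, z)
```

For example, a limb hanging straight down, `(0, 0, -1)`, rotated by a quarter turn about the y-axis (the axis across the shoulders) ends up pointing along the negative x-axis.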

3. The posterior distribution over the root node is represented by a set of samples. For each sample, a set of Gaussians is used to represent the posterior for each part conditioned on the given root node state. The parameters of each distribution are updated in each frame using a set of deterministically selected sample points.

An example of a set of sample points used to estimate observational likelihood distributions projected into two views. They represent the distributions shown on the left.
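One common way to place seven deterministic points for a three-dimensional Gaussian is the unscented-transform layout of 2n + 1 sigma points with n = 3. Whether the method in step 3 uses exactly this scheme is an assumption; the text only states that seven image likelihood evaluations are needed per part. The function names and the parameter `kappa` below are hypothetical.

```python
import math


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l


def sigma_points(mean, cov, kappa=1.0):
    """Return 2n + 1 points and weights whose weighted moments match (mean, cov)."""
    n = len(mean)
    scale = n + kappa
    l = cholesky([[scale * c for c in row] for row in cov])
    points = [list(mean)]          # centre point
    weights = [kappa / scale]
    for j in range(n):
        col = [l[i][j] for i in range(n)]
        points.append([m + c for m, c in zip(mean, col)])  # +column j
        points.append([m - c for m, c in zip(mean, col)])  # -column j
        weights += [0.5 / scale, 0.5 / scale]
    return points, weights


def weighted_moments(points, weights):
    """Recover the mean and covariance from a set of weighted points."""
    n = len(points[0])
    mean = [sum(w * p[i] for p, w in zip(points, weights)) for i in range(n)]
    cov = [
        [
            sum(w * (p[i] - mean[i]) * (p[j] - mean[j]) for p, w in zip(points, weights))
            for j in range(n)
        ]
        for i in range(n)
    ]
    return mean, cov
```

In a tracker of this kind, each of the seven points would be scored once against the image likelihood and the scores used to refit a Gaussian; that scoring step is omitted here because the likelihood function is not described on this page.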

4. Combining these with limb conditionals, which represent the prior distribution over the configuration between connected parts, allows efficient probabilistic inference to be performed.
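Because every distribution is a single Gaussian, combining a prior with a likelihood estimate has a closed form: the precisions add and the means are averaged by precision. The scalar sketch below illustrates that identity only. The actual method works with distributions over part orientations, and the function name is hypothetical.

```python
def gaussian_product(mean_a, var_a, mean_b, var_b):
    """Mean and variance of the normalised product of two 1-D Gaussians."""
    precision = 1.0 / var_a + 1.0 / var_b   # precisions add
    var = 1.0 / precision
    mean = var * (mean_a / var_a + mean_b / var_b)  # precision-weighted mean
    return mean, var
```

With a broad prior (mean 0, variance 4) and a sharper likelihood (mean 1, variance 1), the result has mean 0.8 and variance 0.8, so the combined estimate sits closer to the more certain source and is tighter than either input.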

5. Whilst the posterior distribution over the root node is propagated through time stochastically, the distributions over all other nodes are propagated by deterministically inflating their covariances.

Example frames showing the distribution of the samples using the SIR-PF (top) and the proposed method (bottom).
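The two propagation modes in step 5 can be contrasted in a short sketch. It assumes a resample-then-diffuse scheme with isotropic Gaussian noise for the root samples and additive diagonal inflation for the part covariances. The text says only that "standard sampling techniques" are used and that covariances are inflated deterministically, so the function names, the noise model and the root state layout (a plain list of numbers) are all hypothetical.

```python
import random


def propagate_root(samples, weights, noise_std, rng):
    """Stochastic step: resample root states by weight, then add Gaussian noise."""
    chosen = rng.choices(samples, weights=weights, k=len(samples))
    return [[x + rng.gauss(0.0, noise_std) for x in s] for s in chosen]


def inflate_covariance(cov, process_var):
    """Deterministic step: add `process_var` to the diagonal of a part's covariance."""
    n = len(cov)
    return [
        [cov[i][j] + (process_var if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
```

Only the root node pays the cost of sampling. Each remaining part keeps its mean and simply widens its Gaussian between frames, so no random draws are needed for it.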

Example results - pose estimation errors measured in mm using 3 cameras

Method	S1	S2	S3	Average
APF	194.2	75.0	87.7	118.9 +/- 65.5
SIR-PF	105.1	93.0	109.2	102.5 +/- 8.4
Proposed method	87.3	95.2	98.5	93.7 +/- 5.8
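
The Average column appears to be the mean over S1 to S3, with the sample standard deviation (n - 1 denominator) as the spread. This reading is inferred from the numbers and is not stated on the page. A minimal check, using only values copied from the table above:

```python
import statistics

# Per-sequence errors in mm, copied from the 3-camera table.
errors_mm = {
    "APF": [194.2, 75.0, 87.7],
    "SIR-PF": [105.1, 93.0, 109.2],
    "Proposed method": [87.3, 95.2, 98.5],
}


def summarise(values):
    """Return (mean, sample standard deviation) of the per-sequence errors."""
    return statistics.mean(values), statistics.stdev(values)


for method, values in errors_mm.items():
    mean, spread = summarise(values)
    print(f"{method}: {mean:.1f} +/- {spread:.1f}")
```

Differences of about 0.1 mm in the mean are to be expected if the reported averages were computed from unrounded per-sequence errors.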

Example results - tracking errors measured in mm using 3 cameras


Tracking error in each frame for the APF (blue) and the proposed method (red).

Tracking errors for walking using each method applied at different frame rates.

Example results - pose estimation errors measured in mm using 2 cameras

Method	S1	S2	S3	Average
APF	200.7	120.0	117.9	146.2 +/- 47.2
SIR-PF	105.1	105.2	120.7	110.4 +/- 8.9
Proposed method	89.3	108.7	113.5	103.8 +/- 12.8

Example results - the MAP 3-D pose using the proposed method

Example frames showing the MAP 3-D pose projected into each camera view.

Example results - pose estimation results

Comparison of pose estimation between the SIR-PF (top) and proposed method (bottom).

Publications

Ben Daubney and Xianghua Xie, Tracking 3D Human Pose with Large Root Node Uncertainty, In Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2011.