1. Tracking Dataset Format

1.1 Raw Image Sets

The raw image sets contain RGB images in PNG format, each with a resolution of 500 (height) x 1000 (width) pixels. The raw image set "ParallelEye_rgb_train.rar" and the corresponding tracking ground truth set "ParallelEye_motgt_train.rar" can be downloaded from the homepage.

1.2 Ground Truth Sets

In the ground truth txt files provided with the dataset, the following fields are relevant to the tracking task. The origin of the image is at the bottom-left corner. The first column gives the frame number, the second column the tracking ID, and the third column the object class label. Columns 7-10 give the left, top, right, and bottom coordinates of the tracked object. The remaining values are cx, cy, cz, w3d, h3d, l3d, x3d, y3d, z3d and occupancy. The example below shows the column names and one record, split across two rows.

frame  tid  label  cx        cy        cz        l    t    r    b
1      0    Car    -181.204  2.967041  183.5051  469  241  515  208
w3d  h3d   l3d       x3d       y3d       z3d       occupancy
1    1.31  1.690001  -162.685  1.856378  186.2169  0

*Note: Only 2D tracking is evaluated in this challenge.
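
As an illustration of the column layout described above, the following minimal sketch reads the 2D fields of one ground truth record. It assumes that each record is a single line of whitespace-separated values, which the source does not state explicitly; the names GroundTruthRecord, parse_ground_truth_line and box_size are hypothetical and not part of the dataset tools.

```python
from dataclasses import dataclass


@dataclass
class GroundTruthRecord:
    """The 2D fields of one ground truth record (hypothetical container)."""
    frame: int
    track_id: int
    label: str
    left: float
    top: float
    right: float
    bottom: float


def parse_ground_truth_line(line: str) -> GroundTruthRecord:
    """Parse one record, assuming whitespace-separated fields (an assumption)."""
    fields = line.split()
    if len(fields) < 10:
        raise ValueError(f"expected at least 10 fields, got {len(fields)}")
    return GroundTruthRecord(
        frame=int(fields[0]),
        track_id=int(fields[1]),
        label=fields[2],
        # Columns 4-6 (cx, cy, cz) are skipped: only 2D tracking is evaluated.
        left=float(fields[6]),
        top=float(fields[7]),
        right=float(fields[8]),
        bottom=float(fields[9]),
    )


def box_size(record: GroundTruthRecord) -> tuple:
    """Return (width, height) of the 2D box.

    abs() is used because, with the origin at the bottom-left corner,
    the top coordinate is larger than the bottom coordinate.
    """
    return abs(record.right - record.left), abs(record.top - record.bottom)


# The example record from the text, joined into a single line.
record = parse_ground_truth_line(
    "1 0 Car -181.204 2.967041 183.5051 469 241 515 208 "
    "1 1.31 1.690001 -162.685 1.856378 186.2169 0"
)
print(record.frame, record.track_id, record.label)
print(box_size(record))  # (46.0, 33.0)
```

For the example record, the box spans 46 pixels horizontally (515 - 469) and 33 pixels vertically (241 - 208).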

2. Tracking Task

For each image in the test set, the targets of interest should be tracked as accurately as possible. The position of each target should be represented by a bounding box that covers the target as completely as possible while including as few background pixels as possible, and different targets should be assigned different IDs. A visual diagram of the tracking result is shown below.

Figure 1. Example of tracking data. Left: raw image. Right: tracking ground truth.

3. Evaluation

3.1 Data Supplied

The test sets ("tracking_testing_set.rar" on the homepage) contain 10220 RGB images in the same PNG format as the training images.

3.2 Submission of Results

The submitted results must be written to a txt file, one tracked object per line, as comma-separated values in the order of

frame number, object ID, left, top, width, height

*Note: Make sure the names of the tracking result folders and files are the same as those of the datasets given.

Here is an example:
    04_sunset.txt:
    1,1,912,484,97,109
    1,2,1338,418,167,379
    1,3,586,447,85,263
    …
    2,1,912,484,97,109
    …
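
A result file in this layout could be produced along the following lines. This is only a sketch: the function names are hypothetical, the tuple layout of the tracker output is an assumption, and sorting by frame number and then object ID simply mirrors the example above rather than a stated requirement.

```python
def format_submission_line(frame, object_id, left, top, width, height):
    """Return one comma-separated result line in the required field order."""
    return f"{frame},{object_id},{left},{top},{width},{height}"


def format_submission(results):
    """Return all result lines, sorted by frame number and then object ID.

    The sort order is an assumption taken from the example file.
    """
    ordered = sorted(results, key=lambda r: (r[0], r[1]))
    return [format_submission_line(*r) for r in ordered]


def write_submission(path, results):
    """Write the result lines to a txt file, one tracked object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in format_submission(results):
            handle.write(line + "\n")


# Hypothetical tracker output: (frame, object ID, left, top, width, height).
results = [
    (2, 1, 912, 484, 97, 109),
    (1, 2, 1338, 418, 167, 379),
    (1, 1, 912, 484, 97, 109),
]
lines = format_submission(results)
print("\n".join(lines))
```

Calling write_submission("04_sunset.txt", results) would then write the same lines to a file named after the corresponding sequence.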

3.3 Evaluation Criteria

The tracked objects are of three classes: car, bus and truck. The CLEAR MOT [1] and Mostly-Tracked/Partly-Tracked/Mostly-Lost [2] metrics are used to evaluate the tracking performance, and the evaluation indices include:
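MOTA (the multiple object tracking accuracy), which is calculated as follows [1]:

    MOTA = 1 - \frac{\sum_t (m_t + fp_t + mme_t)}{\sum_t g_t}

    m_t describes the number of misses at frame t.
    fp_t describes the number of false positives at frame t.
    mme_t describes the number of mismatches at frame t.
    g_t describes the number of ground truth objects present at frame t.
MOTP (the multiple object tracking precision), which is calculated as [1]:

    MOTP = \frac{\sum_{i,t} d_t^i}{\sum_t c_t}

The numerator is the total error in estimated position for matched object-hypothesis pairs over all frames (d_t^i is the error of match i at frame t), and the denominator is the total number of matches made (c_t is the number of matches at frame t).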
MT (Mostly Tracked): Percentage of GT trajectories that are covered by tracker output for more than 80% of their length. The larger the better.
ML (Mostly Lost): Percentage of GT trajectories that are covered by tracker output for less than 20% of their length. The smaller the better.
Frag (Fragments): The total number of times that a ground truth trajectory is interrupted in the tracking result. The smaller the better.
IDS (ID Switches): The total number of times that a tracked trajectory changes its matched GT identity. The smaller the better.
FP (False Positive): The total number of predicted detections that do not match any ground truth trajectory. The smaller the better.
FN (False Negative): The total number of ground truth points that do not match the predicted results. The smaller the better.
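
To make the arithmetic behind these indices concrete, the sketch below computes MOTA, MOTP, MT and ML from already-matched statistics. It assumes the standard CLEAR MOT definitions from [1], namely MOTA = 1 - (sum of misses, false positives and mismatches) / (total number of ground truth objects) and MOTP = (total position error) / (total number of matches). The matching step itself and the position error measure are not specified here and are left out; all function names are hypothetical, and this is not the official evaluation code.

```python
def mota(misses, false_positives, mismatches, num_gt):
    """MOTA from per-frame counts (each argument is one list entry per frame)."""
    total_gt = sum(num_gt)
    if total_gt == 0:
        raise ValueError("no ground truth objects in the sequence")
    errors = sum(misses) + sum(false_positives) + sum(mismatches)
    return 1.0 - errors / total_gt


def motp(total_position_error, num_matches):
    """MOTP: total position error over all matches divided by the match count."""
    if num_matches == 0:
        raise ValueError("no matches were made")
    return total_position_error / num_matches


def mostly_tracked_lost(coverage):
    """Return (MT, ML) as fractions of the GT trajectories.

    coverage holds, for each GT trajectory, the fraction of its length
    that is covered by tracker output (a value between 0 and 1).
    """
    if not coverage:
        raise ValueError("no ground truth trajectories")
    n = len(coverage)
    mt = sum(1 for c in coverage if c > 0.8) / n
    ml = sum(1 for c in coverage if c < 0.2) / n
    return mt, ml


# Toy numbers for a three-frame sequence (illustrative only, not real results).
print(mota(misses=[1, 0, 1], false_positives=[0, 1, 0],
           mismatches=[0, 0, 0], num_gt=[4, 4, 4]))  # 0.75
print(motp(total_position_error=4.5, num_matches=9))  # 0.5
print(mostly_tracked_lost([0.9, 0.5, 0.1, 0.85]))     # (0.5, 0.25)
```

In the toy example, 3 errors over 12 ground truth objects give MOTA = 0.75, and two of the four trajectories are mostly tracked while one is mostly lost.
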
[1] K. Bernardin and R. Stiefelhagen. Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. JIVP 2008.
[2] Y. Li, C. Huang, and R. Nevatia. Learning to Associate: HybridBoosted Multi-Target Tracker for Crowded Scene. CVPR 2009.