model.fairmot

Description

🎯 Human detection and tracking model that balances the importance between detection and re-ID tasks.

class Node(config=None, **kwargs)

Initializes and uses the FairMOT tracking model to detect and track people in the supplied image frame.

FairMOT is based on the anchor-free object detector CenterNet, modified to balance the importance of the detection and re-identification tasks in an object tracker.

Inputs

img (numpy.ndarray): A NumPy array of shape \((height, width, channels)\) containing the image data in BGR format.

Outputs

bboxes (numpy.ndarray): A NumPy array of shape \((N, 4)\) containing normalized bounding box coordinates of \(N\) detected objects. Each bounding box is represented as \((x_1, y_1, x_2, y_2)\) where \((x_1, y_1)\) is the top-left corner and \((x_2, y_2)\) is the bottom-right corner. The order corresponds to bbox_labels and bbox_scores.

bbox_labels (numpy.ndarray): A NumPy array of shape \((N)\) containing strings representing the labels of detected objects. The order corresponds to bboxes and bbox_scores.

bbox_scores (numpy.ndarray): A NumPy array of shape \((N)\) containing the confidence scores, in the range \([0, 1]\), of detected objects. The order corresponds to bboxes and bbox_labels.

obj_attrs (Dict[str, Any]): A dictionary of attributes associated with each bounding box, in the same order as bboxes. Different nodes that produce this obj_attrs output type may contribute different attributes. model.fairmot produces the ids attribute, which contains the tracking IDs of the detections.
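
The outputs above are parallel sequences, so a consumer will usually zip them into one record per tracked person. The following is a minimal sketch of that step, not part of the node itself: the helper names `to_pixel_boxes` and `summarize_tracks`, the `"person"` label, and the sample values are hypothetical, and plain lists stand in for the NumPy arrays so the snippet runs with the standard library alone.

```python
from typing import Any, Dict, List, Sequence, Tuple


def to_pixel_boxes(
    bboxes: Sequence[Sequence[float]], width: int, height: int
) -> List[Tuple[int, int, int, int]]:
    """Scale normalized (x1, y1, x2, y2) boxes to integer pixel coordinates."""
    return [
        (
            int(round(x1 * width)),
            int(round(y1 * height)),
            int(round(x2 * width)),
            int(round(y2 * height)),
        )
        for x1, y1, x2, y2 in bboxes
    ]


def summarize_tracks(
    outputs: Dict[str, Any], width: int, height: int
) -> List[Dict[str, Any]]:
    """Zip the parallel outputs into one record per tracked detection."""
    boxes = to_pixel_boxes(outputs["bboxes"], width, height)
    ids = outputs["obj_attrs"]["ids"]  # tracking IDs, same order as bboxes
    return [
        {"id": track_id, "label": label, "score": float(score), "box": box}
        for track_id, label, score, box in zip(
            ids, outputs["bbox_labels"], outputs["bbox_scores"], boxes
        )
    ]


# Hypothetical outputs for a 1280x720 frame with two tracked people.
example_outputs = {
    "bboxes": [[0.10, 0.20, 0.30, 0.80], [0.50, 0.25, 0.65, 0.90]],
    "bbox_labels": ["person", "person"],  # label string is an assumption
    "bbox_scores": [0.91, 0.77],
    "obj_attrs": {"ids": [1, 2]},
}

records = summarize_tracks(example_outputs, width=1280, height=720)
for record in records:
    print(record)
```

Because the bounding boxes are normalized, the frame width and height must be supplied separately to recover pixel coordinates.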

Configs
  • weights_parent_dir (Optional[str]) – default = null.
    Change the parent directory where weights will be stored by replacing null with an absolute path to the desired directory.

  • score_threshold (float) – default = 0.5.
    Confidence score threshold for detected objects. Detections scoring below it are discarded.

  • K (int) – default = 500.
    Maximum number of objects output during the object detection stage.

  • min_box_area (int) – default = 100.
    Minimum area of a detected bounding box, calculated as width * height.

  • track_buffer (int) – default = 30.
    Number of frames a lost track is retained. A track that stays lost for more frames than this value is removed.

  • input_size (List[int]) – default = [864, 480].
    Size (width, height) of the input image to the model. Raw video/image frames will be resized to the input_size before they are fed to the model.
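
To make the thresholds in the list above concrete, here is a rough sketch of how they might act as filters. It illustrates the documented semantics only and is not the node's actual implementation: the helper names `keep_detection` and `prune_lost_tracks` are hypothetical, the strict comparisons are an assumption, and measuring box area in pixels is an assumption as well.

```python
from typing import Any, Dict, Mapping, Tuple

# Defaults copied from the Configs list.
DEFAULT_CONFIG: Dict[str, Any] = {
    "weights_parent_dir": None,
    "score_threshold": 0.5,
    "K": 500,
    "min_box_area": 100,
    "track_buffer": 30,
    "input_size": [864, 480],  # (width, height)
}


def keep_detection(
    score: float,
    box_wh: Tuple[float, float],
    config: Mapping[str, Any] = DEFAULT_CONFIG,
) -> bool:
    """Return True if a detection passes the score and area thresholds.

    Assumption: both comparisons are strict, and the area is in pixels.
    """
    width, height = box_wh
    area = width * height
    return score > config["score_threshold"] and area > config["min_box_area"]


def prune_lost_tracks(
    frames_lost: Mapping[int, int],
    config: Mapping[str, Any] = DEFAULT_CONFIG,
) -> Dict[int, int]:
    """Drop tracks that have been lost for more than track_buffer frames.

    frames_lost maps a tracking ID to the number of consecutive frames
    in which that track was not matched to any detection.
    """
    return {
        track_id: lost
        for track_id, lost in frames_lost.items()
        if lost <= config["track_buffer"]
    }


# A 20x10 box with score 0.9 passes; a low score or a tiny box does not.
print(keep_detection(0.9, (20, 10)))
print(keep_detection(0.4, (20, 10)))
print(keep_detection(0.9, (5, 10)))

# Track 3 has been lost for 31 frames, one more than the buffer allows.
print(prune_lost_tracks({1: 0, 2: 30, 3: 31}))
```

Under these assumptions, raising score_threshold or min_box_area trades recall for fewer spurious detections, while a larger track_buffer lets an ID survive longer occlusions at the cost of keeping stale tracks around.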

References

FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking: https://arxiv.org/abs/2004.01888

Model weights trained by: https://github.com/ifzhang/FairMOT