model.hrnet

Description

🕺 High-Resolution Network: Deep high-resolution representation learning for human pose estimation. Requires an object detector.

class Node(config=None, **kwargs)

Initializes and uses the HRNet model to infer poses from detected bboxes. Note that HRNet must be used together with an object detector that runs before it in the pipeline.

For human pose estimation, HRNet uses the representation head called HRNetV1.

The HRNet node estimates one pose for each detected bounding box, so every human figure found by the object detector receives its own pose, with 17 keypoints estimated for each detected human figure. The keypoint indices table can be found here.

Inputs

img (numpy.ndarray): A NumPy array of shape \((height, width, channels)\) containing the image data in BGR format.

bboxes (numpy.ndarray): A NumPy array of shape \((N, 4)\) containing normalized bounding box coordinates of \(N\) detected objects. Each bounding box is represented as \((x_1, y_1, x_2, y_2)\) where \((x_1, y_1)\) is the top-left corner and \((x_2, y_2)\) is the bottom-right corner. The order corresponds to bbox_labels and bbox_scores.
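Because bboxes holds normalized coordinates, a common first step when inspecting or cropping detections is to scale them back to pixel units using the image shape. The following is a minimal sketch of that conversion, not the node's internal code; the helper name bboxes_to_pixels and the sample values are hypothetical.

```python
import numpy as np


def bboxes_to_pixels(bboxes: np.ndarray, img_shape: tuple) -> np.ndarray:
    """Scale normalized (x1, y1, x2, y2) boxes to integer pixel coordinates.

    Hypothetical helper for illustration only.
    """
    height, width = img_shape[:2]
    # x values scale with the image width, y values with the image height.
    scale = np.array([width, height, width, height], dtype=np.float64)
    return np.rint(bboxes * scale).astype(int)


img = np.zeros((480, 640, 3), dtype=np.uint8)  # (height, width, channels), BGR
bboxes = np.array([[0.25, 0.5, 0.75, 1.0]])    # one normalized box, shape (1, 4)
pixel_boxes = bboxes_to_pixels(bboxes, img.shape)
```

Note that the image shape is ordered (height, width) while each box is ordered (x, y), which is why the scale vector interleaves width and height.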

Outputs

keypoints (numpy.ndarray): A NumPy array of shape \((N, K, 2)\) containing the \((x, y)\) coordinates of detected poses where \(N\) is the number of detected poses, and \(K\) is the number of individual keypoints. Keypoints with confidence scores below score_threshold are replaced by -1.

keypoint_scores (numpy.ndarray): A NumPy array of shape \((N, K)\) containing the confidence scores of detected poses where \(N\) is the number of detected poses and \(K\) is the number of individual keypoints. The confidence score has a range of \([0, 1]\).
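The relationship between keypoints, keypoint_scores, and the score threshold can be illustrated with a short sketch. It assumes the replacement is a plain element-wise mask on the scores; the function name mask_low_confidence and the sample values are hypothetical, and the node's actual implementation may differ.

```python
import numpy as np


def mask_low_confidence(
    keypoints: np.ndarray, keypoint_scores: np.ndarray, score_threshold: float = 0.1
) -> np.ndarray:
    """Return a copy of keypoints with low-confidence entries set to -1.

    Hypothetical helper for illustration only.
    keypoints has shape (N, K, 2); keypoint_scores has shape (N, K).
    """
    masked = keypoints.copy()
    # The (N, K) boolean mask selects whole (x, y) pairs along the last axis.
    masked[keypoint_scores < score_threshold] = -1
    return masked


keypoints = np.array([[[0.5, 0.4], [0.6, 0.7]]])  # N = 1 pose, K = 2 keypoints
keypoint_scores = np.array([[0.9, 0.05]])         # second keypoint is below 0.1
masked = mask_low_confidence(keypoints, keypoint_scores)
```

Downstream code can then treat any \((x, y)\) pair equal to \((-1, -1)\) as an undetected keypoint.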

keypoint_conns (numpy.ndarray): A NumPy array of shape \((N, D_n', 2, 2)\) containing the \((x, y)\) coordinates of adjacent keypoint pairs where \(N\) is the number of detected poses, and \(D_n'\) is the number of valid keypoint pairs for the \(n\)-th pose where both keypoints are detected.
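Since \(D_n'\) depends on how many keypoints were detected in each pose, the number of connections can differ from pose to pose. The sketch below shows one way such connections could be derived for a single pose. The three-keypoint SKELETON and the function name connections_for_pose are hypothetical and chosen for brevity; they are not the 17-keypoint layout or the code used by the node.

```python
import numpy as np

# Hypothetical skeleton: pairs of keypoint indices that count as adjacent.
SKELETON = [(0, 1), (1, 2)]


def connections_for_pose(pose: np.ndarray, skeleton=SKELETON) -> np.ndarray:
    """Build a (D', 2, 2) array of connections for one pose of shape (K, 2).

    A pair is kept only if both of its keypoints are detected, i.e. neither
    has been replaced by (-1, -1).
    """
    detected = ~np.all(pose == -1, axis=1)
    pairs = [(pose[a], pose[b]) for a, b in skeleton if detected[a] and detected[b]]
    # reshape keeps the trailing (2, 2) dimensions even when no pair is valid.
    return np.array(pairs, dtype=pose.dtype).reshape(-1, 2, 2)


full_pose = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
partial_pose = np.array([[0.1, 0.2], [-1.0, -1.0], [0.5, 0.6]])

full_conns = connections_for_pose(full_pose)        # both pairs are valid
partial_conns = connections_for_pose(partial_pose)  # keypoint 1 is missing
```

In this example the missing middle keypoint removes both connections that touch it, so the second pose ends up with no valid pairs.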

Configs
  • weights_parent_dir (Optional[str]) – default = null.
    Change the parent directory where weights will be stored by replacing null with an absolute path to the desired directory.

  • resolution (Dict[str, int]) – default = { height: 192, width: 256 }.
    Resolution of the input array to the HRNet model.

  • score_threshold (float) – [0, 1], default = 0.1.
    Threshold to determine if a detection should be returned.
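To make the documented defaults and the \([0, 1]\) range of score_threshold concrete, the following sketch merges user overrides into the default values listed above and checks the threshold. It is only an illustration under stated assumptions: DEFAULTS and build_config are hypothetical names, and this is not how the node itself loads or validates its configuration.

```python
from typing import Any, Dict, Optional

# Defaults copied from the Configs list above; null is written as None.
DEFAULTS: Dict[str, Any] = {
    "weights_parent_dir": None,
    "resolution": {"height": 192, "width": 256},
    "score_threshold": 0.1,
}


def build_config(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Merge overrides into the defaults and validate score_threshold.

    Hypothetical helper for illustration only.
    """
    # Copy the nested dict so that callers cannot mutate the shared defaults.
    config = {**DEFAULTS, "resolution": dict(DEFAULTS["resolution"])}
    for key, value in (overrides or {}).items():
        if key not in DEFAULTS:
            raise KeyError(f"unknown config option: {key}")
        config[key] = value
    if not 0.0 <= config["score_threshold"] <= 1.0:
        raise ValueError("score_threshold must be in [0, 1]")
    return config


config = build_config({"score_threshold": 0.3})
```

Raising the threshold in this way would cause more keypoints to be reported as -1, while lowering it keeps more low-confidence keypoints.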

References

Deep High-Resolution Representation Learning for Visual Recognition: https://arxiv.org/abs/1908.07919