model.posenet

Description

🕺 Fast Pose Estimation model.

class Node(config=None, **kwargs)

Initializes a PoseNet model to detect human poses from an image.

The PoseNet node can detect multiple human figures in a single inference, and it estimates 17 keypoints for each detected figure. The keypoint indices table can be found here.

Inputs

img (numpy.ndarray): A NumPy array of shape \((height, width, channels)\) containing the image data in BGR format.
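
Because the node expects BGR channel order, an image loaded in RGB order needs its channels reversed first. The following is a minimal sketch using only NumPy; the helper name `rgb_to_bgr` is hypothetical and not part of this node's API. Images read with OpenCV's `cv2.imread` are already in BGR order and need no conversion.

```python
import numpy as np


def rgb_to_bgr(img_rgb: np.ndarray) -> np.ndarray:
    """Reverse the channel axis of a (height, width, 3) image (hypothetical helper)."""
    # Slicing with ::-1 on the last axis swaps R and B; the copy makes the
    # result contiguous in memory.
    return np.ascontiguousarray(img_rgb[..., ::-1])


# A 2x2 image that is pure red in RGB order.
img_rgb = np.zeros((2, 2, 3), dtype=np.uint8)
img_rgb[..., 0] = 255

img_bgr = rgb_to_bgr(img_rgb)
print(img_bgr[0, 0])  # red now sits in the last channel
```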

Outputs

bboxes (numpy.ndarray): A NumPy array of shape \((N, 4)\) containing normalized bounding box coordinates of \(N\) detected objects. Each bounding box is represented as \((x_1, y_1, x_2, y_2)\) where \((x_1, y_1)\) is the top-left corner and \((x_2, y_2)\) is the bottom-right corner. The order corresponds to bbox_labels and bbox_scores.
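
Since the bounding boxes are normalized, they usually have to be scaled by the image size before drawing or cropping. The sketch below assumes that normalized means each coordinate is divided by the image width (for x) or height (for y); `bboxes_to_pixels` is a hypothetical helper, not a function provided by this node.

```python
import numpy as np


def bboxes_to_pixels(bboxes: np.ndarray, img_shape: tuple) -> np.ndarray:
    """Scale normalized (x1, y1, x2, y2) boxes to integer pixel coordinates."""
    height, width = img_shape[:2]
    # x coordinates scale with width, y coordinates with height.
    scale = np.array([width, height, width, height], dtype=np.float64)
    return np.rint(bboxes * scale).astype(int)


img = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a real frame
bboxes = np.array([[0.25, 0.5, 0.75, 1.0]])    # one illustrative detection

pixel_bboxes = bboxes_to_pixels(bboxes, img.shape)
print(pixel_bboxes)  # [[160 240 480 480]]
```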

keypoints (numpy.ndarray): A NumPy array of shape \((N, K, 2)\) containing the \((x, y)\) coordinates of detected poses where \(N\) is the number of detected poses, and \(K\) is the number of individual keypoints. Keypoints with confidence scores below score_threshold are replaced by -1.

keypoint_scores (numpy.ndarray): A NumPy array of shape \((N, K)\) containing the confidence scores of detected poses where \(N\) is the number of detected poses and \(K\) is the number of individual keypoints. The confidence score has a range of \([0, 1]\).
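
The relationship between keypoints and keypoint_scores can be illustrated with a boolean mask. This is a sketch of the masking behaviour described above, not the node's actual implementation. The helper name, the sample values, and the choice to keep a score exactly equal to the threshold are all assumptions.

```python
import numpy as np


def mask_low_confidence(keypoints, keypoint_scores, score_threshold=0.4):
    """Replace (x, y) of keypoints scoring below the threshold with -1."""
    masked = keypoints.astype(np.float64, copy=True)
    # The (N, K) boolean mask selects whole (x, y) pairs in the (N, K, 2) array.
    masked[keypoint_scores < score_threshold] = -1.0
    return masked


# One pose with three keypoints (illustrative values only).
keypoints = np.array([[[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]]])
keypoint_scores = np.array([[0.9, 0.3, 0.4]])

masked = mask_low_confidence(keypoints, keypoint_scores)
# Boolean (N, K) array marking keypoints that survived the threshold.
detected = (masked != -1.0).all(axis=-1)
print(masked)
print(detected)
```

Downstream code can use a mask such as `detected` to skip the -1 placeholders rather than treating them as coordinates.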

keypoint_conns (numpy.ndarray): A NumPy array of shape \((N, D_n', 2, 2)\) containing the \((x, y)\) coordinates of adjacent keypoint pairs where \(N\) is the number of detected poses, and \(D_n'\) is the number of valid keypoint pairs for the \(n\)-th pose, that is, pairs in which both keypoints are detected.
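
Because \(D_n'\) depends on which keypoints were detected, the number of connections can differ from pose to pose. The sketch below shows how the valid pairs for a single pose could be collected. The `SKELETON` list is a made-up four-keypoint example, not PoseNet's actual adjacency list, and `connections_for_pose` is a hypothetical helper.

```python
import numpy as np

# Hypothetical adjacency list of keypoint index pairs (NOT PoseNet's real one).
SKELETON = [(0, 1), (1, 2), (2, 3)]


def connections_for_pose(pose_keypoints, skeleton=SKELETON):
    """Return a (D', 2, 2) array of pairs whose two endpoints are both detected."""
    conns = [
        (pose_keypoints[a], pose_keypoints[b])
        for a, b in skeleton
        # A keypoint counts as detected when neither coordinate is -1.
        if (pose_keypoints[a] != -1).all() and (pose_keypoints[b] != -1).all()
    ]
    if not conns:
        return np.empty((0, 2, 2))
    return np.array(conns, dtype=np.float64)


# Keypoint 2 is missing, so only the (0, 1) pair survives.
pose = np.array([[0.1, 0.1], [0.2, 0.2], [-1.0, -1.0], [0.4, 0.4]])
conns = connections_for_pose(pose)
print(conns.shape)  # (1, 2, 2)
```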

bbox_labels (numpy.ndarray): A NumPy array of shape \((N)\) containing strings representing the labels of detected objects. The order corresponds to bboxes and bbox_scores.

Configs
  • model_type (Union[str, int]) – {"resnet", 50, 75, 100}, default = "resnet".
    Defines the backbone model for PoseNet.

  • weights_parent_dir (Optional[str]) – default = null.
    Change the parent directory where weights will be stored by replacing null with an absolute path to the desired directory.

  • resolution (Dict) – default = { height: 225, width: 225 }.
    Resolution of input array to PoseNet model.

  • max_pose_detection (int) – default = 10.
    Maximum number of poses to be detected.

  • score_threshold (float) – [0, 1], default = 0.4.
    Confidence score threshold for detected keypoints. Only keypoints with scores above the threshold are kept in the output.
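
The defaults and allowed values listed above can be collected into a plain dictionary and checked before being passed to the node. This is only a sketch: `build_config` is a hypothetical helper, and the node's own validation may differ. The `None` value stands in for the `null` default of weights_parent_dir.

```python
import copy

# Defaults as listed in the Configs section above.
DEFAULTS = {
    "model_type": "resnet",
    "weights_parent_dir": None,
    "resolution": {"height": 225, "width": 225},
    "max_pose_detection": 10,
    "score_threshold": 0.4,
}
VALID_MODEL_TYPES = {"resnet", 50, 75, 100}


def build_config(**overrides):
    """Merge overrides into the defaults and check the documented ranges."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    config = copy.deepcopy(DEFAULTS)
    config.update(overrides)
    if config["model_type"] not in VALID_MODEL_TYPES:
        raise ValueError("model_type must be one of 'resnet', 50, 75, 100")
    if not 0.0 <= config["score_threshold"] <= 1.0:
        raise ValueError("score_threshold must lie in [0, 1]")
    return config


config = build_config(model_type=75, score_threshold=0.5)
print(config)
```

A dictionary built this way matches the `config` parameter in the class signature, `Node(config=None, **kwargs)`, although the exact keys the constructor requires are not specified here.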

References

PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model: https://arxiv.org/abs/1803.08225

Code adapted from https://github.com/rwightman/posenet-python