model.mtcnn
Description
🔲 Multi-task Cascaded Convolutional Networks for face detection. Works best with unmasked faces.
- class Node(config=None, **kwargs)
Initializes and uses the MTCNN model to infer bboxes from an image frame.
The MTCNN node is a single-class model capable of detecting human faces. To a certain extent, it can also detect faces wearing face masks (e.g. surgical masks).
- Inputs
img (numpy.ndarray): A NumPy array of shape \((height, width, channels)\) containing the image data in BGR format.
- Outputs
bboxes (numpy.ndarray): A NumPy array of shape \((N, 4)\) containing normalized bounding box coordinates of \(N\) detected objects. Each bounding box is represented as \((x_1, y_1, x_2, y_2)\) where \((x_1, y_1)\) is the top-left corner and \((x_2, y_2)\) is the bottom-right corner. The order corresponds to bbox_labels and bbox_scores.
bbox_scores (numpy.ndarray): A NumPy array of shape \((N)\) containing confidence scores \([0, 1]\) of detected objects. The order corresponds to bboxes and bbox_labels.
bbox_labels (numpy.ndarray): A NumPy array of shape \((N)\) containing strings representing the labels of detected objects. The order corresponds to bboxes and bbox_scores.
- Configs
weights_parent_dir (Optional[str]) – default = null.
Change the parent directory where weights will be stored by replacing null with an absolute path to the desired directory.
min_size (int) – default = 40.
Minimum height and width, in pixels, of a face to be detected.
scale_factor (float) – [0, 1], default = 0.709.
Scale factor used to create the image pyramid. A larger scale factor produces more accurate detections at the expense of inference speed.
network_thresholds (List[float]) – [0, 1], default = [0.6, 0.7, 0.7].
Threshold values for the Proposal Network (P-Net), Refine Network (R-Net) and Output Network (O-Net) in the MTCNN model. Calibration is performed at each stage, and bounding boxes with confidence scores less than that stage's threshold are discarded.
score_threshold (float) – [0, 1], default = 0.7.
Bounding boxes in the final output with confidence scores less than the specified threshold are discarded.
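The effect of min_size and scale_factor on inference cost, and the meaning of the normalized bboxes output, can be illustrated with a short sketch. It is not taken from this node's source: `pyramid_scales` and `to_pixel_boxes` are hypothetical helper names, the pyramid schedule follows the common reference MTCNN implementations with the 12-pixel P-Net window from the MTCNN paper, and the pixel conversion assumes that "normalized" means coordinates divided by the image width and height.

```python
from typing import Iterable, List, Sequence, Tuple

PNET_WINDOW = 12  # P-Net input size in the MTCNN paper (assumed to apply here)


def pyramid_scales(height: int, width: int, min_size: int = 40,
                   scale_factor: float = 0.709) -> List[float]:
    """Return the resize scales of the image pyramid (hypothetical helper)."""
    if not 0 < scale_factor < 1:
        raise ValueError("scale_factor must lie strictly between 0 and 1")
    # First scale maps a face of min_size pixels onto one P-Net window.
    scale = PNET_WINDOW / min_size
    shortest_side = min(height, width) * scale
    scales = []
    # Keep shrinking until the image is smaller than a single window.
    while shortest_side >= PNET_WINDOW:
        scales.append(scale)
        scale *= scale_factor
        shortest_side *= scale_factor
    return scales


def to_pixel_boxes(bboxes: Iterable[Sequence[float]], height: int,
                   width: int) -> List[Tuple[int, int, int, int]]:
    """Convert normalized (x1, y1, x2, y2) rows to pixel coordinates.

    Assumes x is normalized by image width and y by image height.
    """
    return [
        (int(x1 * width), int(y1 * height), int(x2 * width), int(y2 * height))
        for x1, y1, x2, y2 in bboxes
    ]


if __name__ == "__main__":
    # Compare pyramid depth for the default and a larger scale factor.
    print(len(pyramid_scales(480, 640, scale_factor=0.709)))
    print(len(pyramid_scales(480, 640, scale_factor=0.9)))
    print(to_pixel_boxes([(0.25, 0.5, 0.75, 1.0)], height=100, width=200))
```

Under these assumptions, a 480 x 640 frame with the default settings yields 8 pyramid levels, while a scale factor of 0.9 yields 24. This is why a larger scale factor costs inference speed. Raising min_size has the opposite effect, because the first level is already smaller.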
References
Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks: https://arxiv.org/ftp/arxiv/papers/1604/1604.02878.pdf
Model weights trained by https://github.com/blaueck/tf-mtcnn
Changed in version 1.2.0:
mtcnn_min_size is renamed to min_size.
mtcnn_factor is renamed to scale_factor.
mtcnn_thresholds is renamed to network_thresholds.
mtcnn_score is renamed to score_threshold.
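Configurations written for versions before 1.2.0 may still use the old key names. A minimal sketch of translating such a configuration is given below. `RENAMED_KEYS` and `migrate_config` are hypothetical names introduced for illustration and are not part of the library. Only the four renames listed for version 1.2.0 are encoded.

```python
from typing import Any, Dict

# Old-to-new config names, taken from the version 1.2.0 rename list.
RENAMED_KEYS = {
    "mtcnn_min_size": "min_size",
    "mtcnn_factor": "scale_factor",
    "mtcnn_thresholds": "network_thresholds",
    "mtcnn_score": "score_threshold",
}


def migrate_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with pre-1.2.0 keys renamed (hypothetical helper)."""
    migrated: Dict[str, Any] = {}
    for key, value in config.items():
        new_key = RENAMED_KEYS.get(key, key)
        # Refuse to guess when both the old and the new name are present.
        if new_key in migrated:
            raise ValueError(f"config sets '{new_key}' more than once")
        migrated[new_key] = value
    return migrated


if __name__ == "__main__":
    old_style = {"mtcnn_min_size": 60, "mtcnn_score": 0.8}
    print(migrate_config(old_style))
```

Keys that are not in the rename table pass through unchanged, so the helper can be applied to configurations that already use the new names.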