model.mtcnn
Description
🔲 Multi-task Cascaded Convolutional Networks for face detection. Works best with unmasked faces.
- class Node(config=None, **kwargs)
Initializes and uses the MTCNN model to infer bboxes from an image frame.
The MTCNN node is a single-class model capable of detecting human faces. To a certain extent, it can also detect faces that are wearing face masks (e.g. surgical masks).
- Inputs
img (numpy.ndarray): A NumPy array of shape \((height, width, channels)\) containing the image data in BGR format.
- Outputs
bboxes (numpy.ndarray): A NumPy array of shape \((N, 4)\) containing normalized bounding box coordinates of \(N\) detected objects. Each bounding box is represented as \((x_1, y_1, x_2, y_2)\) where \((x_1, y_1)\) is the top-left corner and \((x_2, y_2)\) is the bottom-right corner. The order corresponds to bbox_labels and bbox_scores.
bbox_scores (numpy.ndarray): A NumPy array of shape \((N)\) containing confidence scores \([0, 1]\) of detected objects. The order corresponds to bboxes and bbox_labels.
bbox_labels (numpy.ndarray): A NumPy array of shape \((N)\) containing strings representing the labels of detected objects. The order corresponds to bboxes and bbox_scores.
- Configs
weights_parent_dir (Optional[str]) – default = null.
Change the parent directory where weights will be stored by replacing null with an absolute path to the desired directory.
min_size (int) – default = 40.
Minimum height and width, in pixels, of a face to be detected.
scale_factor (float) – [0, 1], default = 0.709.
Scale factor to create the image pyramid. A larger scale factor produces more accurate detections at the expense of inference speed.
network_thresholds (List[float]) – [0, 1], default = [0.6, 0.7, 0.7].
Threshold values for the Proposal Network (P-Net), Refine Network (R-Net) and Output Network (O-Net) in the MTCNN model. Calibration is performed at each stage, in which bounding boxes with confidence scores less than the specified threshold are discarded.
score_threshold (float) – [0, 1], default = 0.7.
Bounding boxes with confidence scores less than the specified threshold in the final output are discarded.
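The effect of min_size and scale_factor, and the meaning of the normalized bboxes output, can be illustrated with a short sketch. It assumes the pyramid construction used by common MTCNN implementations (a 12-pixel P-Net input window, with the image repeatedly shrunk by scale_factor until its shorter side falls below that window); the node's internals may differ. The helper names pyramid_scales and to_pixel_bboxes are hypothetical and are not part of the node's API.

```python
import numpy as np


def pyramid_scales(height, width, min_size=40, scale_factor=0.709, cell_size=12):
    """Return the resize factors of an MTCNN-style image pyramid.

    Assumption: cell_size=12 is the P-Net input window used by common
    MTCNN implementations; it is not a documented config of this node.
    """
    if not 0 < scale_factor < 1:
        raise ValueError("scale_factor must lie strictly between 0 and 1")
    # Resize so that a face of min_size pixels fills one P-Net window.
    base = cell_size / min_size
    shorter_side = min(height, width) * base
    scales = []
    # Keep shrinking until the image is smaller than one window.
    while shorter_side >= cell_size:
        scales.append(base * scale_factor ** len(scales))
        shorter_side *= scale_factor
    return scales


def to_pixel_bboxes(bboxes, frame_shape):
    """Convert normalized (x1, y1, x2, y2) boxes to integer pixel coordinates."""
    height, width = frame_shape[:2]
    # x coordinates scale with the width, y coordinates with the height.
    scale = np.array([width, height, width, height], dtype=np.float64)
    return np.rint(np.asarray(bboxes, dtype=np.float64) * scale).astype(int)


if __name__ == "__main__":
    print(len(pyramid_scales(480, 640)), "levels at scale_factor=0.709")
    print(len(pyramid_scales(480, 640, scale_factor=0.5)), "levels at scale_factor=0.5")
    print(to_pixel_bboxes(np.array([[0.25, 0.5, 0.75, 1.0]]), (480, 640, 3)))
```

Under these assumptions, a 480 by 640 frame yields 8 pyramid levels at the default scale_factor of 0.709 and 4 levels at 0.5. Each extra level is another pass of the Proposal Network, which is where the extra inference time of a larger scale factor comes from.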
References
Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks: https://arxiv.org/ftp/arxiv/papers/1604/1604.02878.pdf
Model weights trained by https://github.com/blaueck/tf-mtcnn
Changed in version 1.2.0:
mtcnn_min_size is renamed to min_size.
mtcnn_factor is renamed to scale_factor.
mtcnn_thresholds is renamed to network_thresholds.
mtcnn_score is renamed to score_threshold.
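For configurations written before version 1.2.0, the renames listed above can be applied mechanically. The following is a minimal sketch that assumes the node's config has already been loaded into a plain dict; the helper migrate_mtcnn_config is hypothetical and not part of the library.

```python
# Old config names mapped to their replacements from version 1.2.0 onwards.
RENAMED_CONFIGS = {
    "mtcnn_min_size": "min_size",
    "mtcnn_factor": "scale_factor",
    "mtcnn_thresholds": "network_thresholds",
    "mtcnn_score": "score_threshold",
}


def migrate_mtcnn_config(config):
    """Return a copy of config with pre-1.2.0 keys renamed; other keys are kept."""
    return {RENAMED_CONFIGS.get(key, key): value for key, value in config.items()}


if __name__ == "__main__":
    old = {"mtcnn_min_size": 60, "mtcnn_thresholds": [0.6, 0.7, 0.7]}
    print(migrate_mtcnn_config(old))
```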