User Guide

Introduction

das-FaceBond (DFB) performs identification and clustering operations on large galleries. Identification compares a probe image against a gallery populated with faces of different subjects. Clustering searches a gallery for groups of highly similar faces that potentially belong to the same identity.

Terminology

Technical terms that may be unfamiliar or confusing are described in the glossary.

Considerations

Currently, each gallery is limited to 10,000,000 faces.

Using das-FaceBond

User Roles

The system supports two user roles:

  • admin: users with this role belong to the user group ‘admin’ and can call any endpoint of this API. Supervisors (people who review the work of agents) have the same role as admins.
  • agent: users with this role belong to the user group ‘agent’ and are limited to the endpoints related to the identification review process.

New users can be created in the admin interface deployed behind the nginx gateway, under the path /admin. New superusers can be created either by running the manage.py script or through the same admin interface. Superusers have both roles, ‘admin’ and ‘agent’.

Image Upload

An image must be uploaded before it can be added to a gallery as a face or used in an identification operation. When adding a new face, pass the UUID of the uploaded image to the face creation endpoint. When running a new identification, pass the UUID of the uploaded image to the identification endpoint.

Package Upload

A package containing many images can be uploaded to populate a gallery. This lets the client insert a large number of faces by uploading the package and passing the returned UUID to the face batch operation endpoint.

Galleries can be created and populated on demand. A gallery contains a set of faces; the system keeps no notion of identity for a face beyond the custom metadata associated with it. Galleries are completely disjoint: faces cannot be shared between them. Creating a gallery requires only a name, a description and an operation mode. During gallery creation, the das-Face service is queried and the latest biometric model is selected for all faces added later.

A gallery can be populated in two ways:

  • Adding faces one by one. An image is uploaded first, and a new face is then associated with the gallery.
  • Adding faces in batch. A package is uploaded first, and all face images contained in the package are then inserted as images and associated with the gallery.

In both cases, das-Face generates the face embeddings using the model recorded when the gallery was created.

Identification (1:N)

An identification requires two components: a probe image containing a face, and a target gallery. The probe image must be uploaded first, and the operation is requested afterwards. The operation compares the probe against all matching candidates in the target gallery. The list of matching candidates can be filtered by a minimum match threshold, and it can also be narrowed by locality algorithms for performance reasons.
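The threshold filtering described above can be illustrated with a minimal sketch. It assumes that faces are represented by embedding vectors and that the match score is the cosine similarity between them; the guide does not state which score das-FaceBond uses, and the function names `cosine_similarity` and `identify` are hypothetical. The sketch compares the probe exhaustively and does not model the locality-based filtering.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (assumed score)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(
    probe: List[float],
    gallery: Dict[str, List[float]],
    min_score: float,
) -> List[Tuple[str, float]]:
    """Compare the probe with every gallery face and keep scores >= min_score.

    Hypothetical helper: returns (face_id, score) pairs, best match first.
    """
    scored = [
        (face_id, cosine_similarity(probe, embedding))
        for face_id, embedding in gallery.items()
    ]
    candidates = [(face_id, score) for face_id, score in scored if score >= min_score]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates
```

Raising `min_score` shortens the candidate list that a reviewer has to inspect, at the risk of dropping a true match with a low score.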

Batch of Identifications

A batch of identifications requires two components: a probe gallery and a target gallery. Every face in the probe gallery is compared with every face in the target gallery. Each identification follows the procedure described at A.5.

Gallery faces can be clustered into natural groups based on the similarity between them. This operation is useful for finding similar faces, or a person who has more than one face in the gallery. It requires a target gallery, a minimum similarity threshold, and an indication of the expected accuracy. Depending on the accuracy setting, the operation is either faster and less accurate, or slower and more accurate.
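As an illustration of threshold-based grouping, the sketch below links every pair of faces whose similarity reaches the threshold and reports the connected groups. This is not a description of the das-FaceBond clustering algorithm: cosine similarity, the exhaustive pairwise comparison (which corresponds to the slow, accurate end of the trade-off) and the name `cluster_faces` are all assumptions made for the example.

```python
import math
from itertools import combinations
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (assumed score)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_faces(embeddings: Dict[str, List[float]], threshold: float) -> List[List[str]]:
    """Group faces connected by at least one pair with similarity >= threshold.

    Hypothetical helper: only groups with two or more faces are returned.
    """
    parent = {face_id: face_id for face_id in embeddings}

    def find(face_id: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[face_id] != face_id:
            parent[face_id] = parent[parent[face_id]]
            face_id = parent[face_id]
        return face_id

    # Exhaustive comparison of all pairs: quadratic in the gallery size.
    for first, second in combinations(embeddings, 2):
        if cosine_similarity(embeddings[first], embeddings[second]) >= threshold:
            parent[find(first)] = find(second)

    groups: Dict[str, List[str]] = {}
    for face_id in embeddings:
        groups.setdefault(find(face_id), []).append(face_id)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)
```

A quadratic comparison like this one would not scale to galleries of millions of faces, which is why a faster, approximate mode is a useful option in practice.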

Review of identifications

The das-FaceBond system includes support for the manual review[^11] of identification operations by a human agent. The review process annotates each identification with three types of label:

  • Review decision state: It is indicated by the human agent and stored in the database. This decision can be:
      * NOT_OPERATIONAL: the identification operation is running, or it has failed for some reason.
      * PENDING: the identification operation has finished and is ready to be reviewed by a human agent.
      * DONE_CONCLUSIVE: the human agent has annotated the operation and is sure about the decisions made.
      * DONE_DOUBTFUL: the human agent has annotated the operation but has doubts about the decisions made.
  • Candidates: It is a list of faces marked as matches with the probe image used in the identification.
  • Observations: A free text indicating any kind of commentary given by the human agent.

Pending identification operations are served to human agents for review in FIFO order. To prevent several agents from reviewing the same operation, a served operation is kept frozen in the queue for the time allowed for a review, which is set in seconds by the variable IDENTIFICATION_ANNOTATIONS_LABOR_TIME.
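The serving behaviour can be pictured with a small in-memory sketch. The class `ReviewQueue` and its methods are hypothetical and are not part of the das-FaceBond API; the sketch only assumes that a served operation becomes available again once the labor time has elapsed without the review being completed.

```python
from typing import Dict, List, Optional


class ReviewQueue:
    """Hypothetical FIFO review queue with a per-operation freeze time."""

    def __init__(self, labor_time: float) -> None:
        # labor_time plays the role of IDENTIFICATION_ANNOTATIONS_LABOR_TIME (seconds).
        self.labor_time = labor_time
        self._pending: List[str] = []            # identification ids in arrival order
        self._frozen_until: Dict[str, float] = {}

    def add(self, identification_id: str) -> None:
        """Register a finished identification that is ready for review."""
        self._pending.append(identification_id)

    def serve(self, now: float) -> Optional[str]:
        """Return the oldest pending operation that is not frozen, and freeze it."""
        for identification_id in self._pending:
            if self._frozen_until.get(identification_id, float("-inf")) <= now:
                self._frozen_until[identification_id] = now + self.labor_time
                return identification_id
        return None

    def complete(self, identification_id: str) -> None:
        """Remove an operation once the agent has submitted the review."""
        self._pending.remove(identification_id)
        self._frozen_until.pop(identification_id, None)
```

With a labor time of 60 seconds, two agents asking for work a few seconds apart receive different operations, and an operation abandoned by an agent is offered again after one minute.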

Figure 1 shows the automaton of transitions between review decision states. The system validates that submitted annotations are consistent with the valid transitions of the automaton.

Cipher vectors

Biometric fingerprints are stored in the database and can be encrypted. das-FaceBond currently implements the AES block cipher in CBC mode, which uses a 128-bit block size and 128/192/256-bit keys. The key is supplied through the environment variable FACE_EMBEDDINGS_CIPHER_KEY.

Ciphered vectors slow down gallery population and database loading: the time required to populate the database is multiplied by 4, and the time needed to load the gallery into memory when the product restarts is multiplied by 7. The time needed for each search is not affected. To apply ciphering to an existing database, the database must be deleted and created again from scratch. A cipher key must be provided in order to use ciphered vectors. For more information, please refer to Onpremise installation.
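Before deploying, it can be useful to check that the configured key has a length that AES accepts. The sketch below is only an illustration: it assumes the key is stored in FACE_EMBEDDINGS_CIPHER_KEY as a hexadecimal string, which this guide does not specify, and the function name `load_cipher_key` is hypothetical. AES itself defines key sizes of 128, 192 and 256 bits.

```python
import os
from typing import Mapping

# Key sizes defined by the AES standard, in bits.
VALID_AES_KEY_BITS = (128, 192, 256)


def load_cipher_key(env: Mapping[str, str] = os.environ) -> bytes:
    """Read and validate the cipher key from the environment.

    ASSUMPTION: the key is hex encoded; check the installation guide
    for the encoding that das-FaceBond actually expects.
    """
    raw = env.get("FACE_EMBEDDINGS_CIPHER_KEY")
    if not raw:
        raise ValueError("FACE_EMBEDDINGS_CIPHER_KEY is not set")
    try:
        key = bytes.fromhex(raw)
    except ValueError as exc:
        raise ValueError("cipher key is not valid hexadecimal") from exc
    if len(key) * 8 not in VALID_AES_KEY_BITS:
        raise ValueError(f"cipher key has {len(key) * 8} bits; expected 128, 192 or 256")
    return key
```

Passing a plain dictionary instead of `os.environ` makes the check easy to run in a test without touching the real environment.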

Data format considerations

The das-FaceBond system accepts uploads either as independent images or as ZIP, TAR or TAR+GZIP packages (MIME types: application/x-tar, application/x-compressed-tar, application/zip, application/gzip).

Images must be PNG or JPEG (MIME types: image/jpeg, image/png). The previously stated hardware requirements assume JPEG images with a mean size of 100 KB per file. PNG files are allowed but not recommended because they need more disk space. An image width of around 1000 pixels is recommended.

Images should be sent upright, that is, showing the face from top (head) to bottom (chin). The system can rotate images to look for faces, but enabling this feature means up to 4 times more computation when populating galleries. Faces should occupy ⅛ of the total size of the image. This limit can be relaxed through a system parameter, but changing that parameter from its default value requires more computational power when populating galleries.

Image packages (ZIP or TAR) used to populate galleries must be at most 500 MB in size and must not contain more than 20,000 images. Each image should preferably have a filename that encodes some kind of identifier (e.g., a customer identifier). The system registers the following metadata fields for each image file found in the package:

  • name: A field containing the whole path of the image in the package, including all directories.
  • basename: Contains the filename of the image, removing all directory names from the path.
  • package: The filename of the package in which the image was uploaded to the system.
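The package limits and the three metadata fields can be previewed locally before uploading. The following sketch handles ZIP packages only and is not the system's implementation: the function name `describe_zip_package` is hypothetical, images are recognised by file extension (the system may inspect the content instead), and 500 MB is read as 500 × 1024 × 1024 bytes.

```python
import io
import posixpath
import zipfile
from typing import Dict, List

MAX_PACKAGE_BYTES = 500 * 1024 * 1024   # 500 MB limit (binary megabytes assumed)
MAX_IMAGES = 20_000                     # maximum number of images per package
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")  # ASSUMPTION: detection by extension


def describe_zip_package(package_filename: str, data: bytes) -> List[Dict[str, str]]:
    """Check the package limits and list the metadata for each image in a ZIP."""
    if len(data) > MAX_PACKAGE_BYTES:
        raise ValueError("package is larger than 500 MB")
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = [
            name
            for name in archive.namelist()
            if not name.endswith("/") and name.lower().endswith(IMAGE_EXTENSIONS)
        ]
    if len(names) > MAX_IMAGES:
        raise ValueError("package contains more than 20,000 images")
    return [
        {
            "name": name,                            # whole path inside the package
            "basename": posixpath.basename(name),    # filename without directories
            "package": package_filename,             # filename of the uploaded package
        }
        for name in names
    ]
```

For a package named batch.zip containing customers/123.jpg, the entry has name customers/123.jpg, basename 123.jpg and package batch.zip, so an identifier encoded in the filename can be recovered from the basename field.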