Diarisation

Speaker diarisation answers the question "who spoke when?"
- For a corpus of audio, determine which clips belong to the same speaker.
Face diarisation performs the analogous task for faces.
- For a corpus of video or images, group faces by identity.

Face Diarisation Pipeline

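For each frame to process:

- Run face detection.
- Extract (crop) each detected face.
- Encode each cropped face to obtain an embedding that represents it.

Finally, cluster the embeddings collected from all frames, so that each cluster groups the faces of one identity.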
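The steps above can be sketched in Python as follows. This is only an illustrative outline, not the method from the course materials: `detect`, `crop` and `encode` are hypothetical callables standing in for a real face detector, cropping routine and embedding model, and the greedy cosine-distance clustering with a `threshold` of 0.3 is an assumption, since the notes do not say which clustering algorithm is used.

```python
import math
from typing import Callable, Iterable, List, Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def cluster_embeddings(embeddings: Iterable[Vector], threshold: float = 0.3) -> List[int]:
    """Greedy clustering (an assumed, simple choice).

    Each embedding joins the closest existing cluster whose first member
    is within `threshold`; otherwise it starts a new cluster.
    """
    representatives: List[Vector] = []  # first member of each cluster
    labels: List[int] = []
    for emb in embeddings:
        best_label, best_dist = None, threshold
        for label, rep in enumerate(representatives):
            dist = cosine_distance(emb, rep)
            if dist < best_dist:
                best_label, best_dist = label, dist
        if best_label is None:
            representatives.append(emb)
            best_label = len(representatives) - 1
        labels.append(best_label)
    return labels


def diarise_faces(frames, detect: Callable, crop: Callable, encode: Callable,
                  threshold: float = 0.3) -> List[int]:
    """Detect, crop and encode faces per frame, then cluster all embeddings."""
    embeddings = []
    for frame in frames:
        for box in detect(frame):        # run face detection
            face = crop(frame, box)      # extract the cropped face
            embeddings.append(encode(face))  # encode to an embedding
    return cluster_embeddings(embeddings, threshold)


# Toy data: each "frame" is a list of 2-D vectors that play the role of faces.
frames = [[(1.0, 0.0), (0.0, 1.0)], [(0.9, 0.1)], [(0.1, 0.95)]]
labels = diarise_faces(
    frames,
    detect=lambda frame: range(len(frame)),  # hypothetical detector: one box per item
    crop=lambda frame, box: frame[box],      # hypothetical crop: pick the item
    encode=lambda face: face,                # hypothetical encoder: identity
)
print(labels)  # [0, 1, 0, 1]
```

In practice the stand-ins would be replaced by a real detector and embedding model, and the greedy step by an established clustering algorithm. The overall shape (per-frame detection and encoding, then one clustering pass over all embeddings) is the part that mirrors the pipeline.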

Vocabulary

corpus (noun)

A collection of writings, often on a specific topic, of a specific genre, from a specific demographic or a particular author, etc.
(specifically) Such a collection in the form of an electronic database used for linguistic analyses.
A body, a collection.

References

QUT Week9 Materials
