Diarisation

  • Speaker Diarisation answers the question "who spoke when?"
    • For a corpus of audio, determine which clips of audio belong to the same speaker.
  • Face Diarisation performs the analogous task for faces.
    • For a corpus of video/images, group faces based on identities.

Face Diarisation Pipeline

For each frame to process:

  • Run face detection
  • Extract cropped faces
  • Encode cropped faces to obtain an embedding to represent the face

Finally, cluster the embeddings from all frames so that each cluster groups the faces of one identity.
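
The pipeline above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a reference implementation: `detect_faces` and `encode_face` are hypothetical callables standing in for whatever face detector and embedding model is used, frames are assumed to be plain 2-D lists of pixel values, boxes are assumed to be `(top, left, bottom, right)` tuples, and the clustering step is a simple greedy cosine-similarity grouping with an arbitrary threshold, chosen here only because it needs no external libraries.

```python
import math
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # assumed layout: (top, left, bottom, right)
Frame = List[List[float]]         # assumed: rows of pixel values
Embedding = Sequence[float]


def crop(frame: Frame, box: Box) -> Frame:
    """Extract the rectangular region described by box from the frame."""
    top, left, bottom, right = box
    return [row[left:right] for row in frame[top:bottom]]


def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_embeddings(embeddings: Sequence[Embedding],
                       threshold: float = 0.8) -> List[int]:
    """Greedy clustering: join the first cluster whose representative
    (its first member) is similar enough, otherwise start a new cluster."""
    representatives: List[Embedding] = []
    labels: List[int] = []
    for emb in embeddings:
        for label, rep in enumerate(representatives):
            if cosine_similarity(emb, rep) >= threshold:
                labels.append(label)
                break
        else:
            # No existing cluster matched, so this embedding starts a new one.
            representatives.append(emb)
            labels.append(len(representatives) - 1)
    return labels


def diarise_faces(frames: Sequence[Frame],
                  detect_faces: Callable[[Frame], List[Box]],
                  encode_face: Callable[[Frame], Embedding],
                  threshold: float = 0.8) -> List[int]:
    """Detect, crop and encode every face, then cluster the embeddings.
    Returns one cluster label per detected face, in detection order."""
    embeddings: List[Embedding] = []
    for frame in frames:
        for box in detect_faces(frame):                # run face detection
            face = crop(frame, box)                    # extract cropped face
            embeddings.append(encode_face(face))       # encode to an embedding
    return cluster_embeddings(embeddings, threshold)   # finally, cluster
```

The greedy grouping depends on the order of the embeddings and on the threshold, so in practice a dedicated clustering algorithm (for example agglomerative clustering) would likely be swapped in for `cluster_embeddings`.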

References