Automatic Speaker Verification Challenge

Organized by vuhl

Challenge Timeline

  • Public Test: April 1, 2025, midnight UTC
  • Private Test: Sept. 1, 2025, midnight UTC
  • Competition Ends: Dec. 31, 2025, midnight UTC

Automatic Speaker Verification Challenge - Overview

Welcome to the Automatic Speaker Verification (ASV) Challenge! This competition focuses on developing and evaluating speaker verification systems that can accurately determine whether two speech utterances belong to the same speaker.

Challenge Description

Speaker verification is a biometric authentication technology that verifies a person's claimed identity based on their voice characteristics. It answers the question: "Is this the person they claim to be?"

In this challenge, participants will develop systems that:

  • Process pairs of speech utterances
  • Generate a similarity score for each pair, where a higher score indicates that the two utterances are more likely to come from the same speaker (a minimal scoring sketch follows this list)
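
To make this concrete, here is a minimal sketch of the scoring step, assuming each utterance has already been mapped to a fixed-dimensional speaker embedding by an encoder of your choice; the encoder itself is not shown, and the embedding dimension of 192 is only an illustrative assumption.

```python
# Minimal scoring sketch: cosine similarity between two speaker embeddings.
# Higher scores indicate a higher likelihood of the same speaker.
import numpy as np

def cosine_score(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    """Cosine similarity between two speaker embeddings."""
    emb_a = emb_a / np.linalg.norm(emb_a)
    emb_b = emb_b / np.linalg.norm(emb_b)
    return float(np.dot(emb_a, emb_b))

if __name__ == "__main__":
    # Placeholder embeddings; in a real system these come from your speaker encoder.
    rng = np.random.default_rng(0)
    emb1, emb2 = rng.normal(size=192), rng.normal(size=192)
    print(f"similarity = {cosine_score(emb1, emb2):.4f}")
```

Cosine scoring is only one common choice; any scoring back-end that assigns higher scores to same-speaker pairs is acceptable.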

Dataset: Vietnam-Celeb

This challenge uses the Vietnam-Celeb dataset, a large-scale corpus of spontaneous speech recorded in noisy environments, with over 87,000 utterances from 1,000 Vietnamese speakers of many professions, covering the three main Vietnamese dialects.

Key features of the dataset:

  • 87,000+ utterances
  • 1,000 Vietnamese speakers
  • Coverage of 3 main Vietnamese dialects
  • Recorded in noisy environments
  • Spontaneous speech
  • Speakers from diverse professional backgrounds

Challenge Phases

  1. Public Test Phase: Participants submit their predictions on the public test set to get feedback.
  2. Private Test Phase: Final evaluation on the private test set to determine the winners.

Evaluation

Systems will be evaluated using the Equal Error Rate (EER) metric, which is the operating point where the False Acceptance Rate (FAR) equals the False Rejection Rate (FRR). Lower EER values indicate better performance.

Contact

For questions or support, please use the competition forum or contact the organizers at [email protected], cc-ing [email protected], [email protected].

Automatic Speaker Verification Challenge - Evaluation

Evaluation Criteria

Metric: Equal Error Rate (EER)

The primary evaluation metric for this challenge is the Equal Error Rate (EER), expressed as a percentage (%). EER is the operating point where the False Acceptance Rate (FAR) equals the False Rejection Rate (FRR).

Understanding EER

In speaker verification:

  • False Acceptance Rate (FAR): The percentage of impostor trials incorrectly accepted as genuine speakers.
  • False Rejection Rate (FRR): The percentage of genuine speaker trials incorrectly rejected.

The EER is the point where FAR = FRR. Lower EER values indicate better performance. A perfect system would have an EER of 0%, while a random classifier would have an EER around 50%.

We use the following implementation to calculate EER: https://github.com/wenet-e2e/wespeaker/blob/310a15850895b54e20845e107b54c9a275d39a2d/wespeaker/utils/score_metrics.py#L79
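
For local checking, the following sketch computes EER from an ROC curve in the same spirit. It is not the official scoring script, and small numerical differences from the wespeaker implementation are possible; the toy scores and labels are made up for illustration.

```python
# Rough EER computation for local sanity checks (not the official scoring script).
import numpy as np
from sklearn.metrics import roc_curve

def compute_eer(scores: np.ndarray, labels: np.ndarray) -> float:
    """Return EER in percent. labels: 1 = same speaker, 0 = different speakers."""
    far, tpr, _ = roc_curve(labels, scores)   # far = false acceptance rate
    frr = 1.0 - tpr                           # frr = false rejection rate
    idx = np.nanargmin(np.abs(far - frr))     # operating point where FAR ~= FRR
    return float((far[idx] + frr[idx]) / 2.0 * 100.0)

if __name__ == "__main__":
    scores = np.array([0.75, 0.23, 0.92, 0.10, 0.60])
    labels = np.array([1, 0, 1, 0, 1])
    print(f"EER = {compute_eer(scores, labels):.2f}%")  # perfectly separated -> 0.00%
```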

Submission Format

Your submission should be a ZIP file containing a single text file named predictions.txt, in which each line is the similarity score for one trial pair. The order of the lines must exactly match the order of the trials in the provided test list. For example, if the first line of the test list is "audio1.wav audio2.wav", then the first line of your predictions.txt should be the similarity score for that pair.
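
As an illustration, a submission archive could be assembled as in the sketch below; the score values and the archive name are arbitrary, and only the predictions.txt filename and the line order matter.

```python
# Hypothetical helper that writes one score per line, in test-list order, and zips it.
import zipfile

def write_submission(scores, zip_path="submission.zip"):
    """scores: iterable of floats, one per trial, in the same order as the test list."""
    with open("predictions.txt", "w") as f:
        for s in scores:
            f.write(f"{s:.6f}\n")              # plain floats, one per line, no quotes
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.write("predictions.txt")            # the archive itself may have any name

write_submission([0.75, 0.23, 0.92])
```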

Evaluation Process

The evaluation process works as follows:

  1. Your submitted similarity scores are compared with the ground truth labels (same speaker = 1, different speaker = 0).
  2. A ROC curve is plotted using your scores.
  3. The EER is calculated as the point where FAR = FRR on this curve.
  4. The EER percentage is reported on the leaderboard (lower is better).

Example

For example, if your system produces these similarity scores:

0.75
0.23
0.92

And the ground truth is:

test_001    enroll_005    1    # Same speaker
test_002    enroll_008    0    # Different speakers
test_003    enroll_003    1    # Same speaker

Then your system would be performing well because it assigned higher scores to same-speaker pairs and lower scores to different-speaker pairs.

Ranking

Participants will be ranked based on their EER score, with lower values being better. In case of ties, earlier submissions will be ranked higher.

Automatic Speaker Verification Challenge - Terms and Conditions

Participation Rules

  1. Participation in this challenge is open to individuals and teams worldwide.
  2. Teams can consist of up to 5 members.
  3. Each participant may be a member of only one team.
  4. Participants must register for the challenge before making submissions.
  5. The organizers reserve the right to disqualify any participant who violates these terms or engages in unethical behavior.

Submission Guidelines

  1. All submissions must be made through the challenge platform.
  2. Participants are limited to the maximum number of submissions specified for each phase.
  3. Submissions must follow the required format as described in the evaluation guidelines.
  4. Participants must not attempt to reverse-engineer the test set or ground truth data.
  5. Manual labeling of test data is strictly prohibited.

Data Usage

  1. The dataset provided for this challenge may only be used for participating in this competition.
  2. Participants are allowed to use additional external data for training their models, but this must be clearly documented.
  3. Redistribution of the challenge dataset is strictly prohibited.
  4. After the competition, participants may use the dataset for research purposes and must cite the dataset appropriately.

Intellectual Property

  1. Participants retain ownership of their submissions and the intellectual property rights to their methods.
  2. By submitting to the challenge, participants grant the organizers a non-exclusive, worldwide, royalty-free license to use their submissions for evaluating and presenting the challenge results.
  3. Participants agree that the organizers may publish their team name, member names, and performance results.

Publication and Recognition

  1. The organizers plan to publish a summary of the challenge results, including the methods used by top-performing teams.
  2. Top-performing teams may be invited to present their methods at a related workshop or conference.
  3. Participants are encouraged to publish their methods, citing the challenge appropriately.

Privacy

  1. Personal information provided during registration will be used solely for the purposes of the challenge and will not be shared with third parties.
  2. Participants' names and affiliations may be published on the challenge leaderboard and in challenge-related publications.

Disclaimer

The challenge organizers reserve the right to modify these terms and conditions at any time. Participants will be notified of any changes. The decisions of the challenge organizers regarding any aspect of the competition are final.

Contact

For questions or clarifications regarding these terms, please contact the challenge organizers at [email protected], cc-ing [email protected], [email protected].

Automatic Speaker Verification Challenge - FAQ

Q: Why is my result showing 50% EER?

A: If the function that calculates EER fails to run on your submission, it returns a default of 50% EER (which is equivalent to random guessing). You can check the output log and error log of your submission in the submission window to debug. Please make sure that:

- The scores match the order of the pairs in the provided test list.

- The file is named exactly "predictions.txt" (with an "s"), and the text file is zipped in an archive that may have any name.

- Each score is a plain float, not a string, and is not wrapped in quotes.

That said, if your result is somewhat worse than 50%, congratulations! Your submission is bug-free, just not performing well yet. Try to improve it!
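
Before submitting, a quick local check along these lines can catch most formatting problems; the trial-list filename used here is an assumption, so substitute the test list you were actually given.

```python
# Quick format check for predictions.txt before zipping and submitting.
def check_predictions(pred_path="predictions.txt", trial_path="trials.txt"):
    preds = open(pred_path).read().splitlines()
    trials = open(trial_path).read().splitlines()
    if len(preds) != len(trials):
        raise ValueError(f"{len(preds)} scores but {len(trials)} trial pairs")
    for i, line in enumerate(preds, start=1):
        try:
            float(line)                        # each line must parse as a plain float
        except ValueError:
            raise ValueError(f"line {i} is not a valid float: {line!r}")
    print("predictions.txt looks OK")

# Example usage (assumed filenames):
# check_predictions("predictions.txt", "trials.txt")
```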

 

Q: What dataset can I use for training and development?

A: You can use Vietnam-Celeb-T (provided, downloadable) as the training set, together with any other datasets you can find. However, you must publicly disclose any additional datasets on our Forum.

For development, you can use Vietnam-Celeb-E and Vietnam-Celeb-H, which are provided as part of the Vietnam-Celeb dataset. The public and private test sets used in the challenge are completely separate and have never been published before.

 

 
