Results



As of 30 November 2003, 33 teams had registered for the competition, most of them from academia and a few from industry. Of these 33 teams, 16 submitted programs for Task 1 and 13 submitted programs for Task 2 by the deadline (31 December 2003). Some teams entered submissions for both tasks. One team, which participated in both tasks, submitted a program that required licensed software to run, and it eventually withdrew from the competition. This left a total of 15 teams for Task 1 and 12 teams for Task 2. All of the remaining teams are from academia and come from nine different countries (Australia, China, France, Germany, Korea, Singapore, Spain, Turkey, and United States).

The testing protocol was as follows. Each program was evaluated on two signature data sets. The first data set, called the training set, consists of genuine signatures and skilled forgeries for 40 users. It was released to the participants before program submission so that they could build and evaluate their systems. The second data set, called the test set, consists of genuine signatures and skilled forgeries for 60 users. This set was not released to the participants. For each user in either set, 10 trials were run, each using a different random subset of five genuine signatures drawn from files S01-S10 for enrollment. After each enrollment trial, the program was evaluated on 10 genuine signatures (S11-S20), 20 skilled forgeries (S21-S40), and 20 random forgeries selected randomly from the genuine signatures of 20 other users. Wherever randomness was involved, the same random sets were used for all teams to ensure fairness. We report the equal error rates (EER) and ROC curves separately for skilled forgeries and random forgeries.
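The trial construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the organizers' actual evaluation code: the function names `files` and `build_trials`, the seed value, and the choice of taking one genuine signature (from S01-S20) from each of 20 other users are all hypothetical. Seeding the generator per user is one way to make the same random sets reusable for every team.

```python
import random

def files(first, last):
    """Return file labels such as S01..S10 (inclusive range)."""
    return [f"S{i:02d}" for i in range(first, last + 1)]

def build_trials(user, all_users, n_trials=10, seed=2004):
    """Build the enrollment and test sets for one user.

    Assumption: each random forgery is one genuine signature (S01-S20)
    taken from one of 20 distinct other users. The seed is hypothetical.
    """
    # A fixed per-user seed gives identical random sets for all teams.
    rng = random.Random(f"{seed}:{user}")
    others = [u for u in all_users if u != user]
    trials, seen = [], set()
    while len(trials) < n_trials:
        # Five genuine signatures from S01-S10 for enrollment.
        enroll = tuple(sorted(rng.sample(files(1, 10), 5)))
        if enroll in seen:
            continue  # the 10 enrollment subsets must differ
        seen.add(enroll)
        donors = rng.sample(others, 20)
        trials.append({
            "enroll": list(enroll),
            "genuine": files(11, 20),   # 10 genuine test signatures
            "skilled": files(21, 40),   # 20 skilled forgeries
            "random": [(d, rng.choice(files(1, 20))) for d in donors],
        })
    return trials
```

Calling `build_trials(1, list(range(1, 41)))` would produce the 10 trials for the first of the 40 training-set users, and calling it again with the same arguments returns the same sets.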

The programs of some teams encountered problems during the evaluation process. In particular, they failed to report similarity scores for some input signatures during the training and/or testing phase. To keep the comparison fair, EER statistics and ROC curves are not reported for these programs. In addition to the average EER over all users and all 10 trials for each team, we also report the standard deviation and the maximum and minimum EER values. For both tasks, Team 6 (i.e., 106 and 206), from Sabanci University, Turkey, achieved the lowest average EER when tested with skilled forgeries.
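As a rough illustration of the reported statistics, the sketch below estimates an EER from similarity scores and then summarizes a list of per-trial EER values. It rests on several assumptions that the text does not confirm: a higher score is taken to mean "more likely genuine", the EER is approximated at the threshold where the false acceptance and false rejection rates are closest, and the standard deviation is the population form. The names `equal_error_rate` and `summarize` are hypothetical.

```python
from statistics import mean, pstdev

def equal_error_rate(genuine, forgery):
    """Approximate the EER from two lists of similarity scores.

    Assumption: a signature is accepted when its score >= threshold.
    """
    best_gap, eer = None, None
    for t in sorted(set(genuine) | set(forgery)):
        frr = sum(s < t for s in genuine) / len(genuine)   # genuine rejected
        far = sum(s >= t for s in forgery) / len(forgery)  # forgery accepted
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer

def summarize(eers):
    """Average, standard deviation, maximum and minimum of EER values."""
    return {
        "average": mean(eers),
        "sd": pstdev(eers),  # population SD; the source does not say which form
        "max": max(eers),
        "min": min(eers),
    }
```

In this sketch, `equal_error_rate` would be called once per user and trial, separately for skilled and random forgeries, and `summarize` would then be applied to the resulting list of values for each team.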

EER statistics: Task 1, Task 2

ROC curves: Task 1, Task 2