As of 30 November 2003, 33 teams had registered for the competition,
most of them from academia and a few from industry. Of the 33 registered
teams, 16 submitted programs for Task 1 and 13 for Task 2 by the deadline
(31 December 2003); some teams entered submissions for both tasks. One
team, which participated in both tasks, submitted a program that required
licensed software to run, and this team eventually withdrew from the
competition. This left a total of 15 teams for Task 1 and 12 teams for
Task 2. All of the remaining teams were from academia, representing nine
different countries (Australia, China, France, Germany, Korea, Singapore,
Spain, Turkey, and the United States).
The testing protocol is as follows. Each program was evaluated on
two signature data sets. The first data set, called the training
set, consists of genuine signatures and skilled forgeries for 40
users. It was released to the participants for system building and
evaluation prior to program submission. The second set, called the
test set, consists of genuine signatures and skilled forgeries for
users. This set was not released to the participants. For each
user from either the training set or the test set, 10 trials were
run, each using a different random subset of five genuine signatures
drawn from files S01-S10 for enrollment. After each enrollment trial,
the program was evaluated on 10 genuine signatures (S11-S20), 20
skilled forgeries (S21-S40), and 20 random forgeries selected at random
from the genuine signatures of 20 other users. Whenever randomness was
involved, the same random sets were used for all teams for fairness.
We report the equal error rates (EER) and ROC curves separately
for skilled forgeries and random forgeries.
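The per-user trial structure described above can be sketched in Python. Note that the `score_fn` callback, the user identifiers, and the way signature files are represented here are illustrative placeholders, not the actual competition harness; only the counts (10 trials, 5 enrollment signatures from S01-S10, 10 genuine test signatures, 20 skilled forgeries, 20 random forgeries) follow the protocol in the text.

```python
import random

# S01-S20 are genuine signatures; S21-S40 are skilled forgeries.
GENUINE = [f"S{i:02d}" for i in range(1, 21)]
SKILLED = [f"S{i:02d}" for i in range(21, 41)]

def run_trials(user, all_users, score_fn, n_trials=10, seed=0):
    """Run 10 enrollment trials for one user and collect similarity scores.

    A fixed seed stands in for the competition's rule that the same
    random sets were used for all teams.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        enroll = rng.sample(GENUINE[:10], 5)   # 5 of S01-S10 for enrollment
        test_genuine = GENUINE[10:]            # S11-S20 for testing
        others = [u for u in all_users if u != user]
        random_forgers = rng.sample(others, 20)  # 20 random forgeries
        scores = {
            "genuine": [score_fn(user, enroll, (user, s)) for s in test_genuine],
            "skilled": [score_fn(user, enroll, (user, s)) for s in SKILLED],
            "random": [score_fn(user, enroll, (u, rng.choice(GENUINE)))
                       for u in random_forgers],
        }
        results.append(scores)
    return results
```

Each trial thus yields 10 genuine, 20 skilled-forgery, and 20 random-forgery similarity scores per user, from which the error rates below are computed.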
Some teams' programs encountered problems during the evaluation
process. In particular, they failed to report similarity scores
for some input signatures during the training and/or testing phase.
For fairness of comparison, EER statistics and ROC curves are not
reported for these programs. Besides reporting the average EER over
all users and all 10 trials for each team, we also report the standard
deviation and the maximum and minimum EER values. For both tasks, Team
6 (i.e., 106 and 206), from Sabanci University, Turkey, achieved the
lowest average EER values when tested with skilled forgeries.
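For reference, an equal error rate of the kind reported here is the operating point at which the false-accept rate (forgeries accepted) equals the false-reject rate (genuine signatures rejected). The sketch below estimates it by sweeping a decision threshold over the observed similarity scores; it assumes higher scores mean "more likely genuine" and is an illustrative implementation, not the competition's actual evaluation code.

```python
def equal_error_rate(genuine_scores, forgery_scores):
    """Estimate the EER by finding the threshold where FAR and FRR meet.

    Assumes higher similarity scores indicate genuine signatures.
    """
    thresholds = sorted(genuine_scores + forgery_scores)
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False-reject rate: genuine signatures scoring below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False-accept rate: forgeries scoring at or above the threshold.
        far = sum(s >= t for s in forgery_scores) / len(forgery_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

Computing this separately over skilled-forgery scores and random-forgery scores gives the two EER figures reported for each team.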