For the K=1 and K=3 metrics, do you use the first sample and the first three samples of the six we submit for each test example, or are they selected randomly?
Also, the evaluation criteria say: "The leaderboard will be sorted by minFDE at K=6. The rankings will be based on this metric too." However, the current ranking does not appear to be strictly sorted by minFDE at K=6. Are you using minADE instead?
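
To make the question concrete, here is a minimal sketch of how I understand the two metrics. It assumes that the metrics at K=1 and K=3 take the first one and the first three of the six submitted samples, which is exactly the assumption I am asking about. The function names and the toy data are hypothetical and are not taken from the official evaluation code.

```python
import math

def min_fde(samples, gt, k):
    """Smallest final-point L2 error among the first k samples (assumed selection)."""
    return min(math.dist(s[-1], gt[-1]) for s in samples[:k])

def min_ade(samples, gt, k):
    """Smallest mean point-wise L2 error among the first k samples (assumed selection)."""
    return min(
        sum(math.dist(p, q) for p, q in zip(s, gt)) / len(gt)
        for s in samples[:k]
    )

# Toy ground truth and three predicted trajectories, each a list of (x, y) points.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
samples = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],  # drifts away from the ground truth
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)],  # close everywhere, small endpoint miss
    [(0.0, 1.0), (1.0, 1.0), (2.0, 0.0)],  # exact endpoint, offset path
]

print(min_fde(samples, gt, 1), min_fde(samples, gt, 3))
print(min_ade(samples, gt, 1), min_ade(samples, gt, 3))
```

In this toy case the sample with the best endpoint error is not the one with the best average error, so a leaderboard sorted by minADE can order entries differently from one sorted by minFDE.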