Evaluation Metrics and Submission


Submissions are evaluated by comparing each predicted mask with the corresponding ground truth mask. Metrics from the two most widely used families are included: volume-based and surface-based.

Volume-based metrics include DiceCoefficient, JaccardCoefficient, and VolumeSimilarity. These metrics are computed with SimpleITK.LabelOverlapMeasuresImageFilter().
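For orientation, the three overlap measures reduce to simple voxel counts. The sketch below restates the usual definitions in plain Python. It is not the SimpleITK implementation used for scoring. The function name overlap_metrics, the flat 0/1 input format, and the NaN result for two empty masks are hypothetical choices made for illustration. The sign convention for volume similarity (first argument minus second) is also an assumption and should be checked against the official evaluation code.

```python
import math


def overlap_metrics(gt, pred):
    """Volume-based overlap between two binary masks.

    gt, pred: flat sequences of 0/1 values of equal length
    (hypothetical input format chosen for this sketch).
    """
    if len(gt) != len(pred):
        raise ValueError("masks must have the same number of voxels")

    n_gt = sum(1 for g in gt if g)                        # |GT|
    n_pred = sum(1 for p in pred if p)                    # |Pred|
    n_both = sum(1 for g, p in zip(gt, pred) if g and p)  # |GT and Pred|
    n_union = n_gt + n_pred - n_both                      # |GT or Pred|

    if n_gt + n_pred == 0:
        # Assumption: both masks empty, so the ratios are undefined.
        nan = float("nan")
        return {"dice": nan, "jaccard": nan, "volume_similarity": nan}

    return {
        # Dice = 2 |A and B| / (|A| + |B|)
        "dice": 2.0 * n_both / (n_gt + n_pred),
        # Jaccard = |A and B| / |A or B|
        "jaccard": n_both / n_union,
        # Assumed sign convention: 2 (|A| - |B|) / (|A| + |B|), A = gt.
        "volume_similarity": 2.0 * (n_gt - n_pred) / (n_gt + n_pred),
    }


if __name__ == "__main__":
    print(overlap_metrics([1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0]))
```

Dice and Jaccard measure how much the two masks overlap, while volume similarity only compares their sizes, so two masks of equal volume score 0 on it even if they do not overlap at all.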

Surface-based metrics include AverageSurfaceDistance (in both directions, GT to Pred and Pred to GT), Hausdorff, and SurfaceDiceAt1mm. The implementation of these metrics is adopted from the MSD Challenge.
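To illustrate what the surface-based metrics measure, here is a simplified, point-based sketch on 2D masks in plain Python. It is not the MSD Challenge code and will not reproduce its numbers exactly: surface points are taken here as foreground pixels with a background or out-of-image 4-neighbour, every surface point is weighted equally, and the Hausdorff distance is the plain maximum rather than a percentile. All function and parameter names are hypothetical.

```python
import math


def surface_points(mask, spacing=(1.0, 1.0)):
    """Physical coordinates of foreground pixels that touch the background
    or the image border (4-neighbourhood). mask is a list of rows of 0/1."""
    rows, cols = len(mask), len(mask[0])
    points = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            on_surface = any(
                not (0 <= nr < rows and 0 <= nc < cols) or not mask[nr][nc]
                for nr, nc in neighbours
            )
            if on_surface:
                points.append((r * spacing[0], c * spacing[1]))
    return points


def directed_distances(src, dst):
    """For each point in src, the distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def surface_metrics(gt, pred, spacing=(1.0, 1.0), tolerance_mm=1.0):
    gt_pts = surface_points(gt, spacing)
    pred_pts = surface_points(pred, spacing)
    if not gt_pts or not pred_pts:
        raise ValueError("both masks need at least one foreground pixel")

    d_gt = directed_distances(gt_pts, pred_pts)    # GT to Pred
    d_pred = directed_distances(pred_pts, gt_pts)  # Pred to GT

    # Surface points of either mask lying within the tolerance of the other.
    within = sum(d <= tolerance_mm for d in d_gt)
    within += sum(d <= tolerance_mm for d in d_pred)

    return {
        "asd_gt_to_pred": sum(d_gt) / len(d_gt),
        "asd_pred_to_gt": sum(d_pred) / len(d_pred),
        "hausdorff": max(max(d_gt), max(d_pred)),
        "surface_dice": within / (len(d_gt) + len(d_pred)),
    }


if __name__ == "__main__":
    gt = [[0, 0, 0, 0, 0],
          [0, 1, 1, 0, 0],
          [0, 1, 1, 0, 0],
          [0, 0, 0, 0, 0]]
    pred = [[0, 0, 0, 0, 0],
            [0, 0, 1, 1, 0],
            [0, 0, 1, 1, 0],
            [0, 0, 0, 0, 0]]
    print(surface_metrics(gt, pred))
```

The two average surface distances are reported separately because they are not symmetric: a prediction with a spurious distant component raises Pred to GT far more than GT to Pred. The brute-force nearest-point search is quadratic in the number of surface points, which is fine for a toy example but too slow for full 3D volumes.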