Misusing Hausdorff's Distance

Adrien Foucart, PhD in biomedical engineering.


Hausdorff’s Distance is useful – but it can be tricky

I like Hausdorff’s Distance as a segmentation metric. It’s a way to measure the distance between the contours of two objects, and it provides useful information that isn’t captured by overall metrics such as the Intersection over Union or the Dice Similarity Coefficient. I talk about it much more extensively in my thesis.

What I want to focus on here, however, is how it’s implemented, and particularly how scikit-image’s implementation leads to some mistakes in the way it’s commonly used.

The example from scikit-image’s documentation shows how the metric is computed for two sets of points. This, however, is not how it would generally be used in a segmentation task.
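The documentation example boils down to something like this (paraphrased here; the exact shapes and coordinates are illustrative): two images that are empty except for a single nonzero point each, so the metric reduces to the distance between those two points.

```python
import numpy as np
from skimage.metrics import hausdorff_distance

# Two images, each zero everywhere except at a single point
shape = (7, 7)
image_a = np.zeros(shape, dtype=bool)
image_b = np.zeros(shape, dtype=bool)
image_a[3, 0] = True
image_b[6, 0] = True

# Hausdorff distance between the two sets of nonzero pixels
print(hausdorff_distance(image_a, image_b))  # 3.0
```

This works nicely for point sets, but it says nothing about what to feed the function in a typical segmentation workflow.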

In a segmentation task, we will generally be comparing two binary masks: one with the annotated object, and one with the prediction, as shown here:

Example of synthetic “ground truth” and “predicted” segmentation masks, as well as the overlapping “contours image,” with the pair of points determining Hausdorff’s distance marked in white.

Looking at the hausdorff_distance method in scikit-image, we can see that it expects two images as arguments. The documentation states that it computes the Hausdorff distance between “nonzero elements of given images,” which means that if we want to use it as a segmentation metric, we need to provide input images where the contours of the objects are the nonzero elements.

This, however, is a bit confusing for two reasons. First, most segmentation metrics are computed directly on the segmentation masks, so users of the library who don’t read the documentation closely enough could expect that behaviour to be implemented here. Second, scikit-image provides no method for computing a “contours image” from a segmentation mask. The find_contours method returns a list of contour point coordinates, not an image. It also uses interpolation to compute those coordinates, meaning that we get floating-point values which can’t be used directly to recreate a “contours image” from the output of find_contours.
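To see the second point concretely, here is a quick sketch (the mask is made up for illustration): find_contours returns subpixel float coordinates, which can’t serve directly as pixel indices into a new contours image.

```python
import numpy as np
from skimage.measure import find_contours

# A small made-up binary mask with a square object
mask = np.zeros((10, 10), dtype=bool)
mask[3:7, 3:7] = True

# find_contours interpolates at the given level, producing
# (N, 2) arrays of subpixel (row, column) coordinates
contours = find_contours(mask, 0.5)
print(contours[0].dtype)  # float64, not integer pixel indices
```

One would have to round or interpolate these coordinates back onto the pixel grid to build a contours image, which is exactly the step the library leaves to the user.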

The easiest way to get the intended results of the method is to do something like:

import numpy as np
from skimage.metrics import hausdorff_distance
from skimage.morphology import erosion

se = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])  # 4-connected structuring element
gt_contour = ground_truth ^ erosion(ground_truth, se)
predicted_contour = predicted ^ erosion(predicted, se)
distance = hausdorff_distance(gt_contour, predicted_contour)

Here, ground_truth and predicted are binary segmentation masks. The contours are found by eroding each mask and subtracting (XOR-ing) the eroded mask from the original, leaving only the outer layer of pixels, as shown in the previous figure.

Does it matter?

The confusion about how to properly compute Hausdorff’s distance is not limited to scikit-image. The 2015 Gland Segmentation challenge, for instance, describes the metric in its post-challenge publication (Sirinukunwattana et al., 2017) as “the most extreme value from all distances between the pairs of nearest pixels on the boundaries of S and G.” The mathematical definition on the challenge website – and their MATLAB implementation – however, actually computes the distances not just on the boundaries but on all nonzero pixels in the segmentation masks, thus replicating scikit-image’s behaviour.

In code accompanying recent publications that use scikit-image’s implementation, such as Bourigault et al., MICCAI 21 (code) or Le Bescond et al., MICCAI 22 (code), we see the same pattern: all pixels of the segmentation mask are considered instead of just the contours.

There can be a significant difference between the two ways of measuring the distance, as illustrated in the synthetic example below:

Example of synthetic “ground truth” and “predicted” segmentation masks, as well as the overlapping “contours image,” with the pair of points determining Hausdorff’s distance marked in white (using the points in the contours only) and black (using all points in the binary masks).
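To make the difference concrete, here is a small synthetic example of my own (not the one from the figure, and deliberately extreme): the “prediction” is just the one-pixel outline of the ground-truth square. The two contour images are then identical, so the contour-based distance is zero, while the full-mask distance is driven by the pixels deep inside the object.

```python
import numpy as np
from skimage.metrics import hausdorff_distance
from skimage.morphology import erosion

se = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])  # 4-connected structuring element

# Ground truth: a filled 30x30 square; prediction: only its 1-pixel outline
ground_truth = np.zeros((40, 40), dtype=bool)
ground_truth[5:35, 5:35] = True
predicted = ground_truth ^ erosion(ground_truth, se)

# Distance on the full masks: dominated by pixels deep inside the square
print(hausdorff_distance(ground_truth, predicted))  # 14.0

# Distance on the contour images: the two outlines coincide
gt_contour = ground_truth ^ erosion(ground_truth, se)
pred_contour = predicted ^ erosion(predicted, se)
print(hausdorff_distance(gt_contour, pred_contour))  # 0.0
```

Real predictions won’t be this pathological, but the example shows that the two conventions can disagree by an amount on the order of the object size, not just a pixel or two.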

From a quick search on GitHub, I have found other examples of “misuse” of scikit-image’s confusing implementation.

So what’s next?

I have raised an issue on scikit-image’s repository. To avoid backward compatibility issues, I don’t think it’s a good idea to change the behaviour of the method in scikit-image, but providing a clearer usage example, being more explicit in the docstring, and/or adding a function to compute the distance directly from the segmentation masks would all be worthwhile improvements.

I’ll see whether the scikit-image community agrees with me (or cares at all about the issues, to start with), and hopefully we can limit the confusion in the future.