When evaluating a standard classification model, we usually sort each
prediction into one of four categories: true positives, false positives, true
negatives, and false negatives. However, for the dense prediction task of image
segmentation, it's not immediately clear what counts as a "