Region-level Evaluation Metrics for Image Segmentation

What is Image Segmentation?

Image segmentation is a technique for dividing an image into sub-groups of pixels, called image segments, by assigning every pixel to the object or class it belongs to. It reduces the complexity of an image and makes it possible to analyse each segment separately.

Segmentation assigns a label to every pixel so that the objects in an image and their parts can be identified. It is commonly divided into two types: semantic segmentation, which labels each pixel with a class, and instance segmentation, which also separates individual objects of the same class. Object detection is one of the most common applications of image segmentation: the algorithm locates the objects of interest so that only the relevant regions need to be processed rather than the whole image. Image segmentation is used in various fields like face recognition, autonomous vehicles (driverless cars), image analysis, etc.

How Image Segmentation Works

Image segmentation takes an image as input and produces a segmented version of it as output. The output is a matrix (often called a mask or label map) of the same size as the image, in which each element holds the class of the corresponding pixel.
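To make the idea of an output matrix concrete, here is a minimal sketch in Python with NumPy. The 4x4 label map and the class meanings (0 = background, 1 = road, 2 = car) are made up for illustration and do not come from any real model.

```python
import numpy as np

# Hypothetical 4x4 segmentation output: each element is the class of one pixel.
# Assumed class ids: 0 = background, 1 = road, 2 = car.
label_map = np.array([
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 1],
    [0, 0, 1, 1],
])

# Count how many pixels were assigned to each class.
classes, counts = np.unique(label_map, return_counts=True)
pixels_per_class = {int(c): int(n) for c, n in zip(classes, counts)}
print(pixels_per_class)
```

Each class id maps to the number of pixels that carry it, which is often the first thing to check when inspecting a segmentation result.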

Image segmentation relies on high-quality, high-level features. Classical approaches derive them from cues such as edges and histograms, often combined with clustering techniques. Approaches based on machine learning require model training, which improves the accuracy and efficiency with which image features are identified. Deep neural networks are currently among the most effective ways to segment images.

The neural networks built for image segmentation typically have the following structure:

Encoder: A series of layers that extract image features with the help of filters. The encoder can be pre-trained on a similar task such as image recognition, so that the knowledge it has already learned improves accuracy.
Decoder: A series of layers that convert the features produced by the encoder into a segmentation mask, a matrix that gives the segment of every pixel in the input.
Connections: Long-range (skip) connections that pass features from encoder layers directly to the matching decoder layers, which helps to preserve fine detail and improve model accuracy.

There are different techniques for image segmentation:

Edge-based segmentation
Threshold-based segmentation
Cluster-based segmentation
Watershed segmentation
Region-based segmentation

After training an image segmentation model, we must evaluate it and check its performance. Several evaluation metrics are available for this, including accuracy, precision, IoU, etc. Let's understand the region-level evaluation metrics for image segmentation.

Region-Level Evaluation for Image Segmentation

Region-based segmentation divides an image into regions whose pixels share similar characteristics or features. Each region is a group of pixels that the algorithm builds around a seed point. Once a seed point has been chosen, the region is grown by adding neighbouring pixels that are similar to it, and regions can later be merged or shrunk.
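The seed-point idea can be sketched as a simple region-growing routine. This is only an illustration under stated assumptions: the function name `grow_region`, the intensity `tolerance` criterion, and the use of 4-connectivity are choices made for this example, not a method prescribed by the article.

```python
from collections import deque

import numpy as np


def grow_region(image, seed, tolerance):
    """Return a boolean mask of the pixels connected to `seed` whose intensity
    is within `tolerance` of the seed intensity (4-connectivity)."""
    height, width = image.shape
    seed_value = float(image[seed])
    mask = np.zeros((height, width), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        row, col = queue.popleft()
        # Visit the four direct neighbours of the current pixel.
        for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            r, c = row + d_row, col + d_col
            if 0 <= r < height and 0 <= c < width and not mask[r, c]:
                if abs(float(image[r, c]) - seed_value) <= tolerance:
                    mask[r, c] = True
                    queue.append((r, c))
    return mask


# Made-up grayscale image: a dark area on the left, a bright area on the right.
image = np.array([
    [10, 12, 90, 95],
    [11, 13, 92, 94],
    [10, 80, 91, 93],
    [9, 11, 12, 90],
])
region = grow_region(image, seed=(0, 0), tolerance=5)
print(region.astype(int))
```

Starting from the top-left pixel, the region spreads through the dark pixels and stops at the bright ones, because they differ from the seed by more than the tolerance.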

Region-level evaluation measures segmentation accuracy in terms of regions rather than individual pixels. When two segmentations are compared, for example a prediction and its ground truth, the result can be summarised as the total difference between their corresponding regions.

Region-level evaluation is used to evaluate the performance of algorithms that segment images into regions and objects. It differs from other evaluation metrics in that it judges how well whole regions are segmented instead of scoring each part of the image in isolation. Pixel-based metrics, for instance, score every pixel individually, which can be slow on large images and can give misleading results, for example when a large background dominates the score.

Let's understand the various region-level evaluation metrics.

1) Region precision and recall

Precision: Region precision is the proportion of pixels in the predicted region that also belong to the ground truth region, i.e. true positives / (true positives + false positives). High precision means the prediction contains few false positives, so the model has marked few areas that do not actually belong to the region.
Recall: Region recall is the proportion of pixels in the ground truth region that are correctly segmented, i.e. true positives / (true positives + false negatives). High recall means the model has found most of the ground truth region.
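The two definitions above can be sketched for a single region stored as a binary mask. The helper name `region_precision_recall` and the tiny example masks are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
import numpy as np


def region_precision_recall(pred, gt):
    """Precision and recall of a predicted binary mask against a ground truth mask."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    tp = np.logical_and(pred, gt).sum()    # predicted and really in the region
    fp = np.logical_and(pred, ~gt).sum()   # predicted but not in the region
    fn = np.logical_and(~pred, gt).sum()   # in the region but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return float(precision), float(recall)


# Made-up masks: 1 marks a pixel that belongs to the region.
gt = [[0, 1, 1, 0],
      [0, 1, 1, 0],
      [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 1],
        [0, 0, 0, 0]]

precision, recall = region_precision_recall(pred, gt)
print(precision, recall)
```

Here the prediction covers three of the four ground truth pixels and adds two pixels that are not in the region, so recall is higher than precision.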

2) Region intersection over union (Region IoU)

Region intersection over union (IoU) measures the overlap between a segmented region and its corresponding ground truth region. It is calculated as the ratio of the area of overlap to the area of union of the predicted and ground truth regions, and it ranges from 0 (no overlap) to 1 (perfect match).
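As a rough sketch, the ratio of overlap to union can be computed directly on binary masks. The function name `region_iou` and the example masks are illustrative, and treating two empty masks as a perfect match (IoU of 1.0) is an assumption of this example.

```python
import numpy as np


def region_iou(pred, gt):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    # Assumed convention: two empty masks count as a perfect match.
    return float(intersection / union) if union else 1.0


# Made-up masks: 1 marks a pixel that belongs to the region.
gt = [[0, 1, 1, 0],
      [0, 1, 1, 0],
      [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 1],
        [0, 0, 0, 0]]

iou = region_iou(pred, gt)
print(iou)
```

The masks share 3 pixels and together cover 6, so the IoU is 3 / 6.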

3) Confusion matrix

The region-level confusion matrix is an N x N matrix, where N is the number of regions (classes) to be predicted or segmented.

For each region, it yields four important terms:

True Positive: The model predicts a region that exists in the image (correctly predicted region).
False Positive: The model predicts a region that does not exist in the image.
False Negative: The model fails to predict a region that is present in the image.
True Negative: The model correctly predicts no region where none exists.
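The matrix and the four terms can be sketched as follows. This example assumes that the entries count pixels, that rows are ground truth labels and columns are predicted labels, and that labels are integers from 0 to N - 1; the function name `confusion_matrix` and the label maps are made up for illustration.

```python
import numpy as np


def confusion_matrix(pred, gt, num_classes):
    """N x N matrix: entry [i, j] counts pixels with true label i predicted as j."""
    pred = np.asarray(pred).ravel()
    gt = np.asarray(gt).ravel()
    # Encode each (true, predicted) pair as a single index, then count them.
    index = gt * num_classes + pred
    counts = np.bincount(index, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


# Made-up label maps with three regions (0, 1, 2).
gt = [[0, 0, 1],
      [2, 2, 1]]
pred = [[0, 1, 1],
        [2, 0, 1]]

matrix = confusion_matrix(pred, gt, num_classes=3)

# Per-region terms derived from the matrix.
tp = np.diag(matrix)               # correctly labelled pixels
fp = matrix.sum(axis=0) - tp       # predicted as the region, but belong elsewhere
fn = matrix.sum(axis=1) - tp       # belong to the region, but predicted elsewhere
tn = matrix.sum() - tp - fp - fn   # everything else
print(matrix)
```

The diagonal holds the correct predictions, so a good segmentation concentrates its counts there.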

4) Region F1 Score

It is the harmonic mean of precision and recall at the region level: F1 = 2 x precision x recall / (precision + recall). It combines both values in a single score, so a segmentation model scores well only when its precision and recall are both high.
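A minimal sketch of the score, assuming precision and recall have already been computed for a region. The function name `region_f1` is hypothetical, and returning 0.0 when both inputs are zero is an assumed convention.

```python
def region_f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumed convention when nothing was predicted correctly
    return 2 * precision * recall / (precision + recall)


# Example values chosen for illustration.
f1 = region_f1(precision=0.6, recall=0.75)
print(round(f1, 4))
```

Because it is a harmonic mean, the score is pulled towards the lower of the two values, so a model cannot hide poor recall behind high precision or vice versa.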

5) Region Rand Index

It is also known as the accuracy score: the number of correctly predicted regions, counting both true positives and true negatives, divided by the total number of predictions.

It can be defined by:

$$\text{Region Rand Index} = \frac{TP + TN}{TP + TN + FP + FN}$$
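Following the accuracy-style definition used in this section, the score can be sketched on binary masks as below. The function name `region_rand_index` and the example masks are made up for illustration.

```python
import numpy as np


def region_rand_index(pred, gt):
    """(TP + TN) divided by the total number of predictions."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    tp = np.logical_and(pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    return float((tp + tn) / pred.size)


# Made-up masks: 1 marks a pixel that belongs to the region.
gt = [[0, 1, 1, 0],
      [0, 1, 1, 0],
      [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 1],
        [0, 0, 0, 0]]

score = region_rand_index(pred, gt)
print(score)
```

Of the 12 pixels, 3 are true positives and 6 are true negatives, so 9 of the 12 predictions are correct.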

6) Region Dice Coefficient

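It measures the overlap between the segmented and ground truth regions, calculated as twice the area of overlap divided by the sum of the areas of the two regions. A higher region Dice coefficient indicates a more accurate segmentation.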
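The Dice coefficient is commonly computed as 2|A ∩ B| / (|A| + |B|) for a predicted region A and a ground truth region B. Here is a short sketch on binary masks; the function name `region_dice` and the masks are illustrative, and returning 1.0 for two empty masks is an assumed convention.

```python
import numpy as np


def region_dice(pred, gt):
    """Dice coefficient of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    intersection = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    # Assumed convention: two empty masks count as a perfect match.
    return float(2 * intersection / total) if total else 1.0


# Made-up masks: 1 marks a pixel that belongs to the region.
gt = [[0, 1, 1, 0],
      [0, 1, 1, 0],
      [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 1],
        [0, 0, 0, 0]]

dice = region_dice(pred, gt)
print(round(dice, 4))
```

The masks overlap in 3 pixels and contain 5 and 4 pixels respectively, which gives 2 x 3 / 9. For the same pair of masks, the Dice coefficient is never lower than the IoU.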

7) Mean Absolute Error

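It is the average absolute difference between the pixel values in the predicted and ground truth regions, and it measures the pixel-level error of the segmented image. However, it does not indicate the direction of the error, that is, whether the model under-predicted or over-predicted. The mean absolute error can be calculated as:

$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

where n is the number of pixels, y_i is the ground truth value of pixel i, and \hat{y}_i is its predicted value.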
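The mean absolute error can be sketched in a few lines. The example assumes the prediction is a map of per-pixel probabilities compared against a 0/1 ground truth mask; the function name and the values are made up for illustration.

```python
import numpy as np


def mean_absolute_error(pred, gt):
    """Average absolute per-pixel difference between prediction and ground truth."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.mean(np.abs(pred - gt)))


# Made-up 2x2 example: ground truth mask and predicted probabilities.
gt = [[1.0, 0.0],
      [1.0, 1.0]]
pred = [[0.9, 0.2],
        [0.4, 1.0]]

mae = mean_absolute_error(pred, gt)
print(round(mae, 4))
```

The absolute errors are 0.1, 0.2, 0.6 and 0.0, so every pixel contributes in proportion to its error regardless of its sign.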

8) Root Mean Square Error

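It is the square root of the average of the squared differences between the pixel values in the predicted and ground truth regions. It measures the overall magnitude of the errors in the segmented image and, because the differences are squared, it penalises large prediction errors more heavily. It can be calculated as:

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$

where n is the number of pixels, y_i is the ground truth value of pixel i, and \hat{y}_i is its predicted value.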
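A matching sketch for the root mean square error, under the same assumption of a per-pixel probability map compared against a 0/1 ground truth mask. The function name and the values are made up for illustration.

```python
import numpy as np


def root_mean_square_error(pred, gt):
    """Square root of the average squared per-pixel difference."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.sqrt(np.mean((pred - gt) ** 2)))


# Made-up 2x2 example: ground truth mask and predicted probabilities.
gt = [[1.0, 0.0],
      [1.0, 1.0]]
pred = [[0.9, 0.2],
        [0.4, 1.0]]

rmse = root_mean_square_error(pred, gt)
print(round(rmse, 4))
```

The single large error of 0.6 dominates the result, which is why the value is noticeably higher than the mean absolute error of 0.225 for the same data.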