Researchers from China Propose a Data Augmentation Approach CarveMix for Brain Lesion Segmentation

Automated brain lesion segmentation using convolutional neural networks (CNNs) has become a valuable tool for clinical diagnosis and research. However, CNN-based approaches still struggle to segment brain lesions accurately because annotated training data are scarce. Data augmentation strategies that mix pairs of annotated images have been developed to improve CNN training, but the existing image-mixing methods are not designed for brain lesions and may not perform well for brain lesion segmentation.

Before CNN-based approaches became common, automated brain lesion segmentation relied on traditional machine-learning techniques. Recent developments in CNNs have substantially improved segmentation performance. Examples include 3D DenseNet, U-Net, Context-Aware Network (CANet), and uncertainty-aware CNN, which have been proposed for segmenting various types of brain lesions. Despite these advances, accurately segmenting brain lesions remains challenging.

Thus, a research team from China recently proposed a simple and effective data augmentation approach called CarveMix, which is lesion-aware and preserves the lesion information during image combination.

CarveMix is designed specifically for CNN-based brain lesion segmentation. It stochastically combines two annotated images to obtain new labeled samples. CarveMix carves a region of interest (ROI) of variable size from one annotated image according to the lesion location and geometry. The carved ROI then replaces the corresponding voxels in a second annotated image, and the result is a new labeled image for network training. The method also applies additional harmonization steps for heterogeneous data from different sources and, for whole brain tumor segmentation specifically, models the mass effect during image mixing.
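The carve-and-replace operation described above can be written compactly as a masked combination. The notation here is an assumption made for illustration and is not taken from the article: $X_i, Y_i$ denote the image and annotation that provide the ROI, $X_j, Y_j$ denote the second image and its annotation, $M$ is a binary mask that equals 1 inside the carved ROI, and $\odot$ is voxel-wise multiplication.

```latex
\begin{align}
  X &= M \odot X_i + (1 - M) \odot X_j \\
  Y &= M \odot Y_i + (1 - M) \odot Y_j
\end{align}
```

Inside the ROI the synthetic pair $(X, Y)$ copies the first image and its annotation, and everywhere else it keeps the second image and its annotation, so image and label stay consistent voxel by voxel.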

Concretely, the main steps of the proposed approach for brain lesion segmentation are the following:

1. The authors use a set of 3D annotated images with brain lesions to train a CNN for automated brain lesion segmentation.
2. Data augmentation is performed on the annotated images using CarveMix, which is based on lesion-aware image mixing.
3. To perform image mixing, the authors take an annotated image pair and extract a 3D ROI from one image according to the lesion location and geometry given by the annotation.
4. The ROI is then mixed into the other image, replacing the corresponding region, and the annotation is adjusted accordingly.
5. Finally, synthetic images and their annotations are obtained that can be used to improve network training. The authors repeat the process to generate diverse annotated training data.
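The steps above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the article does not say how the ROI is derived from the lesion geometry, so the sketch assumes the ROI is the lesion mask grown by a randomly sampled number of voxels. The names `dilate`, `carve_mix`, and `max_margin` are hypothetical, and the harmonization and mass-effect steps are omitted.

```python
import numpy as np


def dilate(mask: np.ndarray, iterations: int) -> np.ndarray:
    """Grow a boolean 3D mask by `iterations` voxels (6-connected neighbourhood)."""
    out = mask.astype(bool).copy()
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        grown = padded[1:-1, 1:-1, 1:-1].copy()
        # OR the mask with each of its six face-neighbours.
        grown |= padded[:-2, 1:-1, 1:-1] | padded[2:, 1:-1, 1:-1]
        grown |= padded[1:-1, :-2, 1:-1] | padded[1:-1, 2:, 1:-1]
        grown |= padded[1:-1, 1:-1, :-2] | padded[1:-1, 1:-1, 2:]
        out = grown
    return out


def carve_mix(image_a, label_a, image_b, label_b, rng, max_margin=3):
    """Carve a lesion-centred ROI from (image_a, label_a) and paste it into
    (image_b, label_b).

    ASSUMPTION: the ROI is the lesion mask of image A dilated by a random
    margin in [0, max_margin]; this stands in for the article's
    "variable ROI size" and is not the paper's exact rule.
    """
    margin = int(rng.integers(0, max_margin + 1))
    roi = dilate(label_a > 0, margin)
    # Inside the ROI take image/label A, elsewhere keep image/label B.
    mixed_image = np.where(roi, image_a, image_b)
    mixed_label = np.where(roi, label_a, label_b)
    return mixed_image, mixed_label, roi


# Toy example with two synthetic 16x16x16 volumes.
rng = np.random.default_rng(0)
shape = (16, 16, 16)
image_a = rng.normal(size=shape)
image_b = rng.normal(size=shape)
label_a = np.zeros(shape, dtype=np.uint8)
label_a[6:9, 6:9, 6:9] = 1          # lesion in image A
label_b = np.zeros(shape, dtype=np.uint8)
label_b[1:3, 1:3, 1:3] = 1          # lesion in image B, far from the ROI

mixed_image, mixed_label, roi = carve_mix(image_a, label_a, image_b, label_b, rng)
```

Calling `carve_mix` repeatedly with different image pairs and random margins yields a pool of synthetic labeled volumes, which mirrors the repetition described in the final step.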

The proposed method was evaluated on several brain lesion segmentation datasets and compared to traditional data augmentation (TDA), Mixup, and CutMix. Results show that CarveMix+TDA outperformed the competing methods in terms of Dice coefficient, Hausdorff distance, precision, and recall. The proposed method reduced false negative predictions and under-segmentation of lesions. CarveMix alone, without online TDA, was also shown to be beneficial.
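To make the overlap metrics concrete, here is a small sketch of how Dice coefficient, precision, and recall are commonly computed for binary segmentation masks. It is not the evaluation code used in the study. The function name `overlap_metrics` is hypothetical, and returning 1.0 when a denominator is zero is a convention assumed here. Hausdorff distance is omitted for brevity.

```python
import numpy as np


def overlap_metrics(pred, target):
    """Dice, precision, and recall for two binary masks of the same shape.

    ASSUMPTION: when a denominator is zero (e.g. both masks are empty),
    the metric is reported as 1.0.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    tp = np.count_nonzero(pred & target)    # lesion voxels correctly found
    fp = np.count_nonzero(pred & ~target)   # predicted lesion, actually healthy
    fn = np.count_nonzero(~pred & target)   # missed lesion voxels
    dice = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return {"dice": dice, "precision": precision, "recall": recall}


# Under-segmentation: the prediction covers only half of a 4-voxel lesion.
target = np.array([1, 1, 1, 1, 0, 0])
pred = np.array([1, 1, 0, 0, 0, 0])
metrics = overlap_metrics(pred, target)
# precision is 1.0 (no false positives) while recall is 0.5 (two missed voxels)
```

In this toy case precision stays perfect while recall drops, which is the signature of the false negatives and under-segmentation that the article says CarveMix reduces.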

This article presented CarveMix, a data augmentation technique proposed for brain lesion segmentation. CarveMix combines pairs of annotated training images to create synthetic training images. The combination is lesion-aware: it takes into account the location and shape of the lesions, with a randomly sampled size parameter. Harmonization steps are introduced to keep the combination consistent when the data come from different sources. Additionally, mass effect modeling is incorporated to improve CarveMix specifically for whole brain tumor segmentation. Experimental results on four brain lesion segmentation tasks show that CarveMix improves accuracy and outperforms other data augmentation strategies.


Mahmoud is a PhD researcher in machine learning. He also holds a
bachelor's degree in physical science and a master's degree in
telecommunications and networking systems. His current areas of
research concern computer vision, stock market prediction and deep
learning. He produced several scientific articles about person re-
identification and the study of the robustness and stability of deep
