The dataset from EPFL's CVLab represents a section of the CA1 hippocampus region of the brain, with a volume of 1065x2048x1536 voxels and a voxel resolution of approximately 5x5x5 nm. It is available as multipage TIF files and includes mitochondria annotations for two sub-volumes, intended to support research on accurately segmenting mitochondria, synapses, and other structures. The dataset was created to encourage data sharing and accelerate neuroscientific research.

I downloaded the dataset from Kaggle. It contains four files:

  • training.tif
  • training_groundtruth.tif
  • test.tif
  • test_groundtruth.tif

Loading the TIF files in Python (with a TIFF-reading library such as tifffile) shows that they all have the same shape: [165, 768, 1024].
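The shape check described above can be sketched roughly as follows. This is a minimal sketch, not the exact code I ran: it assumes the third-party tifffile package is installed, and the helper names (`read_tif`, `collect_shapes`, `find_mismatches`) as well as the `./data/em` folder in the usage note are made up for illustration.

```python
from pathlib import Path

# Shape reported above, in (slices, height, width) order.
EXPECTED_SHAPE = (165, 768, 1024)


def read_tif(path):
    """Read a multipage TIF into a 3D array (assumes `pip install tifffile`)."""
    import tifffile  # imported lazily so the other helpers work without it

    return tifffile.imread(path)


def collect_shapes(data_dir, reader=read_tif):
    """Map every *.tif file name in data_dir to the shape of its volume."""
    return {
        path.name: tuple(reader(path).shape)
        for path in sorted(Path(data_dir).glob("*.tif"))
    }


def find_mismatches(shapes, expected=EXPECTED_SHAPE):
    """Return only the files whose shape differs from the expected one."""
    return {name: shape for name, shape in shapes.items() if shape != expected}
```

Calling `find_mismatches(collect_shapes("./data/em"))` should return an empty dict if all four files really share the same shape.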


To train a model that can segment the mitochondria from the volume, I use nnU-Net [1] as the framework. nnU-Net is a self-adapting framework for medical image segmentation built on U-Net-based architectures. Its key feature is that it automatically adjusts its parameters and configuration to the characteristics of the dataset it is applied to, without manual fine-tuning by the user. Many recent segmentation models, such as U-Mamba, also build on this framework.

To use this framework, I converted the dataset from tif to nii.gz and saved it in the location and layout that nnU-Net expects. The structure of the dataset looks like this:

├── Dataset666_ElectronMicroscopy
│   ├── dataset.json
│   ├── imagesTr
│   │   └── EM_000_0000.nii.gz
│   ├── imagesTs
│   │   └── EM_000_0000.nii.gz
│   └── labelsTr
│       └── EM_000.nii.gz
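A conversion along these lines can produce that layout. Treat it as a sketch under assumptions rather than my exact script: it relies on the third-party tifffile and SimpleITK packages, it assumes the ground-truth masks mark foreground with any non-zero value (nnU-Net expects consecutive integer labels starting at 0), and the channel name "EM", the label name "mitochondria", and all function names are my own choices for illustration.

```python
import json
from pathlib import Path


def build_dataset_json(num_training):
    """Minimal nnU-Net v2 dataset description for a single-channel, binary task."""
    return {
        "channel_names": {"0": "EM"},  # assumed name; one grayscale channel
        "labels": {"background": 0, "mitochondria": 1},  # assumed label name
        "numTraining": num_training,
        "file_ending": ".nii.gz",
    }


def case_filenames(case_id):
    """Image files carry a 4-digit channel suffix, label files do not."""
    return f"{case_id}_0000.nii.gz", f"{case_id}.nii.gz"


def tif_to_nifti(tif_path, out_path, is_label=False):
    """Convert one multipage TIF to NIfTI (assumes tifffile and SimpleITK)."""
    import SimpleITK as sitk
    import tifffile

    volume = tifffile.imread(tif_path)
    if is_label:
        # Map any non-zero mask value to class 1.
        volume = (volume > 0).astype("uint8")
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # Voxels are isotropic, so the default unit spacing keeps the geometry.
    sitk.WriteImage(sitk.GetImageFromArray(volume), str(out_path))


def write_dataset_json(dataset_dir, num_training):
    """Write dataset.json into the dataset folder and return its path."""
    dataset_dir = Path(dataset_dir)
    dataset_dir.mkdir(parents=True, exist_ok=True)
    path = dataset_dir / "dataset.json"
    path.write_text(json.dumps(build_dataset_json(num_training), indent=4))
    return path
```

For the single training volume here, `case_filenames("EM_000")` gives the two file names shown in the tree, and `write_dataset_json(..., num_training=1)` writes the accompanying dataset.json.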

To prepare the dataset, I use the following command:

nnUNetv2_plan_and_preprocess -d 666 --verify_dataset_integrity

This step converts the dataset into the preprocessed form that is actually used during training, laid out as follows:

└── Dataset666_ElectronMicroscopy
    ├── dataset_fingerprint.json
    ├── dataset.json
    ├── gt_segmentations
    │   └── EM_000.nii.gz
    ├── nnUNetPlans_2d
    │   ├── EM_000.npz
    │   └── EM_000.pkl
    ├── nnUNetPlans_3d_fullres
    │   ├── EM_000.npy
    │   ├── EM_000.npz
    │   ├── EM_000.pkl
    │   └── EM_000_seg.npy
    ├── nnUNetPlans_3d_lowres
    │   ├── EM_000.npz
    │   └── EM_000.pkl
    └── nnUNetPlans.json
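To see what nnU-Net decided for each configuration, the generated nnUNetPlans.json can be inspected. The snippet below is a small sketch that assumes the usual nnU-Net v2 plans layout, with a top-level "configurations" entry whose items may carry "patch_size", "batch_size" and "spacing"; the function name `summarize_plans` is hypothetical.

```python
import json
from pathlib import Path

# Keys of interest; a configuration that lacks one simply reports None.
KEYS = ("patch_size", "batch_size", "spacing")


def summarize_plans(plans_path):
    """Return {configuration name: {key: value}} for the selected keys."""
    plans = json.loads(Path(plans_path).read_text())
    return {
        name: {key: config.get(key) for key in KEYS}
        for name, config in plans["configurations"].items()
    }


def print_summary(summary):
    """Print one line per configuration."""
    for name, values in summary.items():
        print(name, values)
```

Pointing `summarize_plans` at the nnUNetPlans.json inside the preprocessed Dataset666_ElectronMicroscopy folder shows, for example, which patch size the 3d_fullres configuration will train with.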

After preparing the dataset, I used the base nnUNetTrainer to train the model. (I actually wanted to use U-Mamba, but GPU memory was the limiting factor.)

nnUNetv2_train 666 3d_fullres all
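The nnUNetv2_* commands in this post look up their folders through three environment variables: nnUNet_raw, nnUNet_preprocessed and nnUNet_results. A small wrapper like the one below can set them per call. It is only a sketch: the ./data base folder matches the raw path used in the prediction command, but placing the preprocessed and results folders next to it is my assumption, and the helper names are hypothetical.

```python
import os
import subprocess
from pathlib import Path


def nnunet_env(base_dir="./data"):
    """Build the three folder variables that nnU-Net v2 reads."""
    base = Path(base_dir).resolve()
    return {
        "nnUNet_raw": str(base / "nnUNet_raw"),
        # Assumed locations, placed next to the raw folder.
        "nnUNet_preprocessed": str(base / "nnUNet_preprocessed"),
        "nnUNet_results": str(base / "nnUNet_results"),
    }


def run_nnunet(command, base_dir="./data"):
    """Run a command with the nnU-Net folders added to the environment."""
    env = {**os.environ, **nnunet_env(base_dir)}
    return subprocess.run(command, env=env, check=True)
```

For example, `run_nnunet(["nnUNetv2_train", "666", "3d_fullres", "all"])` launches the training run above with the folders resolved relative to ./data.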

Part of the training configuration is as follows:

    "batch_size": "2",
    "dataloader_train.num_processes": "12",
    "dataloader_val.num_processes": "6",
    "device": "cuda:0",
    "enable_deep_supervision": "True",
    "fold": "all",
    "folder_with_segs_from_previous_stage": "None",
    "gpu_name": "NVIDIA GeForce RTX 4090 D",
    "inference_allowed_mirroring_axes": "(0, 1, 2)",
    "initial_lr": "0.01",
    "network": "PlainConvUNet",
    "num_epochs": "100",
    "num_input_channels": "1",
    "num_iterations_per_epoch": "250",
    "num_val_iterations_per_epoch": "20",
    "optimizer": "SGD",
    "oversample_foreground_percent": "0.33",
    "save_every": "10",
    "torch_version": "2.2.2+cu121",
    "unpack_dataset": "True",
    "was_initialized": "True",
    "weight_decay": "3e-05"
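A few of these numbers are easier to read once they are combined. The sketch below computes the total number of training iterations and the learning rate per epoch, assuming the trainer uses nnU-Net's default polynomial decay with exponent 0.9 (the configuration excerpt does not show the scheduler, so that part is an assumption).

```python
# Values copied from the configuration excerpt above.
INITIAL_LR = 0.01
NUM_EPOCHS = 100
ITERATIONS_PER_EPOCH = 250
BATCH_SIZE = 2


def poly_lr(epoch, initial_lr=INITIAL_LR, num_epochs=NUM_EPOCHS, exponent=0.9):
    """Polynomial decay: the rate falls from initial_lr to 0 over num_epochs."""
    return initial_lr * (1 - epoch / num_epochs) ** exponent


# 100 epochs * 250 iterations = 25000 optimizer steps.
total_iterations = NUM_EPOCHS * ITERATIONS_PER_EPOCH
# Each step sees a batch of 2 patches, so 50000 patches in total.
total_patches = total_iterations * BATCH_SIZE

if __name__ == "__main__":
    for epoch in (0, 25, 50, 75, 99):
        print(f"epoch {epoch:3d}: lr = {poly_lr(epoch):.6f}")
```

Under that assumption the learning rate starts at 0.01 and shrinks smoothly toward zero by the final epoch, which is worth keeping in mind when reading the loss curve.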


After training, the loss curve looks like this:

[Figure: Training loss]

To run prediction on the test dataset:

nnUNetv2_predict -i ./data/nnUNet_raw/Dataset666_ElectronMicroscopy/imagesTs/ -o ./output -d 666 -c 3d_fullres -f all --disable_tta
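The prediction written to ./output can be scored against the test ground truth with a Dice coefficient. The code below is a sketch, not something the run above produced: it assumes numpy, tifffile and SimpleITK are installed, treats any non-zero voxel as foreground in both volumes, and uses hypothetical function names.

```python
def dice_from_counts(tp, fp, fn):
    """Dice = 2*TP / (2*TP + FP + FN); defined as 1.0 when both masks are empty."""
    denominator = 2 * tp + fp + fn
    return 1.0 if denominator == 0 else 2 * tp / denominator


def confusion_counts(prediction, truth):
    """Count true positives, false positives and false negatives (needs numpy)."""
    import numpy as np

    prediction = np.asarray(prediction) > 0
    truth = np.asarray(truth) > 0
    tp = int(np.count_nonzero(prediction & truth))
    fp = int(np.count_nonzero(prediction & ~truth))
    fn = int(np.count_nonzero(~prediction & truth))
    return tp, fp, fn


def dice_for_files(prediction_path, truth_tif_path):
    """Compare a predicted .nii.gz volume with a ground-truth TIF stack."""
    import SimpleITK as sitk
    import tifffile

    prediction = sitk.GetArrayFromImage(sitk.ReadImage(str(prediction_path)))
    truth = tifffile.imread(truth_tif_path)
    return dice_from_counts(*confusion_counts(prediction, truth))
```

nnU-Net names each prediction after its case, so the file to pass as `prediction_path` here would be ./output/EM_000.nii.gz, together with the test ground-truth TIF from the download.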

Finally, the results on the test dataset look like this:



[1] F. Isensee et al., ‘nnU-Net: Self-adapting Framework for U-Net-Based Medical Image Segmentation’, CoRR, vol. abs/1809.10486, 2018.