The CellMap Segmentation Challenge presents a richly detailed collection featuring 289 meticulously annotated training volumes, each vetted by multiple experienced annotators, for use in machine learning network development. All of the data is part of the collection DOI: https://doi.org/10.25378/janelia.c.7456966 and can be explored on OpenOrganelle. This diverse dataset encompasses 22 eFIB-SEM datasets and over 40 unique classes of organelles, providing a wide array of cellular structures for analysis. The sections below summarize the datasets; to learn more about all of the data included, please refer to the collection DOI write-up.
The challenge offers participants the opportunity to advance AI-driven image segmentation by leveraging detailed and accurate annotations across a broad spectrum of organelle classes. This allows participants to develop and refine their models with high-quality data while creating algorithms that are adaptable and effective in various biological contexts and imaging conditions. With such a vast and well-annotated dataset, the challenge sets the stage for significant breakthroughs in electron microscopy image analysis, potentially transforming our understanding of cellular architecture and function.
Participants can access the CellMap Segmentation Challenge data as described below.
The training volumes used in the CellMap Segmentation Challenge are derived from 22 distinct eFIB-SEM datasets, comprising 9 cell culture samples and 13 tissue samples. These samples come from a variety of organisms, including mouse, Drosophila, zebrafish, and human. The tissue types span key biological regions and organs such as the brain, heart, liver, and kidney, offering a broad spectrum of biological contexts for participants to explore.
The datasets were acquired using enhanced Focused Ion Beam Scanning Electron Microscopy (eFIB-SEM)1,2,3 at ultra-high resolution, with isotropic voxel sizes of 4 nm or 8 nm. Samples were prepared using different preservation techniques, including both chemical fixation and high-pressure freezing. These varying preparation methods introduce additional complexity, requiring participants to develop segmentation algorithms capable of handling subtle differences in sample preparation and imaging quality.
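For orientation, the voxel size of each volume can be read directly from its multiscale metadata. Below is a minimal sketch, assuming the public OpenOrganelle S3 bucket (janelia-cosem-datasets) and an internal group path (recon-1/em/fibsem-uint8) that may vary between datasets; it requires the zarr and s3fs packages:

```python
# Minimal sketch: inspect the voxel sizes of one OpenOrganelle volume.
# Assumptions: the public janelia-cosem-datasets bucket and the
# "recon-1/em/fibsem-uint8" group path, which may differ per dataset.
import zarr  # s3fs must also be installed for s3:// URLs

root = zarr.open_group(
    "s3://janelia-cosem-datasets/jrc_hela-2/jrc_hela-2.zarr",
    mode="r",
    storage_options={"anon": True},  # the bucket allows anonymous reads
)
em = root["recon-1/em/fibsem-uint8"]

# OME-NGFF multiscales metadata records one scale transform per level.
for level in em.attrs["multiscales"][0]["datasets"]:
    scale = level["coordinateTransformations"][0]["scale"]
    print(level["path"], scale)  # e.g. "s0" [4.0, 4.0, 4.0] (nm per axis)
```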
Participants can access the data through the official GitHub repository, which includes scripts and instructions for downloading the datasets, training models, making predictions, and submitting results. The dataset is publicly available through the collection DOI: https://doi.org/10.25378/janelia.c.7456966, which should be cited in any related work.
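Because the volumes are stored as chunked, cloud-hosted Zarr arrays, small regions can also be streamed on demand rather than downloaded in full. Continuing the sketch above (the full-resolution level path s0 is again an assumption):

```python
# Minimal sketch: stream a small subvolume into NumPy without
# downloading the whole dataset. The "s0" (full-resolution) level
# path is an assumption carried over from the sketch above.
import zarr

em_s0 = zarr.open_array(
    "s3://janelia-cosem-datasets/jrc_hela-2/jrc_hela-2.zarr"
    "/recon-1/em/fibsem-uint8/s0",
    mode="r",
    storage_options={"anon": True},
)
print(em_s0.shape, em_s0.dtype)  # full-volume extents, typically uint8
block = em_s0[:128, :128, :128]  # only the chunks covering this region are read
print(block.shape, block.mean())
```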
The datasets in the CellMap Segmentation Challenge include nearly 80 distinct organelle and substructure classes, representing a wide variety of cellular components. The annotations are an extension of work started by the COSEM Project Team4. Each of the 289 training volumes has been carefully annotated by a team of experts, ensuring high accuracy and consistency5. These dense annotations span a range of cellular structures, from larger, easily recognizable organelles to intricate subcellular details, providing participants with a robust and diverse dataset for segmentation.
Among these classes, 47 are well-represented, each appearing in at least three volumes drawn from three different datasets. These 47 classes form the core set for evaluation and leaderboard placement. The comprehensive range of annotated organelles and substructures encourages participants to develop advanced segmentation algorithms capable of handling both common and less frequently observed cellular features.
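To make the selection rule concrete, the sketch below spells out the counting logic with a hypothetical, made-up mapping from class name to the (dataset, volume) pairs in which it is annotated; the class names and crops are illustrative only:

```python
# Hypothetical illustration of the "well-represented" rule: a class
# qualifies if it is annotated in at least 3 volumes drawn from at
# least 3 different datasets. The occurrence data below is made up.
occurrences = {
    "mito": [("jrc_hela-2", "crop1"), ("jrc_hela-3", "crop7"),
             ("jrc_mus-liver", "crop2")],
    "rare_class": [("jrc_hela-2", "crop1"), ("jrc_hela-2", "crop4")],
}

def well_represented(pairs):
    datasets = {dataset for dataset, _ in pairs}
    return len(pairs) >= 3 and len(datasets) >= 3

core = [name for name, pairs in occurrences.items() if well_represented(pairs)]
print(core)  # ['mito']
```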
A more detailed technical write-up, providing an in-depth analysis of annotator variability, will be available in early 2025. This analysis highlights the inherent challenges in annotating complex cellular structures, offering valuable insights into the nuances of the dataset and aiding in its interpretation.
Explore the datasets included in the CellMap Segmentation Challenge by selecting any of the links below. Each link opens a corresponding dataset in Neuroglancer, enabling interactive exploration of the annotated volumes.
Use these interactive links to inspect high-resolution FIB-SEM data, examine ground-truth annotations, and gain deeper insights into diverse cellular structures. Whether you're building segmentation models or conducting exploratory analysis, these datasets provide a unique opportunity to engage with cutting-edge biological imaging data.
Dataset | View Data |
---|---|
jrc_cos7-1a | View in Neuroglancer |
jrc_cos7-1b | View in Neuroglancer |
jrc_ctl-id8-1 | View in Neuroglancer |
jrc_fly-mb-1a | View in Neuroglancer |
jrc_fly-vnc-1 | View in Neuroglancer |
jrc_hela-2 | View in Neuroglancer |
jrc_hela-3 | View in Neuroglancer |
jrc_jurkat-1 | View in Neuroglancer |
jrc_macrophage-2 | View in Neuroglancer |
jrc_mus-heart-1 | View in Neuroglancer |
jrc_mus-kidney | View in Neuroglancer |
jrc_mus-kidney-3 | View in Neuroglancer |
jrc_mus-kidney-glomerulus-2 | View in Neuroglancer |
jrc_mus-liver | View in Neuroglancer |
jrc_mus-liver-3 | View in Neuroglancer |
jrc_mus-liver-zon-1 | View in Neuroglancer |
jrc_mus-liver-zon-2 | View in Neuroglancer |
jrc_mus-nacc-1 | View in Neuroglancer |
jrc_sum159-1 | View in Neuroglancer |
jrc_sum159-4 | View in Neuroglancer |
jrc_ut21-1413-003 | View in Neuroglancer |
jrc_zf-cardiac-1 | View in Neuroglancer |
David Ackerman, Emma Avetissian, Davis Bennett, Marley Bryant, Grace Park, Alyson Petruncio, Alannah Post, Jacquelyn Price, Diana Ramirez, Jeff Rhoades, Rebecca Vorimo, Aubrey Weigel, Marwan Zouinkhi, and Yurii Zubov
Misha Ahrens, Christopher Beck, Teng-Leong Chew, Daniel Feliciano, Jan Funke, Harald Hess, Wyatt Korff, Jennifer Lippincott-Schwartz, Zhe J. Liu, Kayvon Pedram, Stephan Preibisch, Stephan Saalfeld, Ronald Vale, and Aubrey Weigel