Google AI Introduces ‘SegCLR,’ a Self-Supervised Machine Learning Technique that Produces Highly Informative Representations of Cells Directly from 3D Electron Microscope Imagery and Segmentations

Analyzing the organization of neural circuits plays a crucial role in better understanding how we think. This is where maps come into play. Maps of the nervous system contain information about the identity of individual cells, such as their type, subcellular components, and the connectivity of the neurons.

But how do we obtain these maps?

Volumetric nanometer-resolution imaging of brain tissue provides the raw data needed to build these maps. But inferring all the relevant information is laborious and challenging because brain structures span multiple scales (e.g., nanometers for a synapse vs. millimeters for an axon), and it requires hours of manual ground-truth labeling by expert annotators.

Methods from computer vision and machine learning can help with these annotations, but they are not fully reliable, and the resulting ground-truth annotations still need proofreading. Moreover, tasks like identifying the cell type of a very small neuron fragment are challenging even for human experts.

To further automate this task and address the problems mentioned above, the authors of this paper propose a self-supervised machine learning technique, “SegCLR,” which stands for Segmentation-Guided Contrastive Learning of Representations. SegCLR takes a 3D volume of volumetric electron microscopy (VEM) data as input and produces an embedding as output.

The proposed approach is scalable in three important respects:

  1. The produced embeddings can be used for different tasks, such as identifying cellular subcompartments, cell types, etc.
  2. The representation learned by SegCLR is very compact. It is directly usable in downstream analyses with linear classifiers or shallow networks, removing the need to repeatedly train a large model to learn representations for each new task.
  3. Moreover, SegCLR reduces the need for labeled ground-truth data by roughly four orders of magnitude.
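The second point above — a cheap classifier on top of frozen embeddings — can be illustrated with a minimal sketch. This is not the paper's code; it is a plain multinomial logistic regression ("linear probe") in numpy, where `X` stands in for precomputed SegCLR embeddings and `y` for task labels:

```python
import numpy as np

def train_linear_probe(X, y, n_classes, lr=0.5, steps=200):
    """Multinomial logistic regression on frozen embeddings X of shape (N, D)."""
    W = np.zeros((X.shape[1], n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                          # one-hot labels
    for _ in range(steps):
        logits = X @ W + b
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)             # softmax probabilities
        grad = (p - Y) / len(X)                       # cross-entropy gradient
        W -= lr * X.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def predict(W, b, X):
    return np.argmax(X @ W + b, axis=1)
```

Because only `W` and `b` are learned, each new downstream task costs a few seconds of fitting rather than retraining the large encoder.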

In addition, SegCLR enables reliable annotation of cells even from very short fragments (~10–50 μm) of cortical cells. Finally, the authors show that SegCLR can be combined with Gaussian processes to estimate the uncertainty of its predictions.

Let’s talk about the history of neuropil annotation for a moment. In the past, machine learning methods used features that were hand-designed or derived from supervised learning: for example, a random forest classifier trained on hand-designed features, a 2D convolutional network trained on projections of the neuropil, or a 3D convolutional network trained directly on voxels.

What is SegCLR doing?

SegCLR produces embeddings: rich biological features in a low-dimensional space. Because the embeddings are learned contrastively, vector distance maps to biological distinctness. Various downstream tasks can then use these embeddings.

A SegCLR embedding represents a local 3D view of the EM data, focused on an individual cell or cell fragment within the neuropil, with an accompanying segmentation.

Figure 1: The SegCLR architecture maps local, masked 3D views of electron microscopy data to embedding vectors.

A ResNet-18-based encoder is trained with a contrastive loss to produce 64-dimensional embeddings, with a projection head that further reduces the embedding from 64 to 16 dimensions (see fig. 1). The encoder was trained on two publicly available EM connectomics datasets: one from the human temporal cortex and the other from the mouse visual cortex. The trained encoder and the produced embeddings are then used for the following downstream tasks:
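To make the contrastive training objective concrete, here is a minimal numpy sketch of a SimCLR-style NT-Xent loss, where rows `i` of `z1` and `z2` are projected embeddings of two views of the same segment (a positive pair). This is an illustrative stand-in, not the paper's implementation, which defines positive pairs via the segmentation and uses a trained ResNet-18 encoder:

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.1):
    """Contrastive (NT-Xent) loss over B positive pairs.

    z1, z2: (B, D) arrays of projected embeddings; row i of z1 and
    row i of z2 form a positive pair, all other rows are negatives.
    """
    z = np.concatenate([z1, z2], axis=0)                # (2B, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)    # unit vectors -> cosine sim
    sim = z @ z.T / temperature                         # (2B, 2B) similarity matrix
    np.fill_diagonal(sim, -np.inf)                      # exclude self-similarity
    B = z1.shape[0]
    targets = np.concatenate([np.arange(B, 2 * B), np.arange(B)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * B), targets].mean()  # cross-entropy over pairs
```

Minimizing this loss pulls the two views of the same segment together in embedding space while pushing views of other segments apart, which is why vector distance ends up tracking biological distinctness.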

  1. Cellular subcompartment classification: identifying cellular subcompartments like axons, dendrites, and somas. A linear classifier trained on the embeddings of the human cortical dataset achieved an F1 score of 0.988, while on the mouse dataset the classification reached a mean F1 score of 0.958. The classifier matches the performance of a directly supervised model while requiring roughly 4,000 times fewer labeled training examples, and surpasses it when trained on the full data (see fig. 2).
Figure 2: (left) Cellular subcompartment classification. (right) Evaluation of classification performance for axon, dendrite, soma, and astrocyte subcompartments in the human cortex dataset via mean F1-Score, while varying the number of training examples used.
  2. Classification of neuron and glia subtypes for large and small cell fragments: this is very similar to cellular subcompartment classification, but an individual SegCLR embedding represents a local 3D view only about 4–5 μm across, which is not sufficient for cell typing. To address this, the authors propose aggregating embedding information over larger spatial extents by collecting nearby embeddings within a radius R and then taking the mean embedding value over each feature. After various experiments, the classifier achieved an F1 score of 0.938 on the human dataset for R = 10 μm over six classes, and 0.748 on the mouse dataset for R = 25 μm over thirteen classes.
  3. Unsupervised data exploration: UMAP projections are used to visualize samples of embeddings, and separate clusters are easily observed in UMAP space for glia versus neurons, as well as for axons, dendrites, and somas (see fig. 3).
Figure 3: SegCLR embeddings projected to 3D UMAP space, with two selected axes displayed. Each point represents an embedding (aggregation distance 50 μm) sampled from only the dendrites of mouse layer-5 pyramidal cells. 
  4. Out-of-distribution input detection via Gaussian processes: a remaining issue with all these applications is what happens if the image content falls outside the training data distribution. How can we quantify how far a given image is from the training data? To solve this, the authors used a Spectral-normalized Neural Gaussian Process (SNGP), which adds a prediction uncertainty to the model output and calibrates that uncertainty to reflect the distance between the test data and the training distribution. This calibration allows OOD inputs to be rejected rather than producing ambiguous classifications; an appropriate threshold can be set on the uncertainty for a given task.
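The aggregation step in point 2 can be sketched in a few lines of numpy. This is a simplified stand-in for the paper's procedure: for each embedding location, it mean-pools (feature-wise) all embeddings of the same segment that lie within radius R, e.g. 10 μm:

```python
import numpy as np

def aggregate_embeddings(coords, embeddings, radius):
    """Mean-pool each embedding with its neighbors within `radius`.

    coords:     (N, 3) spatial locations of the embeddings (same units as radius)
    embeddings: (N, D) per-location embedding vectors
    returns:    (N, D) aggregated embeddings
    """
    coords = np.asarray(coords, dtype=float)
    embeddings = np.asarray(embeddings, dtype=float)
    # pairwise Euclidean distances between embedding locations
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    mask = d <= radius                                  # neighborhood membership
    # mean over neighbors, computed independently for each feature
    summed = (mask[..., None] * embeddings[None, :, :]).sum(axis=1)
    return summed / mask.sum(axis=1, keepdims=True)
```

The aggregated vectors cover a larger spatial context than any single 4–5 μm view, which is what makes cell typing from fragments feasible.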
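The rejection logic in point 4 can also be sketched. Instead of a full SNGP, this toy version uses the distance from a test embedding to the nearest training embedding as a stand-in for calibrated uncertainty, and rejects anything beyond a threshold; the function and threshold here are illustrative, not from the paper:

```python
import numpy as np

def predict_with_rejection(train_X, test_X, predict_fn, threshold):
    """Classify test embeddings, rejecting likely out-of-distribution inputs.

    Uses distance to the nearest training embedding as a proxy for
    SNGP-style distance-aware uncertainty. Rejected inputs get label -1.
    """
    d = np.linalg.norm(
        test_X[:, None, :] - train_X[None, :, :], axis=-1
    ).min(axis=1)                                   # distance to training set
    preds = predict_fn(test_X)                      # any classifier on embeddings
    return np.where(d > threshold, -1, preds)       # -1 marks "rejected / OOD"
```

In practice the threshold would be tuned per task, trading rejection rate against the risk of confidently misclassifying unfamiliar image content.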

In conclusion, SegCLR captures rich cellular features and can greatly simplify downstream analyses compared to working directly with raw image and segmentation data. However, SegCLR has two major limitations besides requiring a voxel-level instance segmentation. First, the 32–40 nm voxel resolution of the input views hinders the capture of finer EM ultrastructures, such as vesicle subtypes or ciliary microtubule structures. Second, masking the input excludes context outside the current segment, which can be useful for segmentation and classification.

The most powerful application of SegCLR demonstrated in the paper is classifying neuronal and glial subtypes from small fragments, a task that is challenging even for human experts.

Check out the Paper and Reference article. All credit for this research goes to the researchers on this project.

Vineet Kumar is a consulting intern at MarktechPost. He is currently pursuing his BS from the Indian Institute of Technology (IIT), Kanpur. He is a machine learning enthusiast and is passionate about research and the latest advancements in deep learning, computer vision, and related fields.
