Seminar 13/04: ‘Model-Based Clustering of Flow and Mass Cytometry Data’

On Thursday, April 13th, the seminar series hosted a talk from Ultán Doherty, PhD Candidate in Trinity College Dublin, introducing a novel model-based clustering approach for detecting populations in cytometry data. Details for the talk are below.

Title

Model-Based Clustering of Flow and Mass Cytometry Data

Abstract

Flow and Mass Cytometry are techniques used in immunological research to measure the expression levels of a range of protein markers for each cell in a tissue sample. To investigate how a disease affects different cell types, the cells in an analysed sample must be assigned to populations. Traditionally, this process was carried out manually; however, increases in the number of protein markers that can be measured simultaneously have led to interest in adopting automated clustering algorithms from statistics and machine learning.

We are developing a semi-supervised approach for carrying out model-based clustering of cytometry data. Our method offers immunologists the opportunity to assist the model in identifying known populations. An expert can provide information about which markers the cells belonging to a population are known to express or not express. Based on this description, we select a set of events which should belong to that population and impose must-link constraints between them. These constraints require all events in a constrained set to be assigned to the same cluster. A modified EM algorithm is used to fit a Gaussian mixture model to the data while ensuring that the clustering solution satisfies the constraints.
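
To make the idea of must-link constraints concrete, here is a minimal sketch in Python with NumPy. It is not the speaker's implementation: the function names (`select_events`, `constrained_gmm`), the quantile thresholds used to pick events, and the specific E-step (pooling log-densities over each constrained set so that the whole set shares one posterior) are all assumptions made for illustration.

```python
import numpy as np

def select_events(X, positive, negative, hi=0.9, lo=0.1):
    """Hypothetical selection rule (an assumption, not from the talk):
    keep events above the `hi` quantile on every marker the population
    expresses and below the `lo` quantile on every marker it does not."""
    mask = np.ones(len(X), dtype=bool)
    for j in positive:
        mask &= X[:, j] > np.quantile(X[:, j], hi)
    for j in negative:
        mask &= X[:, j] < np.quantile(X[:, j], lo)
    return np.flatnonzero(mask)

def log_gauss(X, mean, cov):
    """Log-density of each row of X under a multivariate normal."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)  # whitened residuals, shape (d, n)
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (d * np.log(2 * np.pi) + log_det + (z ** 2).sum(axis=0))

def constrained_gmm(X, k, constrained_sets, n_iter=50, reg=1e-6, seed=0):
    """EM for a k-component Gaussian mixture in which every event in a
    constrained set must receive the same label. Sets are assumed disjoint."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    # Group id per event: one shared id per constrained set, singletons otherwise.
    group = np.arange(n)
    for idx in constrained_sets:
        if len(idx) > 1:
            group[idx] = idx[0]
    _, group = np.unique(group, return_inverse=True)
    g = group.max() + 1
    means = X[rng.choice(n, k, replace=False)]
    covs = np.array([np.cov(X.T) + reg * np.eye(d)] * k)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: sum log-densities within each group, so the group as a
        # whole gets a single posterior over components.
        logp = np.stack([log_gauss(X, means[c], covs[c]) for c in range(k)], axis=1)
        glog = np.zeros((g, k))
        np.add.at(glog, group, logp)
        glog += np.log(weights + 1e-300)
        glog -= glog.max(axis=1, keepdims=True)
        gresp = np.exp(glog)
        gresp /= gresp.sum(axis=1, keepdims=True)
        resp = gresp[group]  # broadcast group posteriors back to events
        # M-step: the usual weighted updates, using the shared responsibilities.
        nk = resp.sum(axis=0) + 1e-12
        weights = gresp.sum(axis=0) / g
        means = (resp.T @ X) / nk[:, None]
        for c in range(k):
            diff = X - means[c]
            covs[c] = (resp[:, c, None] * diff).T @ diff / nk[c] + reg * np.eye(d)
    labels = gresp.argmax(axis=1)[group]  # one label per group, copied to its events
    return labels, means
```

Calling `select_events` with the indices of the expressed and non-expressed markers gives one constrained set, and passing a list of such sets to `constrained_gmm` returns labels in which each set falls in a single cluster by construction. An unconstrained fit corresponds to passing an empty list, which makes the two easy to compare on the same data.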

We have applied this method to a benchmark data set and will present the results compared to an unconstrained Gaussian mixture model.
