Adaptive clustering of time series
Abstract
This paper focuses on the cell division cycle, which ensures the proliferation of cells and is drastically aberrant in cancer cells. The biological problem addressed is the identification of the genes characterizing each cell cycle phase. The identification process is commonly based on a prior set of well-characterized cell cycle genes, called reference genes. The expression levels of the studied genes are measured during the cell division cycle, and each studied gene is assigned a cell cycle phase according to its peak similarity to the reference genes. This classical approach suffers from two limitations. On the one hand, the most widely used proximity measures between gene expression profiles are based on the closeness of the values, regardless of the similarity with respect to (w.r.t.) the genes' expression behavior. On the other hand, many different ill-founded sets of reference genes are proposed in the literature, and biologists do not agree on which genes best characterize the observed cell cycle phases. Our aim in this paper is twofold. First, we propose a new dissimilarity index for gene expression profiles that includes proximity measures both w.r.t. values and w.r.t. behavior. An adaptive unsupervised classification, based on the proposed dissimilarity index, is then performed to identify the cell cycle phases of the studied genes. Second, we propose a new set of reference genes, well supported by biological knowledge.
Origin | Files produced by the author(s) |
---|