UNCLES: Method for the identification of genes differentially consistently co-expressed in a specific subset of datasets
Abu-Jamous, B., Fa, R., Roberts, D. J., & Nandi, A. (2015). UNCLES: Method for the identification of genes differentially consistently co-expressed in a specific subset of datasets. BMC Bioinformatics, 16(4 June), Article 184. https://doi.org/10.1186/s12859-015-0614-0
Published in
BMC Bioinformatics
Date
2015
Copyright
© 2015 Abu-Jamous et al.; licensee BioMed Central. This is an Open Access article distributed under the terms of the Creative Commons Attribution License.
Background: Collective analysis of the increasingly emerging gene expression datasets is required. The recently
proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from
multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided
datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the
design of such methods. Moreover, although it is a common practice to test methods by application to synthetic
datasets, the mathematical models used to synthesise such datasets are usually based on approximations which
may not always be sufficiently representative of real datasets.
Results: Here, we propose an unsupervised method for the unification of clustering results from multiple datasets
using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently
co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the
subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation
technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally,
we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines
the ground-truth-knowledge of synthetic data and the realistic expression values of real data, and therefore
overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets,
we validate UNCLES while comparing it with other conventional clustering methods, and of particular relevance,
biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets
as it produces focused clusters that conform well to known biological facts. Furthermore, in-silico-based hypotheses
regarding the function of a few previously unknown genes in those focused clusters are drawn.
Conclusions: The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach
will have wide application for the comprehensive analysis of genomic and other sources of multiple complex
biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future
functional studies.
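The consensus idea summarised in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation. It assumes that each dataset has already been clustered into the same number of clusters and that cluster labels are already aligned across datasets. It also assumes a plain membership threshold for binarisation and reads "poorly co-expressed" as low consensus membership in the negative subset. The function names (`consensus`, `binarise`, `differential`) and the parameters (`threshold`, `pos_min`, `neg_max`) are hypothetical and chosen only for this illustration.

```python
from typing import List

# A partition matrix: rows are clusters, columns are genes,
# and each entry is the membership of a gene in a cluster.
Matrix = List[List[float]]


def consensus(partitions: List[Matrix]) -> Matrix:
    """Average aligned partition matrices element-wise (assumes aligned clusters)."""
    n = len(partitions)
    clusters = len(partitions[0])
    genes = len(partitions[0][0])
    return [
        [sum(p[c][g] for p in partitions) / n for g in range(genes)]
        for c in range(clusters)
    ]


def binarise(cpm: Matrix, threshold: float) -> List[List[int]]:
    """Keep a gene in a cluster only if its consensus membership reaches the threshold."""
    return [[1 if value >= threshold else 0 for value in row] for row in cpm]


def differential(pos: Matrix, neg: Matrix, pos_min: float, neg_max: float) -> List[List[int]]:
    """Keep genes with high membership in the positive subset and low in the negative one."""
    return [
        [1 if p >= pos_min and q <= neg_max else 0 for p, q in zip(pos_row, neg_row)]
        for pos_row, neg_row in zip(pos, neg)
    ]


# Toy data (made up for illustration): 2 clusters, 3 genes, 3 datasets.
d1 = [[1, 1, 0], [0, 0, 1]]
d2 = [[1, 1, 0], [0, 0, 1]]
d3 = [[1, 0, 0], [0, 1, 1]]

# Genes assigned to the same cluster in every dataset.
in_all = binarise(consensus([d1, d2, d3]), threshold=1.0)

# Genes in a cluster across d1 and d2 but absent from that cluster in d3.
in_subset_only = differential(consensus([d1, d2]), consensus([d3]), pos_min=1.0, neg_max=0.0)
```

Lowering `threshold` would yield wider, looser clusters and raising it yields tighter ones, which is one simple way to picture the tuneable behaviour the abstract mentions.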
...
Publisher
BioMed Central Ltd.
ISSN
1471-2105
Publication in research information system
https://converis.jyu.fi/converis/portal/detail/Publication/24759333
License
Except where otherwise noted, this item's license is described as © 2015 Abu-Jamous et al.; licensee BioMed Central. This is an Open Access article distributed under the terms of the Creative Commons Attribution License.
Related items
Showing items with similar title or keywords.
- Dataset from RNAseq analysis of differential gene expression among developmental stages of two non-marine ostracodes
  Vences, Miguel; Anslan, Sten; Sabino-Pinto, Joana; Bonilla-Flores, Mauricio; Echeverría-Galindo, Paula; John, Uwe; Nass, Benneth; Pérez, Liseth; Preick, Michaela; Zhu, Liping; Schwalb, Antje (Elsevier, 2024)
  We contribute transcriptomic data for two species of Ostracoda, an early-diverged group of small-sized pancrustaceans. Data include new reference transcriptomes for two asexual non-marine species (Dolerocypris sinensis and ...
- Comprehensive analysis of forty yeast microarray datasets reveals a novel subset of genes (APha-RiB) consistently negatively associated with ribosome biogenesis
  Abu-Jamous, Basel; Fa, Rui; Roberts, David J.; Nandi, Asoke (BioMed Central Ltd., 2014)
  Abstract. Background: The scale and complexity of genomic data lend themselves to analysis using sophisticated mathematical techniques to yield information that can generate new hypotheses and so guide further ...
- Causal Effect Identification from Multiple Incomplete Data Sources : A General Search-Based Approach
  Tikka, Santtu; Hyttinen, Antti; Karvanen, Juha (Foundation for Open Access Statistic, 2021)
  Causal effect identification considers whether an interventional probability distribution can be uniquely determined without parametric assumptions from measured source distributions and structural knowledge on the generating ...
- Transforming student contributions into subject-specific expression
  Jacknick, Christine M.; Duran, Derya (Elsevier, 2021)
  Drawing on a corpus of pre-service teacher training classroom interactions in an English-medium instruction university in Turkey, we examine teacher follow-up turns that introduce specialized terms, showing how a teacher ...
- Gene expression in TGFbeta-induced epithelial cell differentiation in a three-dimensional intestinal epithelial cell differentiation model
  Juuti-Uusitalo, Kati M; Kaukinen, Katri; Mäki, Markku; Tuimala, Jarno; Kainulainen, Heikki (BioMed Central (BMC), 2006)
  Background. The TGFβ1-induced signal transduction processes involved in growth and differentiation are only partly known. The three-dimensional epithelial differentiation model, in which T84 epithelial cells are induced ...