Statistics in ranked lists plays an important role in analyzing molecular biology measurement data. For example, differential expression analysis yields ranked lists of genes, which are then analyzed to infer rules and properties. Statistical enrichment methods are of specific interest in this context, as they allow for the identification of gene sets or sequence elements that are over-represented at the top of measurement-derived ranked lists.
In recent years, my group has developed rigorous approaches to statistically assess enrichment while allowing flexibility in defining the top of the list. In the most recent extension, we developed methods and software that statistically evaluate the mutual enrichment of two ranked lists.
The methods are based on characterizing tail distributions in null models defined to represent non-enrichment. In this talk I will describe the mathematical and algorithmic foundations of the methods and show results and applications from bioinformatics, including cancer differential expression and the relationship between age-related DNA methylation and cancer-related epigenetic changes.
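To make the idea of enrichment with a flexible list top concrete, here is a minimal Python sketch. It is not the software described in the talk. It assumes a ranked list encoded as 0/1 labels (1 marks membership in the gene set of interest), scores each cutoff with a hypergeometric tail probability, and keeps the smallest value over all cutoffs. The null model of non-enrichment is approximated here by random shuffling, which is a stand-in for the tail-distribution characterization mentioned above. All function names are hypothetical.

```python
import random
from math import comb


def hypergeom_tail(N, B, n, b):
    """P(X >= b) when n items are drawn from N, of which B are marked."""
    upper = min(n, B)
    favorable = sum(comb(B, k) * comb(N - B, n - k) for k in range(b, upper + 1))
    return favorable / comb(N, n)


def min_tail_score(labels):
    """Smallest hypergeometric tail over every cutoff n = 1..N of a ranked 0/1 list."""
    N, B = len(labels), sum(labels)
    best, b = 1.0, 0
    for n, label in enumerate(labels, start=1):
        b += label  # marked items seen among the top n
        best = min(best, hypergeom_tail(N, B, n, b))
    return best


def permutation_p_value(labels, n_perm=2000, seed=0):
    """Monte Carlo estimate of how often a shuffled list scores at least as well.

    Assumption: shuffling the labels represents non-enrichment. This is a
    simple substitute for an exact characterization of the null tail.
    """
    rng = random.Random(seed)
    observed = min_tail_score(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if min_tail_score(shuffled) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical ranked list: five marked genes, mostly near the top.
    ranked = [1, 1, 1, 1, 0, 1] + [0] * 14
    score, p = permutation_p_value(ranked)
    print(f"min tail score = {score:.6f}, permutation p-value = {p:.4f}")
```

Because the score is minimized over all cutoffs, it cannot be read directly as a p-value; the correction step (here the shuffling) accounts for the multiple cutoffs that were tried.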
Dr Zohar Yakhini is Master Scientist at Agilent Laboratories and Adjunct Faculty in the Computer Science Department, Technion, Haifa. Dr Yakhini leads a group of computational biologists working on information aspects of genomics, proteomics and glycomics. He earned a BSc in mathematics and computer science at the Hebrew University in Jerusalem and a PhD in mathematics at Stanford University (1997). He has worked in computational biology and bioinformatics since graduating, focusing on statistical and algorithmic aspects of high-throughput measurement technologies and synthetic biology. Dr Yakhini led data analysis in several early gene expression studies and then led the development of probe design and data analysis methods and software tools for Agilent’s aCGH microarray platform. Dr Yakhini’s group developed several data analysis tools that are widely used by the genomics community, including differential expression and statistical enrichment analysis tools, such as GOrilla.