Ben Galili - From Stable Statistics to Robust Differentially Expressed Genes

Statistical hypothesis testing is a fundamental tool in statistics, used to decide whether to reject the null hypothesis (the default statement of no difference). Rejecting the null hypothesis means that, assuming the null hypothesis is true, the probability of obtaining results at least as extreme as those observed is very low: lower than a predetermined significance level. This probability is called a p-value, and the lower the p-value, the more significant the results. When applying a statistical test to a binary labeled dataset, we implicitly assume that the assignment of a subject or, more generally, a sample to one of the two labels is not in doubt. In reality, however, this assumption is often compromised. Wrong sample labels can lead to dramatically different statistical assessments.
For example, when assessing treatment efficacy using Mantel's log-rank test, we can observe, as will be shown in the talk, that the resulting p-value can be sensitive to a very small number of label swaps. In work with Anat Samohi and Zohar Yakhini, we developed an efficient algorithmic approach (low-degree polynomial time in the number of changes allowed and in the number of samples) to compute, for a given dataset, a stability interval for the log-rank test. The stability interval provides bounds on the changes in the p-value that may result from swapping a constrained number of labels in the data. In further work, we developed an approach to assessing differential expression under the same label-sensitivity framework. For any threshold q, we efficiently compute bounds on the number of genes that would be reported with a Benjamini-Hochberg FDR threshold of q, assuming constrained label changes. We further use this process to suggest a more robust procedure for identifying differentially expressed genes.
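
To make the notion of a stability interval concrete, the following is a minimal brute-force sketch in Python. It is not the efficient algorithm presented in the talk: it simply enumerates every way of changing at most k labels, so it is only feasible for tiny datasets. The function names, the toy data, and the reading of a "label swap" as flipping a single sample's binary label are all assumptions made for illustration.

```python
from itertools import combinations
from math import erfc, sqrt


def logrank_p(times, events, labels):
    """Two-sided log-rank p-value for two groups with labels 0/1.

    events[i] is 1 if the event was observed and 0 if the sample was censored.
    """
    o_minus_e = 0.0  # observed minus expected events in group 1
    var = 0.0        # hypergeometric variance accumulated over event times
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n1 = sum(labels[i] for i in at_risk)
        dead = [i for i in at_risk if times[i] == t and events[i]]
        d = len(dead)
        d1 = sum(labels[i] for i in dead)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0  # degenerate case, e.g. all samples in one group
    chi2 = o_minus_e ** 2 / var
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return erfc(sqrt(chi2 / 2))


def stability_interval(times, events, labels, k):
    """Brute-force (min, max) p-value over all flips of at most k labels.

    Assumption: a "swap" is modeled as flipping one sample's label.
    Cost grows combinatorially in k; illustration only.
    """
    lo = hi = logrank_p(times, events, labels)
    for r in range(1, k + 1):
        for idx in combinations(range(len(labels)), r):
            flipped = list(labels)
            for i in idx:
                flipped[i] = 1 - flipped[i]
            p = logrank_p(times, events, flipped)
            lo, hi = min(lo, p), max(hi, p)
    return lo, hi


# Hypothetical toy data: ten samples, all events observed, groups well separated.
times = [5, 6, 8, 9, 12, 13, 15, 18, 20, 22]
events = [1] * 10
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

p0 = logrank_p(times, events, labels)
interval = stability_interval(times, events, labels, k=1)
print(f"original p-value: {p0:.4f}")
print(f"stability interval for k=1: [{interval[0]:.4f}, {interval[1]:.4f}]")
```

On this toy dataset, comparing the original p-value with the upper end of the interval shows how much a single changed label can weaken an apparently significant result.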

Date and Time: 
Thursday, May 19, 2022 - 13:30 to 14:30
Ben Galili
Speaker Bio: 

Ben completed his MSc degree in CS at IDC in 2016. He is now a PhD candidate at the Faculty of CS at the Technion, supervised by Prof. Zohar Yakhini. Ben has taught many ML and statistics courses at IDC over the past 5 years. He also works as a data scientist at Dynamic Infrastructure, an Israeli predictive maintenance startup.