Yaniv Erlich: Genetic media

In the last decade, the human population has produced zettabytes (10^21 bytes) of digital data. Growing numbers of individuals are obtaining genetic information for medical and recreational purposes while simultaneously producing phenotypic datasets in the form of electronic health records, mobile sensor data, and self-reported information on social networks. In my talk, I will discuss the intersection between DNA and big data. In the first part, I will present our work on using DNA molecules as a storage architecture, which can offer substantial durability and an extremely small footprint, and can enable highly redundant distributed archives for cold storage. Specifically, I will report a storage strategy, called DNA Fountain, that is highly robust, approaches the information capacity per nucleotide, enables a virtually unlimited number of retrievals, and can reach a density of 215 petabytes per gram of DNA, orders of magnitude higher than previous reports. In the second part of my talk, I will present how we can crowdsource genetic information from tens of millions of people. This dataset opens the possibility of addressing fundamental questions in areas ranging from statistical genetics, via quantitative anthropology, to digital health. However, these large-scale genomic datasets also open the possibility of unprecedented genetic surveillance. I will conclude my talk with a strategy based on cryptographic signatures that can mitigate these risks and restore control to data custodians.

Date and Time: 
Thursday, December 27, 2018 - 11:30 to 12:30
Speaker: 
Yaniv Erlich
Speaker Bio: 

Dr. Yaniv Erlich is the Chief Science Officer of MyHeritage.com and an Associate Professor of Computer Science and Computational Biology at Columbia University (leave of absence). Prior to these positions, he was a Principal Investigator at the Whitehead Institute, MIT. Dr. Erlich received his bachelor’s degree from Tel-Aviv University, Israel (2006) and a PhD from the Watson School of Biological Sciences at Cold Spring Harbor Laboratory (2010). Dr. Erlich’s research focuses on computational human genetics. He is a TEDMED speaker (2018) and the recipient of DARPA’s Young Faculty Award (2017), the Burroughs Wellcome Career Award (2013), the Harold M. Weintraub Award (2010), and the IEEE/ACM-CS HPC Award (2008). He was also selected as one of Genome Technology’s Tomorrow’s PIs (2010).