People of ACM - Pavel Pevzner
September 24, 2019
How did you initially become interested in computational biology?
I was completing my PhD on combinatorial optimization of transportation networks in Moscow, and suddenly realized that I was bored—I wanted to work on something more exciting. I was fortunate to be introduced to open algorithmic problems in a new futuristic discipline called computational molecular biology. I immediately abandoned my PhD in transportation networks (never regretted it!) and finished my PhD in bioinformatics three years later.
Why are algorithms for string reconstruction especially effective in understanding the human genome?
In a nutshell, the human genome is a string of 3 billion letters, each one of the nucleotides A, C, T, and G. Decoding this string (human genome sequencing) is a challenging algorithmic problem that was partially solved in 2000. However, some highly repetitive regions of the human genome have defied all previous attempts to sequence them, and their reconstruction remains an open algorithmic problem. Since these regions have a large and still poorly understood connection to disease, the recently formed Telomere-to-Telomere consortium aims to fully complete the human genome, two decades after its completion was announced at the White House! As the first step towards this goal, the consortium just completed a human sex chromosome, the first chromosome to be truly completed.
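To illustrate why repetitive regions make string reconstruction hard, here is a minimal sketch. It is not the consortium's method or any particular assembler. The function name `kmer_composition` and the two short example strings are assumptions chosen for illustration. The sketch shows that two different strings sharing a repeat (`ATG`) can yield exactly the same collection of short fragments, so the fragments alone cannot tell them apart, while longer fragments that span the repeat can.

```python
from collections import Counter


def kmer_composition(text: str, k: int) -> Counter:
    """Return the multiset of all length-k substrings (k-mers) of text."""
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


# Two different illustrative strings; both contain the repeat "ATG" three times.
genome_a = "TAATGCCATGGGATGTT"
genome_b = "TAATGGGATGCCATGTT"

# Short fragments (k = 3) cannot distinguish the two strings.
ambiguous_with_short_reads = (
    kmer_composition(genome_a, 3) == kmer_composition(genome_b, 3)
)

# Longer fragments (k = 5) span the repeat and separate them.
resolved_with_long_reads = (
    kmer_composition(genome_a, 5) != kmer_composition(genome_b, 5)
)

print(ambiguous_with_short_reads, resolved_with_long_reads)
```

In real genomes the repeats can be far longer than in this toy example, which is one reason the fragments must be long enough to span them before such regions can be reconstructed unambiguously.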
What is an emerging research area of molecular bioinformatics that may not have gotten enough attention but will yield important advances in the coming years?
Sequencing the “dark matter of the human genome” (such as centromeres that are responsible for chromosome segregation) is a still-unsolved algorithmic problem, and solving it may shed light on how these enigmatic and biologically important regions contribute to cancer and infertility. Revealing this dark matter will also help us answer important questions about human evolution, as large regions of archaic Neanderthal DNA were recently found in human centromeres.
As an educator who has written several textbooks in this area, what have you learned about how to impart these ideas to students and make them passionate about the field?
As Benjamin Bloom (the giant of educational psychology) demonstrated, a traditional classroom lecture is minimally effective. That is why I haven’t given a traditional classroom lecture in six years. And I haven’t been fired yet….
Instead, I give flipped classes based on my online Bioinformatics Algorithms courses on Coursera and edX. I feel that the existing university education (based on a traditional classroom) will be disrupted by Intelligent Tutoring Systems that are now being developed. That is why I don’t write textbooks anymore. I now work on MOOCBooks for the massive open online courses that I teach, in which nearly half a million students have enrolled over the last six years.
Pavel Pevzner is the Ronald R. Taylor Chair and Distinguished Professor of Computer Science at the University of California, San Diego (UCSD). Pevzner’s research interests span the field of computational biology, and his work has been guided by tailoring algorithmic ideas to biological problems. His algorithms have been applied to a wide range of areas including decoding genomes, antibodies, and antibiotics. He co-developed two popular online specializations on Coursera, one in Bioinformatics and one in Data Structures and Algorithms, as well as an online MicroMasters program on Algorithms and Data Structures at edX, and has written several textbooks on bioinformatics and computational biology.
Pevzner received a PhD in Mathematics and Physics from the Moscow Institute of Physics and Technology. An ACM Fellow, Pevzner was awarded the 2018 ACM Paris Kanellakis Theory and Practice Award for pioneering contributions to the theory, design and implementation of algorithms for string reconstruction and to their applications in the assembly of genomes.