People of ACM - Manish Gupta
September 9, 2021
You worked on the Blue Gene supercomputer in the early 2000s, for which IBM received a National Medal of Technology and Innovation from President Barack Obama in 2009. How was that experience?
During a period when the accepted wisdom was to build supercomputers using the most powerful processors available, and almost all processor designers were highlighting the increasing clock frequencies of their processors, Blue Gene (credit to Alan Gara, the chief architect) took a contrarian approach and used a much larger number of less powerful but power-efficient processors. We were required to support the scaling of individual applications to 128K processors at a time when existing supercomputers invariably ran into scaling problems beyond 2K processors. Many experts expected the Blue Gene project to fail or, at best, to succeed in supporting a very restricted class of custom-coded scientific applications. However, through a careful set of design decisions and meticulous execution by a wonderful team, we were able to successfully scale a large number of scientific applications, breaking the 100K barrier for scaling of applications written using the standard Message Passing Interface (MPI). Blue Gene/L consumed about 1/100th of the power and floor space of the Earth Simulator, the machine it replaced as the fastest supercomputer on the TOP500 list. It also unexpectedly captured the top spot for all of the HPC Challenge Benchmarks in 2005.
It was a rich learning experience for many of us. One of my takeaways was that, as a researcher, at least once in your lifetime (and preferably more often), it is good to target something that the world thinks is virtually impossible.
You are known for your work in high performance computing software and compiler optimizations. What factors have impacted this field since you were named an ACM Fellow in 2012?
We have seen enormous progress in this field, fueled by continuing exponential growth in compute power. Machine learning, and in particular deep learning, has emerged as the dominant workload. Specialized architectures like graphics processing units (GPUs) and tensor processing units (TPUs) have been used to accelerate these workloads, helping advance their performance well above the historical rate in accordance with Moore’s Law. Unlike traditional scientific computing applications, which have usually been programmed at a rather low level to obtain high performance, machine learning has become far more accessible to programmers through the use of high-level languages like Python and frameworks like TensorFlow, PyTorch and Keras, which has enabled high productivity. Scientific computing applications have also taken advantage of GPUs in a significant manner, utilizing several hundred million cores in high-end supercomputers.
One area your team has been involved with is developing machine learning systems that will address the challenges arising from the multitude of languages in India. What are your efforts in this area?
Only about 11% of people in India speak English, while there are over 1,500 languages spoken in the country, including 122 major languages and 22 scheduled (official) languages. Our goal is to democratize access to information for every person by supporting access in their native language. Most of these languages have low availability of web resources, and their usage is associated with additional challenges like code mixing (mixing of languages in the same sentence), non-unique transliterations, idiomatic misspellings and mispronunciation of English words influenced by the native language of the user. Hence, the state-of-the-art natural language processing (NLP) models do not work very well in these scenarios.
Our teams, led by Partha Talukdar and Aravindan Raghuveer, have been addressing these challenges by building multilingual models (which can handle code mixing in a seamless manner) and better transliteration models that capture phonetic and spelling variations, increasing the coverage of knowledge graphs with entities from new locales, and improving machine translation for low web-resource languages. We have developed Multilingual Representations for Indian Languages (MuRIL), which covers 16 Indian languages and English, and has led to significant accuracy gains on various NLP tasks like named entity recognition, classification, and question answering for both native and transliterated text. The MuRIL-based pretrained models have been released on TensorFlow Hub and Hugging Face, open source providers of NLP technologies. We have already seen practitioners take advantage of these models. A lot more needs to be done, but we are excited about the journey ahead.
What is an area of research you have been involved with that may not have received attention, but will have a significant impact in the coming years?
I am personally excited about the potential of transforming healthcare with the help of further advances in technologies like artificial intelligence (AI) and machine learning (ML). This area is surely attracting a great deal of attention, but I believe that we have barely scratched the surface. AI can play a key role in making healthcare more proactive, which, like the proverbial stitch in time, can lead to better health outcomes for people while lowering costs.
We are focusing our research on how we can analyze data from people's mobile devices and wearables to help them lead healthier lives and to alert them to seek medical intervention when they are at high risk for specific diseases. Doing it well requires significant advances in various aspects of ML (including privacy-preserving ML, explainable and fair ML, and handling class imbalance) as well as human-computer interaction (including behavioral science and persuasive conversational assistants). Such an approach, based on wellness and proactive care, can reduce the overall disease burden for humanity significantly, potentially saving millions of disability-adjusted life years.
What advice would you give to young computer science researchers and engineers starting their careers?
First, I would convey a sense of excitement that they are starting their careers at a very opportune moment in computing history. Thanks to decades of exponential advances in computational power, computer science has evolved from a young discipline to a field that is impacting virtually every aspect of how the world runs today. Consequently, today’s young researchers and engineers have an unprecedented ability to apply their talent towards a cause that they are passionate about and make the world a better place. Second, there is a huge set of open research problems in almost every area of computer science. As a case in point, despite all of the amazing advances in deep learning, we still have a rather limited understanding of why these methods work so well (when they do), and they continue to fail in spectacular ways (e.g., confident mispredictions and fairness issues).
Finally, be bold and pursue ambitious goals; as you look back on your career, you will derive much more personal satisfaction and receive greater recognition from the community for a few impactful accomplishments rather than a laundry list of incremental contributions and publications.
Manish Gupta is Director of Google Research India, and the Infosys Foundation Chair Professor at the International Institute of Information Technology (IIIT) Bangalore. He has authored more than 75 papers in areas including parallel computing, high performance compilers, and Java virtual machine optimizations. At Google Research India, he leads a team that is working in areas including machine learning, natural language understanding, computer vision, and multi-agent systems. Earlier in his career, Gupta led VideoKen, a video technology startup, and the research centers for Xerox and IBM in India. Gupta is a member of the ACM India Council.
His honors include an Outstanding Innovation Award and two Outstanding Technical Achievement Awards from IBM, as well as a Distinguished Alumnus Award from IIT Delhi. Gupta is a Fellow of the Indian National Academy of Engineering and was named an ACM Fellow for contributions to high performance computing software and compiler optimizations.