People of ACM - Yiran Chen
May 24, 2022
What has surprised you the most about how the field of memory and storage systems has advanced since you entered the field?
I think the most exciting thing that has happened in the field of memory and storage systems in the past 15–20 years is the blurring boundary between computing and storage. Recent revolutions in modern computing paradigms started with the need to process big data, which triggered the ever-increasing demand for storage devices with large capacities. This was soon bottlenecked by the limited bandwidth between the computing units and storage devices (often referred to as the "von Neumann bottleneck"). Making memory and storage systems more "intelligent" (e.g., near-memory computing and in-memory computing) has emerged as a popular solution to alleviate the systems' reliance on memory bandwidth and expedite data processing. This is a great example of how a shift in target applications (i.e., from scientific computations to data-centric computations) reforms the design philosophy of computer architecture. This philosophical change inspired various new computing products such as intelligent solid-state drives (SSDs), dynamic random-access memory (DRAM), and data processing units (DPUs), as well as numerous emerging memory technologies such as 3D XPoint memory (Intel and Micron). It also led to the emergence of several novel non-von Neumann architectures such as the crossbar-based dot-product engine, which performs vector-matrix multiplications by directly mapping the computations onto the topological structure of the computing hardware.
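To make the crossbar idea concrete, here is a minimal sketch (my illustration, not from the interview) of what a resistive crossbar computes: each cell stores a conductance G[i][j], and applying voltages V[i] to the rows yields column currents I[j] = Σᵢ V[i]·G[i][j] by Ohm's and Kirchhoff's laws, so the vector-matrix product happens "in place" where the weights are stored.

```python
def crossbar_vmm(conductances, voltages):
    """Emulate the analog dot-product engine: column currents I = V^T . G."""
    rows = len(conductances)
    cols = len(conductances[0])
    currents = [0.0] * cols
    for i in range(rows):          # row wire i carries voltage V[i]
        for j in range(cols):      # cell (i, j) contributes V[i] * G[i][j]
            currents[j] += voltages[i] * conductances[i][j]
    return currents

# Example: a 2x3 conductance array encoding a weight matrix
G = [[1.0, 0.5, 0.0],
     [0.0, 2.0, 1.0]]
V = [3.0, 1.0]
print(crossbar_vmm(G, V))  # → [3.0, 3.5, 1.0]
```

In real hardware the summation is performed by the physics of the circuit in a single step, which is what makes the architecture attractive for the dense matrix multiplications that dominate neural-network inference.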
One of your most cited recent papers is "Learning Structured Sparsity in Deep Neural Networks," which addresses the importance of enhancing the efficiency of deep neural networks. Why is enhancing the efficiency of deep neural networks important and what are some promising research directions in this area?
It is well known that the high (inference) accuracy of modern deep neural networks (DNNs) comes with a high computational cost due to the increased depth and width of the networks. However, we also know that the connection weights of a neural network do not affect its accuracy equally. When a connection weight is near zero, it is likely that the connection can be pruned (i.e., the weight value is set to zero) without affecting the accuracy of the neural network in any significant way. Our paper published in NeurIPS 2016 revealed that learning a sparse neural network in which the non-zero weights are structurally stored in memory can maintain good data locality and reduce the cache miss rate. Hence, the computational efficiency of the network is substantially improved. The proposed technique, namely structured sparsity learning (often referred to as structured pruning), and its variations have been widely utilized in modern efficient DNN model designs and endorsed by many Artificial Intelligence (AI) computing chips such as Intel Nervana and NVIDIA Ampere.
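As a toy illustration of the structured idea (my sketch; the threshold-based rule below stands in for the paper's group-regularization formulation): instead of zeroing individual weights, whole groups are removed together, here entire rows of a weight matrix (e.g., neurons or filters), so the surviving weights remain a dense, cache-friendly block.

```python
def prune_rows(weights, threshold):
    """Structured pruning sketch: drop entire rows whose L2 norm is small,
    keeping the survivors as a dense block (good locality), unlike
    unstructured pruning, which scatters zeros across the matrix."""
    kept = []
    for row in weights:
        norm = sum(w * w for w in row) ** 0.5
        if norm > threshold:
            kept.append(row)       # survivors stay contiguous in memory
    return kept

W = [[0.9, -1.2, 0.4],    # strong neuron: kept
     [0.01, 0.02, 0.0],   # near-zero group: pruned as one unit
     [1.5, 0.3, -0.7]]    # strong neuron: kept
print(prune_rows(W, 0.1))  # → [[0.9, -1.2, 0.4], [1.5, 0.3, -0.7]]
```

Because a whole row (neuron) disappears, the pruned network shrinks to a smaller dense matrix that standard hardware can execute at full speed, which is why structured sparsity translates into real wall-clock savings where unstructured sparsity often does not.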
Enhancing the efficiency of DNNs is critical because inefficiency largely hinders the scaling-up of large DNN models as well as their deployment on systems where computing and storage resources and power budgets are limited, such as edge and Internet-of-Things (IoT) devices. The latest trend in this area of research is the combination of innovations at both the algorithm and hardware levels (e.g., designing AI accelerators based on emerging nano-devices to accelerate new or underexplored AI models such as Bayesian models, quantum-like models, and neuro-symbolic models).
It was recently announced that you will be directing the National Science Foundation’s AI Institute for Edge Computing Leveraging the Next-Generation Networks (Athena) project. Athena is a 5-year, $20 million effort that will involve several institutions including Duke, MIT, Princeton, Yale, the University of Michigan, the University of Wisconsin, and North Carolina Agricultural and Technical State University. What are the goals of the Athena project?
We are very excited about the establishment of Athena, a flagship AI Institute for Edge Computing sponsored by the National Science Foundation and the US Department of Homeland Security. The goal of Athena is to transform the design, operation, and service of future mobile network systems by delivering unprecedented performance and empowering previously impossible services, while keeping complexity and cost under control by advancing AI technologies. Athena organizes its research activities under four core areas: Advancing AI for Edge Computing Systems, Computer Systems, Networking Systems, and Services and Applications. Our AI technologies will also provide theoretical and technical foundations for future mobile networks in functionality, heterogeneity, scalability, and trustworthiness. Serving as a nexus for the community, Athena will nurture the ecosystem of emerging technologies and cultivate a diverse new generation of technical leaders who demonstrate the values of ethics and fairness. We anticipate that the success of Athena will reshape the future of the mobile network industry, create new business models and entrepreneurial opportunities, and transform the future of mobile network research and industrial applications.
What is an exciting trend in design automation? As Chair of the ACM Special Interest Group on Design Automation (SIGDA), what role does the organization play in the field?
The most exciting trend in design automation in the past decade is the wide adoption of machine learning technologies in electronic design automation (EDA) tools. As chip design quality largely depends on the designers' experience, it is very natural to develop intelligent EDA tools that can learn directly from previous designs how to design semiconductor chips, without going through the classic bulky models. Various machine learning models have been embedded into the latest EDA flows to accelerate the computations of trial routing and placement, power estimation, timing analysis, parameter tuning, signal integrity, etc. Machine learning algorithms have also been implemented in hardware modules on the chip to monitor and predict the chip's runtime power consumption (e.g., our "APOLLO" framework, which received the Best Paper Award at MICRO 2021).
As one of the largest professional societies in EDA, SIGDA is committed to advancing the skills and knowledge of EDA professionals and students throughout the world. Every year SIGDA sponsors and organizes more than 30 international and regional conferences, edits and supports multiple periodicals and newsletters, and hosts more than a dozen educational and technical activities including workshops, tutorials, webinars, competitions, research forums, and university demonstrations. By working with our industrial partners, SIGDA also offers travel grants to young students, faculty, and professionals to assist them in attending conferences. We also present several awards to outstanding researchers and volunteers in the community.
What is an example of a research avenue in your field that will be especially impactful in the coming years?
I believe a generalizable and interpretable AI computing hardware design flow will be the next revolutionary technology in EDA and computing systems research at large. In the past decade, various hardware designs have been proposed to accelerate the computation of AI models. However, designers always struggle to balance the generality and the efficiency of their designs, because many hardware customizations are needed to adapt to the unique structures of models that are ever-changing. On the other hand, interpretability has been a longstanding challenge for assuring the robustness of AI models and generalizing model designs. Future AI computing hardware may be composed of a variety of interpretable hardware modules that correspond to their algorithmic counterparts, with the performance of the hardware guaranteed by a generalized design flow. One possible solution is to construct a composable AI model using neural symbolization methods and implement hardware modules corresponding to the symbolized algorithm modules. An extended AutoML flow can then be used to automate the design of the target AI computing hardware that achieves the required performance with guaranteed generalizability and interpretability.
Yiran Chen is a Professor at Duke University and Director of the National Science Foundation (NSF) AI Institute for Edge Computing Leveraging the Next-Generation Networks (Athena), Director of the NSF Industry-University Cooperative Research Center (IUCRC) for Alternative Sustainable and Intelligent Computing (ASIC), and Co-Director of the Duke Center for Computational Evolutionary Intelligence (DCEI). His research interests include new memory and storage systems, machine learning, neuromorphic computing, and mobile computing systems.
Chen has authored more than 500 publications, including a book, and has won several Best Paper Awards at various conferences. His honors include the IEEE Computer Society Edward J. McCluskey Technical Achievement Award, the ACM SIGDA Service Award, and being named an ACM Fellow for his contributions to nonvolatile memory technologies. Chen is the Chair of the ACM Special Interest Group on Design Automation (SIGDA).