People of ACM - Thomas Zimmermann
June 9, 2022
Which project are you devoting most of your effort to right now?
Right now, I’m focused on how systems based on artificial intelligence (AI) will change software development. With AI tools, it is now much easier for anyone to write code. This raises several important questions: “How effective is code written by AI?”; “By how much does AI improve the productivity of software teams?”; “How can we measure and establish trust in AI tools?”; and “How do software developers use recommendations by AI?” Together with colleagues at Microsoft and GitHub, I’m looking at these questions for AI tools in software development. Just recently, we’ve answered some of these questions for a tool called Copilot, which uses language models to suggest entire functions to developers. I’m also looking into new models of engagement with AI that go beyond the integrated development environment (IDE) and cover the rest of the development lifecycle.
I’ve also recently worked with several colleagues on research related to developer communities. We investigated aspects such as how developers participate in open-source software for social good projects, how software engineers share their daily life through vlogs, and how software engineers dismantle stereotypes. The goal is to foster healthy, sustainable, and inclusive developer communities that grow over time and enable more people to participate in the creation of software. The software supply chain plays a critical role as developers frequently depend on other projects and developers. In open source, anyone can contribute to software. This exposes both open-source and industrial projects to supply chain attacks through their dependencies. To increase software supply chain security, we worked on the detection of “anomalicious” contributions, which are both anomalous (deviating from the standard) and potentially malicious. We also looked at dependencies that increase the risk of exposure to supply chain attacks.
What is the most striking way that mining software repositories has changed since you began working in this field? What new tools on the horizon will improve software engineering in the future?
Since the mining software repositories field started in 2004, data availability has improved dramatically. Early research in this area looked at version control systems and bug databases for just a handful of projects. Now, research often analyzes tens of thousands of projects that are available on social coding sites like GitHub. In addition to changes and bug reports, we can now use data about code reviews, pull requests, builds, tests, app reviews, telemetry, and much more to improve software systems and to increase the productivity of software engineers. Today, many software teams include data scientists who are focused on analyzing software data.
Another striking change is that machine learning (ML) and artificial intelligence have become more powerful and prevalent than they were in the early days of mining software repositories. This has led to powerful data-driven tools to improve software productivity such as Copilot. But this is just the beginning. We will see tools that leverage ML and AI in all stages of software development, not just code generation. Automated AI tools will work hand in hand with software developers. The focus of software engineering will move from writing code from scratch to reviewing code written and tested by AI.
Will you tell us a little about the SPACE framework for software developer productivity that you and your colleagues outlined in the January/February 2021 issue of ACM Queue? Why is it important that we challenge outmoded myths about software productivity?
Productivity is complex machinery with many knobs, bells, and whistles. In our group, we’ve worked for over ten years with many teams at Microsoft and other companies to improve software developer productivity. Among other things, we’ve looked at the impact of managers, work environments, satisfaction, and good workdays on productivity.
During this research, we observed that there are several misunderstandings and myths about productivity that negatively impact decision making and software teams. One misunderstanding is that productivity is all about developer activity, but more activity does not automatically mean more productivity. For example, when developers are working overtime, just because they spend more time producing code doesn’t necessarily mean they are more productive. Another misunderstanding is that there is a single universal metric of productivity that can be used for every decision. But there are several important dimensions of work that are all connected. Producing more code may lead to lower quality of code. Another common misunderstanding is that only engineering systems and developer tools matter for productive developers. While tools certainly have a big impact, human factors such as wellbeing, work culture, mentoring, and work environment have a strong influence on developer productivity as well.
To provide a new way of thinking about developer productivity, we introduced the SPACE framework for software developer productivity. SPACE captures five distinct dimensions, giving us a more complete picture of what developer productivity looks like. The dimensions are satisfaction and wellbeing, performance, activity, communication and collaboration, and efficiency and flow. A key point of SPACE is to think about productivity beyond one dimension. We recommend covering at least three of the five dimensions. Another key point of SPACE is that there are different lenses of productivity: individuals, people who work together (the team or group), and end-to-end work (the overall system). Each of these lenses may have different productivity measurements in the SPACE universe.
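As a rough illustration of the "at least three of the five dimensions" recommendation, the check could be sketched as follows. This is a minimal sketch and not part of the SPACE framework itself: the `Metric` class, the helper functions, and the example metric names are all hypothetical, chosen only to show how a set of measurements might be tagged with a dimension and a lens and then tested for coverage.

```python
from dataclasses import dataclass

# The five SPACE dimensions, as named in the interview.
SPACE_DIMENSIONS = {
    "satisfaction and wellbeing",
    "performance",
    "activity",
    "communication and collaboration",
    "efficiency and flow",
}

# The three lenses named in the interview.
LENSES = {"individual", "team", "system"}


@dataclass(frozen=True)
class Metric:
    """A hypothetical productivity metric tagged with a dimension and a lens."""
    name: str
    dimension: str
    lens: str

    def __post_init__(self):
        # Reject tags that are not part of the framework's vocabulary.
        if self.dimension not in SPACE_DIMENSIONS:
            raise ValueError(f"unknown SPACE dimension: {self.dimension!r}")
        if self.lens not in LENSES:
            raise ValueError(f"unknown lens: {self.lens!r}")


def covered_dimensions(metrics):
    """Return the set of SPACE dimensions that the given metrics touch."""
    return {m.dimension for m in metrics}


def meets_recommendation(metrics, minimum=3):
    """True if the metrics span at least `minimum` distinct dimensions."""
    return len(covered_dimensions(metrics)) >= minimum


# Illustrative metric names only; they are assumptions, not taken from the source.
example = [
    Metric("developer satisfaction survey score", "satisfaction and wellbeing", "individual"),
    Metric("code review turnaround time", "communication and collaboration", "team"),
    Metric("pull requests merged", "activity", "team"),
]

# A set that only counts activity, the pitfall described earlier in the interview.
activity_only = [
    Metric("commits per week", "activity", "individual"),
    Metric("pull requests merged", "activity", "team"),
]

print(meets_recommendation(example))
print(meets_recommendation(activity_only))
```

Counting distinct dimensions rather than metrics is the point of the sketch: adding more activity metrics never satisfies the check, which mirrors the advice to look beyond a single dimension.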
In a recent issue of ACM Transactions on Software Engineering and Methodology (TOSEM), you and your co-authors examined the impact of software engineers working from home due to the COVID-19 pandemic. What is an important takeaway from this research that will impact the field going forward?
Working from home has affected individual software engineers in very different ways. We observed a “Tale of Two Cities” effect in which many employees reported that their personal productivity had not changed or had even improved. However, a substantial portion of employees (32–38%) reported being less productive. This was because of a dichotomy in developer experiences. For example, being close to family was a benefit for some but a challenge for others due to interruptions. Our group also investigated how working from home affected the productivity of software teams and what strategies teams used for remote onboarding of new software developers at Microsoft. Hundreds of researchers across Microsoft, LinkedIn, and GitHub came together to study the new future of work and just recently published their findings in the 2022 Microsoft New Future of Work Report.
An important takeaway from the work in our group is the importance of wellbeing and work-life balance for employees. We’ve already seen many companies improve their wellness programs. Going forward, we will see much more flexibility in the workspace. Software engineers will want to work anywhere with anyone at any time on any device.
Thomas Zimmermann is a Senior Principal Researcher in the Productivity and Intelligence and Software Analysis and Intelligence groups at Microsoft Research. His research interests include software engineering, data science, and recommender systems. He is known for the systematic mining of software data to build tools to increase productivity of software engineers and managers.
Zimmermann is the Chair of the ACM Special Interest Group on Software Engineering (SIGSOFT). He has received numerous awards, including seven Ten Year Most Influential Paper awards at various conferences as well as five SIGSOFT Distinguished Paper Awards. He was recently named an ACM Fellow for contributions to mining software repositories and defect prediction.