People of ACM - Martin Reddy
December 17, 2019
What were some of the challenges you faced making movies like “Finding Nemo” and “The Incredibles,” and what did you learn from those experiences?
Working at Pixar, I learned firsthand how mixing great software engineers with talented artists can produce amazing results. A Pixar movie is driven by the story and other creative departments. They define the characters, scenes, visuals, and emotions that the technical team must then strive to attain. Very often this meant doing something that had never been done before. For example, on “Finding Nemo,” some computer scientists were able to write algorithms to produce realistic water simulations. However, the creative team’s notes asked for things like making a group of waves bigger or making the water look more like reference footage they had shot on location, which is particularly difficult to communicate to a computer program.
Similarly, one of the biggest challenges in “The Incredibles” was producing realistic cloth and hair. At that time, simulating long or wet hair was very difficult to do. The script called for multiple characters to be in situations with both long and wet hair. So the simulation team had to push the boundary of the current state of the art. It was the drive from the artists to make a great movie that made them happy to do this. I recall Steve Jobs at the wrap party for “Finding Nemo” saying that we would all look back on the film as one of the most remarkable achievements of our careers. In hindsight, I think he was right.
As software engineers, we enable this kind of success by designing and writing expressive authoring tools for artists to use. The same attention to detail that we paid to the final movie also had to be applied to the tools that we gave our artists, to make sure that the artists could work efficiently and that the tools matched their actual day-to-day workflows. Along the way, we felt that the original code base had reached the end of its useful lifetime, and we had to build an entirely new filmmaking system from scratch. This task was likened to completely changing out the engine of a one-of-a-kind supercar, and doing so while the car was traveling at 150 mph without stopping, because we still needed to ship all of our films on time. The end result was a software system called Presto, which was recently awarded a Technical Academy Award by the Academy of Motion Picture Arts and Sciences. What I learned while helping to build that system ultimately led me to publish API Design for C++ to share the lessons of building large-scale, robust production software systems.
How did you move from your earlier work in computer animation and graphics to your more recent work in conversational AI and voice assistants?
While 3D graphics and conversational AI may seem unrelated at first, there’s actually a common thread that connects them: using new forms of media to advance the art of storytelling. While at Pixar, we used the medium of 3D computer animation to bring new kinds of stories to the world. As that technology matured, and as speech recognition and AI started to improve, my co-founder Oren Jacob and I wondered what it might mean to use the medium of computer conversation to tell different kinds of stories. That is, “What would it mean to hold a deep, engaging, and perhaps even entertaining conversation with a computer?” We knew that conversation itself can be engaging and fun (that’s why comedians have jobs), but could we achieve that level of conversational fidelity with a computer?
While there’s still a lot to do to reach that goal, we did take some first baby steps along the way. In partnership with Mattel, we produced a Barbie doll that young children could talk to, with some of the longest single conversations lasting more than four hours. For adults, we worked with HBO and 360i to produce a Westworld Alexa Skill game. With 36 voice actors, 11,000 lines of script, 60 player-generated paths, and 32 ways to die, it was one of the most ambitious voice games of its time and won more than a dozen awards including a Cannes Lion Grand Prix and several Clio awards. We saw players regularly interacting with this skill for over an hour at a time.
The key to achieving these kinds of engagement numbers was building a talented team of creative writers who came from various backgrounds such as TV/movie scriptwriters, actors, and sound engineers. Building on what we learned at Pixar, we designed powerful conversational authoring tools to let them express their creative visions and to push the art of voice-based storytelling forward. There’s certainly still more to achieve, but from my work at Pixar creating 3D filmmaking software, to my work at PullString and Apple on conversational AI, I feel strongly that putting great authoring tools into the hands of talented technical and creative professionals is what you need to push the art of storytelling forward.
You now work in the AI/ML division at Apple. Will you discuss the limits of machine learning, as well as what approaches hold promise for taking conversational AI to the next level?
I see AI as a broad field, of which ML is but one technique, albeit a powerful one that is making great strides. However, I think it’s important to appreciate that not all AI problems can or should be solved with ML. You should always pick the right tool for the job and not be distracted by the shiny new technology.
ML has shown enormous improvements in pattern-matching problems where lots of data are available for training. For example, the fields of automatic speech recognition (ASR) and text-to-speech (TTS) have seen significant jumps in accuracy over the last few years, with some companies claiming better-than-human accuracy for ASR. However, problems that require stateful or deterministic execution of logic—such as crafting multi-turn stateful and engaging flows of conversation—are more challenging to solve with ML alone.
As we look to the future, some of the big issues that face ML are the potential for bias in the datasets used to train the models, and finding ways to work with less data and to produce smaller models that can fit on user devices. Apple has publicly stated that privacy is a fundamental human right, so finding ways to solve ML problems with less data and less human grading is important. One way to do this is to try to use signals on the device that are not sent back to the server and thus never shared with the company. This is a large challenge for the field in general, but it’s one that we need to meet in order to respect users’ privacy.
What’s the best advice you’ve received from a mentor that you would pass on to a younger colleague?
One of the biggest challenges of many careers is making the transition to management and having to successfully lead teams of people. Often team problems are more difficult to solve than technical ones, but they can also be immensely more rewarding to solve because you have the power to improve people’s lives.
My first manager was the late Yvan G. Leclerc at SRI International, and he was a major influence on me in this regard. Yvan had an amazing ability to make everyone enjoy coming into work and to feel valued and important. He taught me in practical terms the lessons of sharing success and owning failure. He gave me the autonomy to solve interesting problems and to take on more responsibility, and then he acknowledged my successes widely. At the same time, he had the strength of character and lack of ego to be able to admit mistakes himself and not point fingers when things went wrong. He was the embodiment of the adage that your job as a manager is to make the people who work for you as successful as possible. Yvan was a genuine, thoughtful, and inspiring leader. I strive to live up to his memory every day.
Martin Reddy is a Software Engineering Manager at Apple. His research interests include conversational artificial intelligence (AI) and voice applications, while in the first half of his career he focused on 3D computer graphics and animation. He has authored over 40 publications, including the textbook API Design for C++ and, as a co-author, the textbook Level of Detail for 3D Graphics. Reddy has also received nine patents for various software applications.
Earlier in his career, he was the cofounder and CTO of PullString Inc., a software-as-a-service (SaaS) company providing a platform to create conversational experiences and apps on various voice platforms such as Amazon Alexa and Google Assistant. He was also a Lead Software Engineer for Pixar Animation Studios, where he worked on software used in films including “Finding Nemo,” “Cars,” “The Incredibles,” “Ratatouille,” and “WALL-E.” Reddy was recently selected as an ACM Distinguished Member.