People of ACM - Adrien Bousseau
February 6, 2024
When and how did you become interested in creating imaging tools vs. using imaging tools?
I first discovered computer graphics as a user. I was still in high school, and at the time digital photography and 3D rendering were just starting to become accessible on personal computers. I played with popular 3D modeling and image editing tools, and I was fascinated by the possibility of creating (somewhat) professional-looking images on my own. But while I enjoyed discovering the technology behind digital imaging, I wasn’t sure that I would have the artistic skills to work in that field. Luckily, I found a university that was offering a new curriculum on the topic, covering both the usage of digital imaging software (for artistically oriented students) and the algorithms behind it (for programming-oriented students). I quickly understood that I was more interested in programming the tools themselves than in using them, as it allowed me to do more than what the existing tools could offer. While this initial curriculum was rather technical, I then engaged in a more theoretical curriculum to better understand the foundations behind these algorithms. Nevertheless, I still enjoy using imaging software to illustrate my research!
Your most downloaded paper from the ACM Digital Library is “Diffusion Curves: A Vector Representation for Smooth-Shaded Images.” What are diffusion curves, and what was a key insight you and your co-authors put forth in this paper?
This work contributes to vector graphics, which represents images with parametric shapes such as disks, rectangles, or curves made of a few control points (typically Bézier curves). Vector graphics have been used in graphic design for a long time, starting with the seminal work of the Turing Award recipient Ivan Sutherland, because they allow us to represent images in a compact and editable manner. For instance, a designer can easily change the size of a disk by modifying the value of its radius. But this parametric nature also makes vector graphics limited to simple shapes and even more so to simple color variations. Most existing tools only allow users to fill in image regions with a constant color or simple linear and radial color gradients, giving most vector graphics artworks a typical “clipart” look. Diffusion curves address this limitation. By analogy with heat diffusion, we defined the diffusion curve as a Bézier curve that is augmented with “color sources” on each of its sides. By solving a partial differential equation, the colors diffuse in the image and create complex, intricate color gradients in between the curves. Importantly, these complex color variations remain controllable via a few control points.
I think that it is the contrast between the simplicity of this representation and the complexity of the images it can produce that made diffusion curves popular. However, rendering images with diffusion curves requires computing the solution of the diffusion process over the entire image domain, which is costly, especially at high resolution. This partly explains why the original paper continues to inspire research: while the definition of diffusion curves is simple and original, they are so different from other vector graphics primitives that their integration into industry-grade software still raises challenges.
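As a rough illustration of the diffusion step described above, the following Python sketch relaxes a tiny grayscale grid toward a solution of Laplace's equation while holding a few "color source" pixels fixed. It is a minimal sketch under stated assumptions, not the solver from the paper: the function name `diffuse`, the grid size, the plain Jacobi iteration, and the use of straight vertical pixel columns in place of rasterized Bézier curves are all choices made for this example.

```python
def diffuse(width, height, constraints, iterations=500):
    """Jacobi relaxation of Laplace's equation on a small pixel grid.

    `constraints` maps (row, col) -> value and stands in for the color
    sources on the two sides of each curve (hypothetical simplification).
    """
    # Initialize unconstrained pixels with the mean of the source values.
    start = sum(constraints.values()) / len(constraints)
    image = [[constraints.get((r, c), start) for c in range(width)]
             for r in range(height)]

    for _ in range(iterations):
        updated = [row[:] for row in image]
        for r in range(height):
            for c in range(width):
                if (r, c) in constraints:
                    continue  # color sources stay fixed
                # Average the 4-connected neighbors that lie inside the grid.
                neighbors = [image[nr][nc]
                             for nr, nc in ((r - 1, c), (r + 1, c),
                                            (r, c - 1), (r, c + 1))
                             if 0 <= nr < height and 0 <= nc < width]
                updated[r][c] = sum(neighbors) / len(neighbors)
        image = updated
    return image


WIDTH, HEIGHT = 9, 5
constraints = {}
for r in range(HEIGHT):
    constraints[(r, 0)] = 0.0  # left side of the first "curve"
    constraints[(r, 1)] = 1.0  # right side of the first "curve"
    constraints[(r, 7)] = 0.0  # left side of the second "curve"
    constraints[(r, 8)] = 0.5  # right side of the second "curve"

image = diffuse(WIDTH, HEIGHT, constraints)

# Print the middle scanline: values fall smoothly from one source to the other.
print(" ".join(f"{value:.2f}" for value in image[HEIGHT // 2]))
```

Every unconstrained pixel is revisited on every iteration, which hints at why solving over the whole image domain becomes costly as resolution grows.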
Why is automatically reconstructing a drawing as a 3D model such a difficult challenge? What are some applications of your work in this area?
At its core, recovering a 3D shape from a 2D line drawing is an ill-posed problem: the drawing lacks one dimension, meaning that each line can lie anywhere in depth. The same challenge exists for other computer vision tasks, such as recovering the shape of an object from a single photograph of that object. However, line drawings pose additional challenges because they lack many of the visual cues present in photographs (shading, texture, etc.) and because when people draw, they only loosely follow the geometric rules of optics (lines are not perfectly straight, they do not intersect precisely, etc.). Moreover, line drawings take time and expertise to produce, which (for now) prevents the use of very large datasets of images to train machine learning models as has been done with photographs.
Fortunately, line drawings also offer unique opportunities compared to photographs, which is why I find them so interesting to study. In the domains of architecture and industrial design, people have developed specific techniques to best convey 3D shapes in their drawings. This includes basic 2-point perspective, where lines that are parallel in 3D converge towards the same vanishing point in the drawing, as well as more advanced techniques such as the use of so-called axis-aligned scaffolds that designers first draw to lay down the main volumes of a shape before drawing its curved details. Thanks to these techniques, designers can quickly draw perspectively accurate representations of the shapes they have in mind, reflect on these shapes to improve their designs, and explain these shapes to their colleagues and clients. My research aims to exploit these techniques to also allow designers to communicate shapes to computers more efficiently, and to offer a seamless transition from pen-on-paper design exploration to computer-aided design refinement and engineering.
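The vanishing-point property mentioned above can be checked numerically. The sketch below assumes an idealized pinhole camera looking down the z axis, which is an assumption of this example rather than something stated in the interview. The helper names `project`, `vanishing_point`, and `point_on_line` are hypothetical. It shows a single family of parallel lines; two-point perspective uses two such families, each with its own vanishing point.

```python
import math


def project(point, focal=1.0):
    """Pinhole projection of a 3D point (x, y, z), z > 0, onto the image plane."""
    x, y, z = point
    return (focal * x / z, focal * y / z)


def vanishing_point(direction, focal=1.0):
    """Image point approached by every 3D line with the given direction."""
    dx, dy, dz = direction
    if dz == 0:
        raise ValueError("lines parallel to the image plane have no finite vanishing point")
    return (focal * dx / dz, focal * dy / dz)


def point_on_line(origin, direction, t):
    """Point at parameter t along the line origin + t * direction."""
    return tuple(o + t * d for o, d in zip(origin, direction))


direction = (1.0, 0.0, 1.0)   # shared direction of two parallel 3D lines
origin_a = (0.0, -1.0, 2.0)   # the lines start at different positions
origin_b = (-2.0, 1.0, 3.0)

vp = vanishing_point(direction)
far_a = project(point_on_line(origin_a, direction, 1e6))
far_b = project(point_on_line(origin_b, direction, 1e6))

# Far along either line, the projection lands next to the same image point.
print("vanishing point:", vp)
print("distance from line A:", math.dist(far_a, vp))
print("distance from line B:", math.dist(far_b, vp))
```

The vanishing point depends only on the direction of the lines, not on where they start, which is what lets a drawing's converging strokes reveal shared 3D directions.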
What is one example of exciting work being done in your field that will have a significant impact in the coming years?
As with many domains recently, imaging is profoundly impacted by machine learning, and in particular by recent generative models. The ability to instruct computers to generate images based on text prompts is already changing the way artists and designers work. Yet the first generations of generative models focused on synthesizing bitmap images and textured 3D meshes, which reproduce realistic visual content well but are difficult to edit. While there are ongoing efforts to develop generative models that can also edit visual content, I do think that traditional image representations have unique advantages for editing tasks, simply because they evolved in that way. This is true of the vector graphics representations I mentioned above as well as of the parametric representations of shapes that form the basis of many computer-aided design (CAD) software packages. The way these representations are structured often reflects the procedural way in which people think about visual content. For example, designers often think about complex 3D objects in a coarse-to-fine manner, starting with an abstract assemblage of simple 3D primitives, which are then refined to include smaller details. Representing images and shapes with such coarse-to-fine parametric representations greatly facilitates their reuse by other designers. For this reason, I am particularly curious about the field of neuro-symbolic programming, which combines the strengths of machine learning with the interpretability of symbolic procedures. I am waiting to see new generative models capable of synthesizing coarse-to-fine parametric representations of visual content, which would be interpretable and editable by humans and machines.
What kinds of training/coursework would you recommend to a younger colleague interested in pursuing a career in image creation and manipulation?
One of the reasons why I really enjoy working in computer graphics is the diversity of the field. A typical ACM SIGGRAPH proceedings includes papers related to physics, geometry, signal processing, optimization, human-computer interaction, machine learning, and programming languages, with applications in entertainment, design, robotics, architecture, education, etc. Having strong skills in computer science and applied math is definitely a plus, but there is no unique profile. Some are very good at developing novel theoretical tools that then impact many applications; others are better at implementing innovative systems that push the limits of what can be done for a particular application. My recommendation would be to identify one’s strongest skills and nurture them while trying to progress on secondary skills to achieve a decent balance. I myself am not so strong in math or in programming, so I try to convince colleagues who are more expert to collaborate with me on problems I find interesting. But I do my best to try to understand what they explain!
Adrien Bousseau is a Senior Researcher at the French National Institute for Research in Digital Science and Technology (Inria). He focuses on helping designers communicate with computers. His interests include image creation and manipulation, shape modeling, and scene understanding. Bousseau’s work has been applied in areas including architecture, fashion, and industrial design.
His honors include a Eurographics 2011 PhD award for his research on expressive image manipulations and a Young Researcher Award from the French National Research Agency (ANR) for his work on computer-assisted drawing. In 2016 he received a European Research Council Grant for his project on the interpretation of drawings for 3D design. Bousseau has been a Program Committee Member for the ACM SIGGRAPH Conference for many years, including the upcoming 2024 conference.