Agenda

PhD Thesis Defence

Thursday, 21 May 2026
10:00-11:30
Aula Senaatszaal

Compositional Generative Models: for Generalizable Scene Generation and Understanding

Yanbo Wang

Human intelligence is fundamentally compositional: it constructs new ideas by flexibly recombining known concepts, enabling generalization to entirely new tasks. We aim to develop intelligent systems with similar robust generalization capabilities. To that end, we develop compositional generative modeling frameworks and present three research thrusts that advance scene generation, decomposition, and understanding.

First, we introduce a hierarchical object-centric generative model that integrates latent variable modeling with object-centric representation learning, enabling coherent multi-object scene generation and fine-grained object-level editing. This approach overcomes limitations of prior object-aware models by supporting flexible object morphology and significantly improving in-distribution generalization.

Second, we propose an unsupervised compositional image decomposition method that represents images as compositions of energy landscapes encoded by diffusion models. This enables the extraction of reusable global and local visual factors, such as shadows, expressions, and objects, and supports zero-shot compositional image generation by recombining these factors into novel configurations far outside the training distribution.

Third, we develop a compositional inverse generative modeling framework for scene understanding. By formulating inference as likelihood maximization over conditional generative model parameters, we show how composable diffusion models enable object discovery and multi-label classification in scenes substantially more complex than those seen during training, including generalization to images with more objects or new configurations. The framework also supports zero-shot category inference using pretrained generative models without additional training.

Overall, these contributions demonstrate that the incorporation of compositional structure into generative modeling yields interpretable, controllable, and significantly more generalizable intelligent systems. This thesis offers a step toward building intelligent agents with the flexible, systematic compositional imagination characteristic of human cognition.

Microelectronics Colloquium

Thursday, 21 May 2026
14:30-15:00
EWI Lecture Hall Pi

An information theoretic look into AI: conformal prediction and reasoning

Arash Behboodi
Qualcomm AI Research

When agents and systems operate under additional constraints, a fundamental question concerns the limits of achievable performance and the emergent behaviors induced by those constraints. Such questions are central to information theory: for example, lossy compression under distortion constraints or communication under power or amplitude constraints. Characterizing these fundamental limits is essential both for evaluating proposed algorithms and for deriving principled design guidelines.

In this work, we examine several analogous problems in the AI domain, including conformal prediction under efficiency (prediction set size) constraints, the confidence-efficiency trade-off in transductive learning, and efficient neural reasoning. We study these problems through an information-theoretic lens, establishing fundamental bounds and providing design principles.

Bio:

Arash Behboodi is Director of Engineering at Qualcomm AI Research, where he has led several major initiatives, including the Qualcomm 5G AI Suite and Efficient Reasoning on the Edge. His research contributions span information theory, compressed sensing, machine learning for wireless communication, learning theory, and machine learning for inverse problems. He has done seminal work on differentiable simulation for wireless propagation modeling. He has received multiple best paper awards, including recognition at venues such as Asilomar. Arash earned his PhD in Information Theory from École Supérieure d’Électricité (Supélec, now CentraleSupélec), and his bachelor's and master's degrees from Sharif University of Technology. He also holds a master's degree in philosophy from Panthéon-Sorbonne University, where he focused on the philosophy of language.