Compositionality in Computer Vision

Important Note

See the schedule page (https://cicv.stanford.edu/#schedule) for links to keynote talks, the panel discussion, opening/closing remarks, and the most up-to-date information.

New link for the panel discussion (2:30-3:15 PM): live Zoom webinar

The workshop consists of keynote talks, a panel discussion, oral sessions, and poster sessions.

Opening/closing remarks and keynote talks are NOT listed on this page.

About

In our workshop, we will discuss compositionality in computer vision: the notion that the representation of a whole should be composed of the representations of its parts. Human perception relies heavily on compositional reasoning: we understand a scene by its components, a 3D shape by its parts, an activity by its events, and so on. We hypothesize that intelligent agents likewise need to develop a compositional understanding that is robust, generalizable, and powerful. Computer vision has a long-standing line of work based on semantic compositionality, such as part-based object recognition. Pioneering statistical modeling approaches have built hierarchical feature representations for numerous vision tasks, and recent work has demonstrated that concepts can be learned from only a few examples using a compositional representation. As the field moves towards higher-level reasoning tasks, our workshop aims to revisit this idea and reflect on future directions for compositionality.

Keynote Speakers

  • Jitendra Malik, University of California, Berkeley (live talk 11:00-11:45 PDT)
  • Aude Oliva, Massachusetts Institute of Technology
  • Chelsea Finn, Stanford University
  • Animesh Garg, University of Toronto
  • Angjoo Kanazawa, University of California, Berkeley (live talk 13:45-14:30 PDT)
This paper takes a first step towards compatible and hence reusable network components. Rather than training networks for different tasks independently, …
    Authors: Michael Gygli, Jasper Uijlings, Vittorio Ferrari   
    Keywords:  Compatible Representations, Similarity of Network Representations, Composable Architectures, Transfer Learning, Self-supervised learning
Mon, Jun 15
12:30 PM - 1:00 PM
We investigate the information, called the latent class structure, encoded in the shared components of a Classification-By-Components network.
    Authors: Lars Holdijk   
    Keywords:  Classification-By-Components, CBC, compositionality, components, structure, shared, interpretability, explainability, ImageNet
Mon, Jun 15
12:30 PM - 1:00 PM
To generate Cityscapes scenes, we first use a GAN to generate segmentation maps and then use image-to-image translation to fill in textures.
    Authors: Anna Volokitin, Ender Konukoglu, Luc Van Gool   
    Keywords:  generative adversarial networks, scene modelling, image generation
Mon, Jun 15
12:30 PM - 1:00 PM
We show compact, inspectable representations without losing performance and propose an inspectability metric.
    Authors: Max Losch, Mario Fritz, Bernt Schiele   
    Keywords:  Interpretability, Inspectability Metric, Semantic Segmentation
Mon, Jun 15
12:30 PM - 1:00 PM
    Panelists: Jitendra Malik, Aude Oliva, Chelsea Finn, Animesh Garg, Angjoo Kanazawa; Moderator: Ranjay Krishna
Mon, Jun 15
2:30 PM - 3:15 PM
We introduce a deep compositional model that is much more robust to partial occlusion compared to standard deep networks at image classification.
    Authors: Adam Kortylewski, Ju He, Qing Liu, Alan Yuille   
    Keywords:  image classification, partial occlusion, compositional model, out of distribution, analysis by synthesis, robustness, deep learning
Mon, Jun 15
3:45 PM - 4:05 PM
A large-scale knowledge base with part-state annotations and a hierarchical paradigm with part-level activity representation (Activity2Vec).
    Authors: Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Shiyi Wang, Hao-Shu Fang, Ze Ma, Mingyang Chen, Cewu Lu   
    Keywords:  Activity Understanding, Hierarchical Paradigm, Part States, Knowledge Base
Mon, Jun 15
4:05 PM - 4:25 PM
Hierarchies exist in action data; in hyperbolic space, this hierarchy is beneficial for both hierarchical retrieval and standard retrieval, even for zero-shot retrieval.
    Authors: Teng Long, Pascal Mettes, Heng Tao Shen, Cees Snoek   
    Keywords:  video retrieval, hyperbolic learning, hierarchical, zero-shot learning, action recognition, hyperbolic geometry
Mon, Jun 15
4:25 PM - 4:45 PM
In this paper, we address the problem of recognizing complex compositional activities described as regular expressions of atomic actions in videos.
    Authors: Rodrigo Santa Cruz, Anoop Cherian, Basura Fernando, Dylan Campbell, Stephen Gould   
    Keywords:  Compositional Action Recognition, Complex Action Recognition, Probabilistic Automata
Mon, Jun 15
4:45 PM - 5:15 PM
In this paper, a complex action in a still image is broken down into components based on semantics. The importance of each of these components for action recognition …
    Authors: Deeptha Girish, Vineeta Singh, Anca Ralescu   
    Keywords:  Action recognition, still image, semantic compositionality, feature importance
Mon, Jun 15
4:45 PM - 5:15 PM