Publications

Year: Author:

Paul Voigtlaender, Yuning Chai, Florian Schroff, Hartwig Adam, Bastian Leibe, Liang-Chieh Chen
CVPR 2019

Many of the recent successful methods for video object segmentation (VOS) are overly complicated, heavily rely on fine-tuning on the first frame, and/or are slow, and are hence of limited practical use. In this work, we propose FEELVOS as a simple and fast method which does not rely on fine-tuning. In order to segment a video, for each frame FEELVOS uses a semantic pixel-wise embedding together with a global and a local matching mechanism to transfer information from the first frame and from the previous frame of the video to the current frame. In contrast to previous work, our embedding is only used as an internal guidance of a convolutional network. Our novel dynamic segmentation head allows us to train the network, including the embedding, end-to-end for the multiple object segmentation task with a cross entropy loss. We achieve a new state of the art in video object segmentation without fine-tuning on the DAVIS 2017 validation set with a J&F measure of 69.1%.

» Show BibTeX

@inproceedings{Voigtlaender19CVPR,
title={{FEELVOS}: Fast End-to-End Embedding Learning for Video Object Segmentation},
author={Paul Voigtlaender and Yuning Chai and Florian Schroff and Hartwig Adam and Bastian Leibe and Liang-Chieh Chen},
booktitle={CVPR},
year={2019}
}






Christoph Gissler, Andreas Peer, Stefan Band, Jan Bender, Matthias Teschner
ACM Transactions on Graphics

We present a strong fluid-rigid coupling for SPH fluids and rigid bodies with particle-sampled surfaces. The approach interlinks the iterative pressure update at fluid particles with a second SPH solver that computes artificial pressure at rigid body particles. The introduced SPH rigid body solver models rigid-rigid contacts as artificial density deviations at rigid body particles. The corresponding pressure is iteratively computed by solving a global formulation which is particularly useful for large numbers of rigid-rigid contacts. Compared to previous SPH coupling methods, the proposed concept stabilizes the fluid-rigid interface handling. It significantly reduces the computation times of SPH fluid simulations by enabling larger time steps. Performance gain factors of up to 58 compared to previous methods are presented. We illustrate the flexibility of the presented fluid-rigid coupling by integrating it into DFSPH, IISPH and a recent SPH solver for highly viscous fluids. We further show its applicability to a recent SPH solver for elastic objects. Large scenarios with up to 90M particles of various interacting materials and complex contact geometries with up to 90k rigid-rigid contacts are shown. We demonstrate the competitiveness of our proposed rigid body solver by comparing it to Bullet.

» Show BibTeX

@article{ Gissler2019,
author= {Christoph Gissler and Andreas Peer and Stefan Band and Jan Bender and Matthias Teschner},
title= {Interlinked SPH Pressure Solvers for Strong Fluid-Rigid Coupling},
year= {2018},
journal= {ACM Trans. Graph.},
publisher= {ACM},
issue_date = {January 2019},
volume = {38},
number = {1},
month = jan,
year = {2019},
issn = {0730-0301},
pages = {5:1--5:13},
articleno = {5},
numpages = {13},
url = {http://doi.acm.org/10.1145/3284980},
doi = {10.1145/3284980},
address = {New York, NY, USA},
}






Paul Voigtlaender, Michael Krause, Aljoša Ošep, Jonathon Luiten, Berin Balachandar Gnana Sekar, Andreas Geiger, Bastian Leibe
CVPR 2019

This paper extends the popular task of multi-object tracking to multi-object tracking and segmentation (MOTS). Towards this goal, we create dense pixel-level annotations for two existing tracking datasets using a semi-automatic annotation procedure. Our new annotations comprise 70,430 pixel masks for 1,084 distinct objects (cars and pedestrians) in 10,870 video frames. For evaluation, we extend existing multi-object tracking metrics to this new task. Moreover, we propose a new baseline method which jointly addresses detection, tracking, and segmentation with a single convolutional network. We demonstrate the value of our datasets by achieving improvements in performance when training on MOTS annotations. We believe that our datasets, metrics and baseline will become a valuable resource towards developing multi-object tracking approaches that go beyond 2D bounding boxes.

» Show BibTeX

@article{voigtlaender19arxiv,
author = {Paul Voigtlaender and Michael Krause and Aljo\u{s}a O\u{s}ep and Jonathon Luiten and Berin Balachandar Gnana Sekar and Andreas Geiger and Bastian Leibe},
title = {{MOTS}: Multi-Object Tracking and Segmentation},
journal = {arXiv preprint arXiv:1902.03604},
year = {2019},
}






Aljoša Ošep, Paul Voigtlaender, Mark Weber, Jonathon Luiten, Bastian Leibe
Arxiv:1901.09260

Many high-level video understanding methods require input in the form of object proposals. Currently, such proposals are predominantly generated with the help of networks that were trained for detecting and segmenting a set of known object classes, which limits their applicability to cases where all objects of interest are represented in the training set. This is a restriction for automotive scenarios, where unknown objects can frequently occur. We propose an approach that can reliably extract spatio-temporal object proposals for both known and unknown object categories from stereo video. Our 4D Generic Video Tubes (4D-GVT) method leverages motion cues, stereo data, and object instance segmentation to compute a compact set of video-object proposals that precisely localizes object candidates and their contours in 3D space and time. We show that given only a small amount of labeled data, our 4D-GVT proposal generator generalizes well to real-world scenarios, in which unknown categories appear. It outperforms other approaches that try to detect as many objects as possible by increasing the number of classes in the training set to several thousand.

» Show BibTeX

@article{Osep19arxiv,
author = {O\v{s}ep, Aljo\v{s}a and Voigtlaender, Paul and Weber, Mark and Luiten, Jonathon and Leibe, Bastian},
title = {4D Generic Video Object Proposals},
journal = {arXiv:1901.09260},
year = {2019}
}






Javor Kalojanov, Isaak Lim, Niloy Mitra, Leif Kobbelt
Computer Graphics Forum (Proc. EUROGRAPHICS 2019)

We propose a novel method to synthesize geometric models from a given class of context-aware structured shapes such as buildings and other man-made objects. Our central idea is to leverage powerful machine learning methods from the area of natural language processing for this task. To this end, we propose a technique that maps shapes to strings and vice versa, through an intermediate shape graph representation. We then convert procedurally generated shape repositories into text databases that in turn can be used to train a variational autoencoder which enables higher level shape manipulation and synthesis like, e.g., interpolation and sampling via its continuous latent space.





Aljoša Ošep, Paul Voigtlaender, Jonathon Luiten, Stefan Breuers, Bastian Leibe
Accepted to ICRA'19 (to appear)

This paper addresses the problem of object discovery from unlabeled driving videos captured in a realistic automotive setting. Identifying recurring object categories in such raw video streams is a very challenging problem. Not only do object candidates first have to be localized in the input images, but many interesting object categories occur relatively infrequently. Object discovery will therefore have to deal with the difficulties of operating in the long tail of the object distribution. We demonstrate the feasibility of performing fully automatic object discovery in such a setting by mining object tracks using a generic object tracker. In order to facilitate further research in object discovery, we will release a collection of more than 360'000 automatically mined object tracks from 10+ hours of video data (560'000 frames). We use this dataset to evaluate the suitability of different feature representations and clustering strategies for object discovery.

» Show BibTeX

@article{Osep19ICRA,
author = {O\v{s}ep, Aljo\v{s}a and Voigtlaender, Paul and Luiten, Jonathon and Breuers, Stefan and Leibe, Bastian},
title = {Large-Scale Object Mining for Object Discovery from Unlabeled Video},
journal = {ICRA},
year = {2019}
}





Dan Koschier, Jan Bender, Barbara Solenthaler, Matthias Teschner
Eurographics Tutorial

Graphics research on Smoothed Particle Hydrodynamics (SPH) has produced fantastic visual results that are unique across the board of research communities concerned with SPH simulations. Generally, the SPH formalism serves as a spatial discretization technique, commonly used for the numerical simulation of continuum mechanical problems such as the simulation of fluids, highly viscous materials, and deformable solids. Recent advances in the field have made it possible to efficiently simulate massive scenes with highly complex boundary geometries on a single PC. Moreover, novel techniques allow to robustly handle interactions among various materials. As of today, graphics-inspired pressure solvers, neighborhood search algorithms, boundary formulations, and other contributions often serve as core components in commercial software for animation purposes as well as in computer-aided engineering software.

This tutorial covers various aspects of SPH simulations. Governing equations for mechanical phenomena and their SPH discretizations are discussed. Concepts and implementations of core components such as neighborhood search algorithms, pressure solvers, and boundary handling techniques are presented. Implementation hints for the realization of SPH solvers for fluids, elastic solids, and rigid bodies are given. The tutorial combines the introduction of theoretical concepts with the presentation of actual implementations.




Andrea Bönsch, Jan Hoffmann, Jonathan Wendt, Torsten Wolfgang Kuhlen
To be presented at: IEEE Virtual Humans and Crowds for Immersive Environments (VHCIE), 2019

When designing the behavior of embodied, computer-controlled, human-like virtual agents (VA) serving as temporarily required assistants in virtual reality applications, two linked factors have to be considered: the time the VA is visible in the scene, defined as presence time (PT), and the time till the VA is actually available for support on a user’s calling, defined as approaching time (AT).

Complementing a previous research on behaviors with a low VA’s PT, we present the results of a controlled within-subjects study investigating behaviors by which the VA is always visible, i.e., behaviors with a high PT. The two behaviors affecting the AT tested are: following, a design in which the VA is omnipresent and constantly follows the users, and busy, a design in which theVAis self-reliantly spending time nearby the users and approaches them only if explicitly asked for. The results indicate that subjects prefer the following VA, a behavior which also leads to slightly lower execution times compared to busy.

» Show BibTeX

@InProceedings{Boensch2019c,
author = {Andrea B\"{o}nsch and Jan Hoffmann and Jonathan Wendt and Torsten W. Kuhlen},
title = {{Evaluation of Omnipresent Virtual Agents Embedded as Temporarily Required Assistants in Immersive Environments}},
booktitle = {IEEE Virtual Humans and Crowds for Immersive Environments},
year = {2019}
}






Andrea Bönsch, Alexander Kies, Moritz Jörling, Stefanie Paluch, Torsten Wolfgang Kuhlen
To be presented at: IEEE Virtual Humans and Crowds for Immersive Environments (VHCIE), 2019

Technological innovations have a growing relevance for charitable donations, as new technologies shape the way we perceive and approach digital media. In a between-subjects study with sixty-one volunteers, we investigated whether a higher degree of immersion for the potential donor can yield more donations for non-governmental organizations. Therefore, we compared the donations given after experiencing a video-based, an augmented-reality-based, or a virtual-reality-based scenery with a virtual agent, representing a war victimized Syrian boy talking about his losses. Our initial results indicate that the immersion has no impact. However, the donor’s perceived innovativeness of the used technology might be an influencing factor.

» Show BibTeX

@InProceedings{Boensch2019b,
author = {Andrea B\"{o}nsch and Alexander Kies and Moritz Jörling and Stefanie Paluch and Torsten W. Kuhlen},
title = {{An Empirical Lab Study Investigating If Higher Levels of Immersion Increase the Willingness to Donatee}},
booktitle = {IEEE Virtual Humans and Crowds for Immersive Environments},
year = {2019}
}






Andrea Bönsch, Andrew Feng, Parth Patel, Ari Shapiro
14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2019)

Volumetric video can be used in virtual and augmented reality applications to show detailed animated performances by human actors. In this paper, we describe a volumetric capture system based on a photogrammetry cage with unsynchronized, low-cost cameras which is able to generate high-quality geometric data for animated avatars. This approach requires, inter alia, a subsequent synchronization of the captured videos.




» Show BibTeX

@Article{Boensch2019a,
author = {Andrea Bönsch, Andrew Feng, Parth Patel and Ari Shapiro},
title = {{Volumetric Video Capture using Unsynchronized, Low-cost Cameras}},
journal = {14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2019)},
year = {2019},
volume = {1},
pages = {255--261}
}






Andrea Bönsch
To be presented at: Doctoral Consortium at IEEE Virtual Reality Conference 2019

Many applications in the realm of social virtual reality require reasonable locomotion patterns for their embedded, intelligent virtual agents (VAs). The two main research areas covered in the literature are pure inter-agent-dynamics for crowd simulations and user-agent-dynamics in, e.g., pedestrian scenarios. However, social locomotion, defined as a joint locomotion of a social group consisting of a human user and one to several VAs in the role of accompanying interaction partners, has not been carefully investigated yet. I intend to close this gap by contributing locomotion models for the social group’s VAs. Thereby, I plan to evaluate the effects of the VAs’ locomotion patterns on a user’s perceived degree of immersion, comfort, and social presence.

» Show BibTeX

@InProceedings{Boensch2019d,
author = {Andrea B\"{o}nsch},
title = {Locomotion with Virtual Agents in the Realm of Social Virtual Reality},
booktitle = {Doctoral Consortium at IEEE Virtual Reality Conference 2018},
year = {2019}
}






Previous Year (2018)
Disclaimer Home Visual Computing institute RWTH Aachen University