Publications
Polygonal Boundary Evaluation of Minkowski Sums and Swept Volumes

We present a novel technique for the efficient boundary evaluation of sweep operations applied to objects in polygonal boundary representation. These sweep operations include Minkowski addition, offsetting, and sweeping along a discrete rigid motion trajectory. Many previous methods focus on the construction of a polygonal superset (containing self-intersections and spurious internal geometry) of the boundary of the swept volumes. Only a few are able to determine a clean representation of the actual boundary, most of them in a discrete volumetric setting. We unify such superset constructions into a succinct common formulation and present a technique for the robust extraction of a polygonal mesh representing the outer boundary. The technique is robust in the sense that it makes no general position assumptions and always yields a manifold, watertight mesh. It is exact for Minkowski sums and approximates swept volumes polygonally. By using plane-based geometry in conjunction with hierarchical arrangement computations we avoid the necessity of arbitrary precision arithmetic and extensive special case handling. By restricting operations to regions containing pieces of the boundary, we significantly enhance the performance of the algorithm.
A WebService employing this method is available.
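As a rough illustration of the "superset, then boundary extraction" idea, the following sketch computes the Minkowski sum of two convex polygons in 2D. It is a toy case only and not the method of the paper, which handles general polygonal meshes in 3D: for convex inputs, the pairwise vertex sums form a superset of the boundary vertices, and a convex hull is enough to extract the boundary. The function names are illustrative assumptions.

```python
from itertools import product


def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would create a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def minkowski_sum_convex(P, Q):
    """Minkowski sum of two CONVEX polygons given as vertex lists.

    The pairwise sums are a superset of the result's vertices; the convex
    hull discards everything that is not on the outer boundary.
    """
    candidates = [(p[0] + q[0], p[1] + q[1]) for p, q in product(P, Q)]
    return convex_hull(candidates)
```

For non-convex inputs the hull step is no longer valid, which is where a robust outer-boundary extraction such as the one described in the abstract becomes necessary.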
Two-Colored Pixels

In this paper we show how to use two-colored pixels as a generic tool for image processing. We apply two-colored pixels as a basic operator as well as a supporting data structure for several image processing applications. Traditionally, images are represented by a regular grid of square pixels with one constant color each. In the two-colored pixel representation, we reduce the image resolution and replace blocks of NxN pixels by one square that is split by a (feature) line into two regions with constant colors. We show how the conversion of standard mono-colored pixel images into two-colored pixel images can be computed efficiently by applying a hierarchical algorithm along with a CUDA-based implementation. Two-colored pixels overcome some of the limitations that classical pixel representations have, and their feature lines provide minimal geometric information about the underlying image region that can be effectively exploited for a number of applications. We show how to use two-colored pixels as an interactive brush tool, achieving realtime performance for image abstraction and non-photorealistic filtering. Additionally, we propose a realtime solution for image retargeting, defined as a linear minimization problem on a regular or even adaptive two-colored pixel image. The concept of two-colored pixels can be easily extended to a video volume, and we demonstrate this for the example of video retargeting.
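To make the representation concrete, the sketch below fits a single two-colored pixel to an NxN grayscale block by brute force: it tries a set of candidate feature lines, assigns each side its mean intensity, and keeps the line with the smallest squared error. This is only an assumed, simplified stand-in; the paper describes a hierarchical algorithm with a CUDA-based implementation, which is not reproduced here. The function name, the grayscale restriction, and the `n_angles` parameter are assumptions, and the block must be at least 2x2.

```python
import math


def fit_two_colored_block(block, n_angles=16):
    """Fit one feature line and two constant colors to a square grayscale block.

    Returns (error, theta, offset, color_a, color_b), where the line is
    {p : dot(p - center, (cos theta, sin theta)) = offset}.
    """
    n = len(block)
    center = (n - 1) / 2.0
    best = None
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        nx, ny = math.cos(theta), math.sin(theta)
        # Signed distance of every pixel center to the line through the block center.
        samples = [((x - center) * nx + (y - center) * ny, block[y][x])
                   for y in range(n) for x in range(n)]
        distances = sorted({d for d, _ in samples})
        # Candidate offsets lie between consecutive distinct distances,
        # so both sides of the line are always non-empty.
        for lo, hi in zip(distances, distances[1:]):
            offset = (lo + hi) / 2.0
            side_a = [v for d, v in samples if d < offset]
            side_b = [v for d, v in samples if d >= offset]
            color_a = sum(side_a) / len(side_a)
            color_b = sum(side_b) / len(side_b)
            error = (sum((v - color_a) ** 2 for v in side_a)
                     + sum((v - color_b) ** 2 for v in side_b))
            if best is None or error < best[0]:
                best = (error, theta, offset, color_a, color_b)
    return best
```

A block containing a clean vertical edge, for example four rows of `[0, 0, 10, 10]`, is reproduced without error by a single line and two colors.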
Ad-Hoc Multi-Displays for Mobile Interactive Applications

We present a framework which enables the combination of different mobile devices into one multi-display such that visual content can be shown on a larger area consisting, e.g., of several mobile phones placed arbitrarily on the table. Our system allows the user to perform multi-touch interaction metaphors, even across different devices, and it guarantees the proper synchronization of the individual displays with low latency. Hence from the user’s perspective the heterogeneous collection of mobile devices acts like one single display and input device. From the system perspective the major technical and algorithmic challenges lie in the co-calibration of the individual displays and in the low latency synchronization and communication of user events. For the calibration we estimate the relative positioning of the displays by visual object recognition and an optional manual calibration step.
Exact and Robust (Self-)Intersections for Polygonal Meshes

We present a new technique to implement operators that modify the topology of polygonal meshes at intersections and self-intersections. Depending on the modification strategy, this effectively results in operators for Boolean combinations or for the construction of outer hulls that are suited for mesh repair tasks and accurate mesh-based front tracking of deformable materials that split and merge. By combining an adaptive octree with nested binary space partitions (BSP), we can guarantee exactness (= correctness) and robustness (= completeness) of the algorithm while still achieving higher performance and less memory consumption than previous approaches. The efficiency and scalability in terms of runtime and memory is obtained by an operation localization scheme. We restrict the essential computations to those cells in the adaptive octree where intersections actually occur. Within those critical cells, we convert the input geometry into a plane-based BSP-representation which allows us to perform all computations exactly even with fixed precision arithmetic. We carefully analyze the precision requirements of the involved geometric data and predicates in order to guarantee correctness and show how minimal input mesh quantization can be used to safely rely on computations with standard floating point numbers. We properly evaluate our method with respect to precision, robustness, and efficiency.
A WebService employing this method is available.
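The idea behind a plane-based representation can be sketched as follows: a vertex is never stored as rounded coordinates but as the intersection of three planes with integer coefficients, and predicates are evaluated on those coefficients. The sketch is an illustration under assumptions and not the paper's implementation; in particular it uses Python's arbitrary-precision integers and `Fraction` for simplicity, whereas the paper shows that fixed precision suffices once the input is suitably quantized. Planes are assumed to be tuples `(a, b, c, d)` meaning `a*x + b*y + c*z + d = 0`, and all function names are hypothetical.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def point_from_planes(p, q, r):
    """Exact intersection point of three planes via Cramer's rule."""
    normals = [p[:3], q[:3], r[:3]]
    denom = det3(normals)
    if denom == 0:
        raise ValueError("planes do not intersect in a single point")
    rhs = [-p[3], -q[3], -r[3]]
    coords = []
    for col in range(3):
        m = [list(row) for row in normals]
        for row in range(3):
            m[row][col] = rhs[row]  # replace one column by the right-hand side
        coords.append(Fraction(det3(m), denom))
    return tuple(coords)


def side_of_plane(point, plane):
    """Return +1, 0 or -1 depending on which side of the plane the point lies."""
    a, b, c, d = plane
    value = a * point[0] + b * point[1] + c * point[2] + d
    return (value > 0) - (value < 0)
```

Because every intermediate value is a rational number derived from integer plane coefficients, the sign returned by `side_of_plane` is never affected by rounding.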
Efficient Rasterization for Outdoor Radio Wave Propagation

Conventional beam tracing can be used for solving global illumination problems. It is an efficient algorithm, and performs very well when implemented on the GPU. This allows us to apply the algorithm in a novel way to the problem of radio wave propagation. The simulation of radio waves is conceptually analogous to the problem of light transport. We use a custom, parallel rasterization pipeline for creation and evaluation of the beams. We implement a subset of a standard 3D rasterization pipeline entirely on the GPU, supporting 2D and 3D framebuffers for output. Our algorithm can provide a detailed description of complex radio channel characteristics like propagation losses and the spread of arriving signals over time (delay spread). Those are essential for the planning of communication systems required by mobile network operators. For validation, we compare our simulation results with measurements from a real world network. Furthermore, we account for characteristics of different propagation environments and estimate the influence of unknown components like traffic or vegetation by adapting model parameters to measurements.
Image Synthesis for Branching Structures

We present a set of techniques for the synthesis of artificial images that depict branching structures like rivers, cracks, lightning, mountain ranges, or blood vessels. The central idea is to build a statistical model that captures the characteristic bending and branching structure from example images. Then a new skeleton structure is synthesized and the final output image is composed from image fragments of the original input images. The synthesis part of our algorithm runs mostly automatically, but it optionally allows the user to control the process in order to achieve a specific result. The combination of the statistical bending and branching model with sophisticated fragment-based image synthesis corresponds to a multi-resolution decomposition of the underlying branching structure into the low frequency behavior (captured by the statistical model) and the high frequency detail (captured by the image detail in the fragments). This approach allows for the synthesis of realistic branching structures, while at the same time preserving important textural details from the original image.
Automatic Registration of Oblique Aerial Images with Cadastral Maps

In recent years, oblique aerial images of urban regions have become increasingly popular for 3D city modeling, texturing, and various cadastral applications. In contrast to images taken vertically to the ground, they provide information on building heights, appearance of facades, and terrain elevation. Despite their widespread availability for many cities, the processing pipeline for oblique images is not fully automatic yet. Especially the process of precisely registering oblique images with map vector data can be a tedious manual process. We address this problem with a registration approach for oblique aerial images that is fully automatic and robust against discrepancies between map and image data. As input, it merely requires a cadastral map and an arbitrary number of oblique images. Besides rough initial registrations usually available from GPS/INS measurements, no further information is required, in particular no information about the terrain elevation.
Generalized Use of Non-Terminal Symbols for Procedural Modeling

We present the new procedural modeling language G² (Generalized Grammar) which adapts various concepts from general purpose programming languages in order to provide high descriptive power with well-defined semantics and a simple syntax which is easily readable even by non-programmers. We extend the scope of previous architectural modeling languages by allowing for multiple types of non-terminal objects with domain-specific operators and attributes. The language accepts non-terminal symbols as parameters in modeling rules and thus enables the definition of abstract structure templates for flexible re-use within the grammar. To identify specific scene parts or objects, we introduce flags which are Boolean values whose scope covers an entire subtree in the scenegraph. The rigorous handling of typed parameters which are locally declared within the rules prevents inconsistent states emerging from undeclared or wrongly declared variables. By deriving G² from the well-established programming language Python, we can make sure that our modeling language has well-defined semantics. For illustration, we apply G² to architectural as well as plant modeling in order to demonstrate its descriptive power with some complex examples.
We also provide a Python prototype related to this paper for an easy integration of our system into the Houdini modeling framework from SideFX software. It is available on the project page.
Hybrid Booleans
Talk at Eurographics 2011

In this paper we present a novel method to compute Boolean operations on polygonal meshes. Given a Boolean expression over an arbitrary number of input meshes we reliably and efficiently compute an output mesh which faithfully preserves the existing sharp features and precisely reconstructs the new features appearing along the intersections of the input meshes. The term "hybrid" applies to our method in two ways: First, our algorithm operates on a hybrid data structure which stores the original input polygons (surface data) in an adaptively refined octree (volume data). By this we combine the robustness of volumetric techniques with the accuracy of surface-oriented techniques. Second, we generate a new triangulation only in a close vicinity around the intersections of the input meshes and thus preserve as much of the original mesh structure as possible (hybrid mesh). Since the actual processing of the Boolean operation is confined to a very small region around the intersections of the input meshes, we can achieve very high adaptive refinement resolutions and hence very high precision. We demonstrate our method on a number of challenging examples.
3D Sketch Recognition for Interaction in Virtual Environments

We present a comprehensive 3D sketch recognition framework for interaction within Virtual Environments that allows users to trigger commands by drawing symbols, which are recognized by a multi-level analysis. It proceeds in three steps: The segmentation partitions each input line into meaningful segments, which are then recognized as primitive shapes, and finally analyzed as a whole sketch by a symbol matching step. The whole framework is configurable over well-defined interfaces, utilizing a fuzzy logic algorithm for primitive shape learning and a textual description language to define compound symbols. It allows an individualized interaction approach that can be used without much training and provides a good balance between abstraction and intuition. We show the real-time applicability of our approach by performance measurements.
@inproceedings {PE:vriphys:vriphys10:115-124,
booktitle = {Workshop in Virtual Reality Interactions and Physical Simulation "VRIPHYS" (2010)},
editor = {Kenny Erleben and Jan Bender and Matthias Teschner},
title = {{3D} Sketch Recognition for Interaction in Virtual Environments},
author = {Rausch, Dominik and Assenmacher, Ingo and Kuhlen, Torsten},
year = {2010},
publisher = {The Eurographics Association},
DOI = {10.2312/PE/vriphys/vriphys10/115-124}
}
Virtual Reality System at RWTH Aachen University

During the last decade, Virtual Reality (VR) systems have progressed from primary laboratory experiments into serious and valuable tools. Thereby, the number of useful applications has grown considerably, covering conventional use, e.g., in science, design, medicine and engineering, as well as more visionary applications such as creating virtual spaces that aim to act real. However, the high capabilities of today’s virtual reality systems are mostly limited to first-class visual rendering, which directly disqualifies them for immersive applications. For general application, though, VR-systems should feature more than one modality in order to boost their range of applications. The CAVE-like immersive environment that is run at RWTH Aachen University comprises state-of-the-art visualization and auralization with almost no constraints on user interaction. In this article a summary of the concept, the features and the performance of our VR-system is given. The system features a 3D sketching interface that allows controlling the application in a very natural way by simple gestures. The sound rendering engine relies on present-day knowledge of Virtual Acoustics and enables a physically accurate simulation of sound propagation in complex environments, including important wave effects such as sound scattering, airborne sound insulation between rooms and sound diffraction. In spite of this realistic sound field rendering, not only spatially distributed and freely movable sound sources and receivers are supported, but also modifications and manipulations of the environment itself. The auralization concept is founded on pure FIR filtering which is realized by highly parallelized non-uniformly partitioned convolutions. A dynamic crosstalk cancellation system performs the sound reproduction that delivers binaural signals to the user without the need of headphones.
The significant computational complexity is handled by distributed computation on PC clusters that drive the simulation in real-time even for huge audio-visual scenarios.
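The principle of partitioned FIR filtering mentioned above can be sketched in a few lines: the impulse response is cut into blocks, each block is convolved with the input separately, and the partial results are added at the delay of their block. The sketch makes simplifying assumptions that differ from the system described: it uses uniform partitions and plain time-domain convolution, whereas the article refers to non-uniformly partitioned, highly parallelized convolutions (which are typically evaluated in the frequency domain). Function names are illustrative.

```python
def convolve(x, h):
    """Direct (time-domain) linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def partitioned_convolve(x, h, block):
    """Convolve x with h by splitting h into partitions of length `block`.

    Each partition acts as a short sub-filter whose output is delayed by
    the partition's start index and accumulated into the result.
    """
    partitions = [h[k:k + block] for k in range(0, len(h), block)]
    y = [0.0] * (len(x) + len(h) - 1)
    for index, part in enumerate(partitions):
        delay = index * block
        for n, value in enumerate(convolve(x, part)):
            y[delay + n] += value
    return y
```

Because convolution is linear, the partitioned result equals the direct one; the practical benefit is that the short sub-filters can be processed independently and in parallel, and that early partitions can be kept small for low latency.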
@inproceedings{schroder2010virtual,
title={Virtual reality system at RWTH Aachen University},
author={Schr{\"o}der, Dirk and Wefers, Frank and Pelzer, S{\"o}nke and Rausch, Dominik and Vorl{\"a}nder, Michael and Kuhlen, Torsten},
booktitle={Proceedings of the international symposium on room acoustics (ISRA), Melbourne, Australia},
year={2010}
}
Ein Framework für Geometrieverarbeitung basierend auf hybriden Oberflächendarstellungen (A Framework for Geometry Processing Based on Hybrid Surface Representations)

We present a framework that allows for the composition of custom-tailored data structures for hybrid representation of geometry and supports the development of associated geometry processing methods. Among other things, a novel hybrid approach for the evaluation of Boolean expressions on polygon meshes is elaborated in this context. By relying on the hybrid geometry information it is, in contrast to previous methods, able to perform such operations robustly as well as accurately.
Character Reconstruction and Animation from Uncalibrated Video

We present a novel method to reconstruct 3D character models from video. The main conceptual contribution is that the reconstruction can be performed from a single uncalibrated video sequence which shows the character in articulated motion. We reduce this generalized problem setting to the easier case of multi-view reconstruction of a rigid scene by applying pose synchronization of the character between frames. This is enabled by two central technical contributions. First, based on a generic character shape template, a new mesh-based technique for accurate shape tracking is proposed. This method successfully handles the complex occlusion issues which occur when tracking the motion of an articulated character. Second, we show that image-based 3D reconstruction becomes possible by deforming the tracked character shapes as-rigid-as-possible into a common pose using motion capture data. After pose synchronization, several partial reconstructions can be merged in order to create a single, consistent 3D character model. We integrated these components into a simple interactive framework, which allows for straightforward generation and animation of 3D models for a variety of character shapes from uncalibrated monocular video.
Motion Estimating Device

A motion estimating device first detects mobile objects Oi and Oi' in continuous image frames T and T', and acquires image areas Ri and Ri' corresponding to the mobile objects Oi and Oi'. Then, the motion estimating device removes the image areas Ri and Ri' corresponding to the mobile objects Oi and Oi' in the image frames T and T', extracts corresponding point pairs Pj of feature points between the image frames T and T' from the image areas having removed the image areas Ri and Ri', and carries out the motion estimation of the autonomous mobile machine between the image frames T and T' on the basis of the positional relationship of the corresponding point pairs Pj of feature points.
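The processing order described above (mask out the areas of moving objects, then estimate ego-motion from the remaining correspondences) can be sketched as follows. The sketch rests on several assumptions that are not stated in the abstract: image areas Ri and Ri' are modeled as axis-aligned boxes `(x0, y0, x1, y1)`, and the final motion estimate is replaced by a simple mean image translation as a stand-in for a real motion model. All names are hypothetical.

```python
def inside(point, box):
    """True if the point lies inside the axis-aligned box (x0, y0, x1, y1)."""
    x, y = point
    x0, y0, x1, y1 = box
    return x0 <= x <= x1 and y0 <= y <= y1


def static_pairs(pairs, regions_t, regions_t2):
    """Keep only point pairs that avoid moving-object areas in both frames.

    pairs      -- list of ((x, y) in frame T, (x', y') in frame T')
    regions_t  -- boxes of mobile objects detected in frame T
    regions_t2 -- boxes of mobile objects detected in frame T'
    """
    return [(p, q) for p, q in pairs
            if not any(inside(p, r) for r in regions_t)
            and not any(inside(q, r) for r in regions_t2)]


def mean_translation(pairs):
    """Placeholder motion estimate: average displacement of the point pairs."""
    if not pairs:
        raise ValueError("no static correspondences left")
    n = len(pairs)
    dx = sum(q[0] - p[0] for p, q in pairs) / n
    dy = sum(q[1] - p[1] for p, q in pairs) / n
    return dx, dy
```

Removing the correspondences on moving objects first keeps independently moving pedestrians or cars from biasing the estimate of the platform's own motion.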
@misc{ess2012motion,
title={Motion estimating device},
author={Ess, A. and Leibe, B. and Schindler, K. and Van Gool, L. and Kitahama, K. and Funayama, R.},
url={http://www.google.com/patents/US8213684},
year={2012},
publisher={Google Patents},
note={US Patent 8,213,684}
}
Object Detection and Tracking for Autonomous Navigation in Dynamic Environments

We address the problem of vision-based navigation in busy inner-city locations, using a stereo rig mounted on a mobile platform. In this scenario semantic information becomes important: rather than modelling moving objects as arbitrary obstacles, they should be categorised and tracked in order to predict their future behaviour. To this end, we combine classical geometric world mapping with object category detection and tracking. Object-category specific detectors serve to find instances of the most important object classes (in our case pedestrians and cars). Based on these detections, multi-object tracking recovers the objects’ trajectories, thereby making it possible to predict their future locations, and to employ dynamic path planning. The approach is evaluated on challenging, realistic video sequences recorded at busy inner-city locations.
@article{ess2010object,
title={Object detection and tracking for autonomous navigation in dynamic environments},
author={Ess, Andreas and Schindler, Konrad and Leibe, Bastian and Van Gool, Luc},
journal={The International Journal of Robotics Research},
volume={29},
number={14},
pages={1707--1725},
year={2010},
}
Multi-Person Tracking with Sparse Detection and Continuous Segmentation

This paper presents an integrated framework for mobile street-level tracking of multiple persons. In contrast to classic tracking-by-detection approaches, our framework employs an efficient level-set tracker in order to follow individual pedestrians over time. This low-level tracker is initialized and periodically updated by a pedestrian detector and is kept robust through a series of consistency checks. In order to cope with drift and to bridge occlusions, the resulting tracklet outputs are fed to a high-level multi-hypothesis tracker, which performs longer-term data association. This design has the advantage of simplifying short-term data association, resulting in higher-quality tracks that can be maintained even in situations where the pedestrian detector no longer yields good detections. In addition, it requires the pedestrian detector to be active only part of the time, resulting in computational savings. We quantitatively evaluate our approach on several challenging sequences and show that it achieves state-of-the-art performance.
@incollection{mitzel2010multi,
title={Multi-person tracking with sparse detection and continuous segmentation},
author={Mitzel, Dennis and Horbert, Esther and Ess, Andreas and Leibe, Bastian},
booktitle={ECCV},
pages={397--410},
year={2010},
}
Geometrically Constrained Level-Set Tracking for Automotive Applications

We propose a new approach for integrating geometric scene knowledge into a level-set tracking framework. Our approach is based on a novel constrained-homography transformation model that restricts the deformation space to physically plausible rigid motion on the ground plane. This model is especially suitable for tracking vehicles in automotive scenarios. Apart from reducing the number of parameters in the estimation, the 3D transformation model allows us to obtain additional information about the tracked objects and to recover their detailed 3D motion and orientation at every time step. We demonstrate how this information can be used to improve a Kalman filter estimate of the tracked vehicle dynamics in a higher-level tracker, leading to more accurate object trajectories. We show the feasibility of this approach for an application of tracking cars in an inner-city scenario.
@incollection{horbert2010geometrically,
title={Geometrically constrained level set tracking for automotive applications},
author={Horbert, Esther and Mitzel, Dennis and Leibe, Bastian},
booktitle={Pattern Recognition},
pages={472--482},
year={2010},
}
An Evaluation of Two Automatic Landmark Building Discovery Algorithms for City Reconstruction

An important part of large-scale city reconstruction systems is an image clustering algorithm that divides a set of images into groups that should cover only one building each. Those groups then serve as input for structure from motion systems. A variety of approaches for this mining step have been proposed recently, but there is a lack of comparative evaluations and realistic benchmarks. In this work, we want to fill this gap by comparing two state-of-the-art landmark mining algorithms: spectral clustering and min-hash. Furthermore, we introduce a new large-scale dataset for the evaluation of landmark mining algorithms consisting of 500k images from the inner city of Paris. We evaluate both algorithms on the well-known Oxford dataset and our Paris dataset and give a detailed comparison of the clustering quality and computation time of the algorithms.
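For readers unfamiliar with min-hash, the basic idea can be sketched as follows: each image is treated as a set of visual-word IDs, and the fraction of matching minimum hash values between two images estimates the Jaccard similarity of their sets. This is only the generic technique under assumed conventions (integer word IDs below `PRIME`, non-empty sets, linear hash functions); it is not the specific landmark mining pipeline evaluated in the paper, and all names are illustrative.

```python
import random

PRIME = 2_147_483_647  # 2**31 - 1; word IDs are assumed to lie in [0, PRIME)


def make_hash_functions(count, seed=0):
    """Draw `count` random linear hash functions h(w) = (a*w + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(count)]


def signature(words, hash_functions):
    """Min-hash signature of a non-empty set of integer visual-word IDs."""
    return [min((a * w + b) % PRIME for w in words)
            for a, b in hash_functions]


def estimated_similarity(sig_a, sig_b):
    """Fraction of agreeing signature entries, an estimate of Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)
```

Images whose signatures agree in many positions are likely to show the same building, so they can be grouped without comparing every pair of images exhaustively.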
@incollection{weyand2010evaluation,
title={An evaluation of two automatic landmark building discovery algorithms for city reconstruction},
author={Weyand, Tobias and Hosang, Jan and Leibe, Bastian},
booktitle={ECCV Workshop},
pages={310--323},
year={2010},
}
Incremental Model Selection for Detection and Tracking of Planar Surfaces

Man-made environments are abundant with planar surfaces which have attractive properties and are a prerequisite for a variety of vision tasks. This paper presents an incremental model selection method to detect piecewise planar surfaces, where planes once detected are tracked and serve as priors in subsequent images. The novelty of this approach is to formalize model selection for plane detection with Minimum Description Length (MDL) in an incremental manner. In each iteration tracked planes and new planes computed from randomly sampled interest points are evaluated, the hypotheses which best explain the scene are retained, and their supporting points are marked so that in the next iteration random sampling is guided to unexplained points. Hence, the remaining finer scene details can be represented. We show in a quantitative evaluation that this new method competes with state-of-the-art algorithms while it is more flexible to incorporate prior knowledge from tracking.
@inproceedings{prankl10incremental,
title = {Incremental Model Selection for Detection and Tracking of Planar Surfaces},
author = {Prankl, Johann and Zillich, Michael and Leibe, Bastian and Vincze, Markus},
year = {2010},
booktitle = {BMVC},
}
Automatic Detection and Tracking of Pedestrians from a Moving Stereo Rig

We report on a stereo system for 3D detection and tracking of pedestrians in urban traffic scenes. The system is built around a probabilistic environment model which fuses evidence from dense 3D reconstruction and image-based pedestrian detection into a consistent interpretation of the observed scene, and a multi-hypothesis tracker to reconstruct the pedestrians’ trajectories in 3D coordinates over time. Experiments on real stereo sequences recorded in busy inner-city scenarios are presented, in which the system achieves promising results.
@article{schindler2010automatic,
title={Automatic detection and tracking of pedestrians from a moving stereo rig},
author={Schindler, Konrad and Ess, Andreas and Leibe, Bastian and Van Gool, Luc},
journal={ISPRS Journal of Photogrammetry and Remote Sensing},
volume={65},
number={6},
pages={523--537},
year={2010},
}
Virtual Texturing

In this thesis a rendering system and an accompanying tool chain for Virtual Texturing is presented. Our tools allow existing geometry to be retextured automatically in order to apply unique texturing on each face. Furthermore we investigate several techniques that try to minimize visual artifacts in the case that only a small number of pages can be streamed per frame. We analyze the influence of different heuristics that are responsible for the page selection. Alongside these results we present a measurement method to allow the comparison of our heuristics.