Publications
Mixed-Integer Quadrangulation
Proceedings of the 2009 SIGGRAPH Conference

We present a novel method for quadrangulating a given triangle mesh. After constructing an as smooth as possible symmetric cross field satisfying a sparse set of directional constraints (to capture the geometric structure of the surface), the mesh is cut open in order to enable a low distortion unfolding. Then a seamless globally smooth parametrization is computed whose iso-parameter lines follow the cross field directions. In contrast to previous methods, sparsely distributed directional constraints are sufficient to automatically determine the appropriate number, type and position of singularities in the quadrangulation. Both steps of the algorithm (cross field and parametrization) can be formulated as a mixed-integer problem which we solve very efficiently by an adaptive greedy solver. We show several complex examples where high quality quad meshes are generated in a fully automatic manner.
The Constrained Mixed-Integer Solver used in this project has been released under GPL and can be found on its projects page.
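The greedy idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the released solver: it assumes a small dense quadratic energy 0.5 x^T A x - b^T x with a symmetric positive definite matrix A, and it re-solves the full reduced system after every rounding step instead of updating the solution adaptively. The names `solve_linear` and `greedy_mixed_integer` are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def greedy_mixed_integer(A, b, integer_vars):
    """Approximately minimize 0.5 x^T A x - b^T x with x[i] integer for i in integer_vars.

    Assumption: A is symmetric positive definite, so each relaxed
    problem is solved by the linear system A x = b.
    """
    n = len(b)
    fixed = {}  # variable index -> integer value it has been rounded to
    while True:
        free = [i for i in range(n) if i not in fixed]
        # Reduced system over the free variables; fixed ones move to the right-hand side.
        A_ff = [[A[i][j] for j in free] for i in free]
        b_f = [b[i] - sum(A[i][k] * v for k, v in fixed.items()) for i in free]
        x_f = solve_linear(A_ff, b_f) if free else []
        x = [0.0] * n
        for i, v in zip(free, x_f):
            x[i] = v
        for k, v in fixed.items():
            x[k] = float(v)
        pending = [i for i in integer_vars if i not in fixed]
        if not pending:
            return x
        # Greedy step: round the variable whose relaxed value is closest to an integer.
        best = min(pending, key=lambda i: abs(x[i] - round(x[i])))
        fixed[best] = round(x[best])
```

Rounding one variable at a time lets the remaining variables react to each decision, which is the motivation for a greedy scheme over rounding everything at once.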
SCRAMSAC: Improving RANSAC's Efficiency with a Spatial Consistency Filter

Geometric verification with RANSAC has become a crucial step for many local feature based matching applications. Therefore, the details of its implementation are directly relevant for an application's run-time and the quality of the estimated results. In this paper, we propose a RANSAC extension that is several orders of magnitude faster than standard RANSAC and as fast as and more robust to degenerate configurations than PROSAC, the currently fastest RANSAC extension from the literature. In addition, our proposed method is simple to implement and does not require parameter tuning. Its main component is a spatial consistency check that results in a reduced correspondence set with a significantly increased inlier ratio, leading to faster convergence of the remaining estimation steps. In addition, we experimentally demonstrate that RANSAC can operate entirely on the reduced set not only for sampling, but also for its consensus step, leading to additional speed-ups. The resulting approach is widely applicable and can be readily combined with other extensions from the literature. We quantitatively evaluate our approach's robustness on a variety of challenging datasets and compare its performance to the state-of-the-art.
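A spatial consistency check of the kind described above can be sketched as follows. This is an illustration only, not the paper's exact test: the neighbourhood definition (the `k` nearest correspondences), the `min_overlap` threshold, and the function name are all assumptions made for the example.

```python
def spatial_consistency_filter(matches, k=4, min_overlap=0.5):
    """Return indices of correspondences whose neighbourhoods agree in both images.

    matches: list of ((x1, y1), (x2, y2)) putative correspondences.
    A correspondence is kept when at least `min_overlap` of its k nearest
    neighbours in image 1 are also among its k nearest neighbours in image 2.
    """
    def nearest(i, side):
        xi, yi = matches[i][side]
        others = [j for j in range(len(matches)) if j != i]
        # Sort by squared distance; the index breaks ties deterministically.
        others.sort(key=lambda j: ((matches[j][side][0] - xi) ** 2
                                   + (matches[j][side][1] - yi) ** 2, j))
        return set(others[:k])

    kept = []
    for i in range(len(matches)):
        overlap = len(nearest(i, 0) & nearest(i, 1)) / k
        if overlap >= min_overlap:
            kept.append(i)
    return kept
```

The reduced set returned here would then be passed to a standard RANSAC loop in place of the full correspondence set.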
Simulation of Radio Wave Propagation by Beam Tracing

Beam tracing can be used for solving global illumination problems. It is an efficient algorithm and performs very well when implemented on the GPU. This allows us to apply the algorithm in a novel way to the problem of radio wave propagation. The simulation of radio waves is conceptually analogous to the problem of light transport. However, radio wavelengths are of a scale similar to that of the environment. At such frequencies, diffraction, which bends waves around corners, becomes an important propagation effect. In this paper we present a method that integrates diffraction, on top of the usual global illumination effects such as reflection, into our beam tracing algorithm. We use a custom, parallel rasterization pipeline for creation and evaluation of the beams. Our algorithm can provide a detailed description of complex radio channel characteristics such as propagation losses and the spread of arriving signals over time (delay spread). These are essential for the planning of communication systems required by mobile network operators. For validation, we compare our simulation results with measurements from a real-world network.
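The delay spread mentioned above is commonly summarized by its RMS value, the power-weighted standard deviation of the path delays. The sketch below uses that standard definition; whether the paper reports exactly this measure is an assumption, and the function name is hypothetical.

```python
import math


def rms_delay_spread(delays, powers):
    """Power-weighted standard deviation of path delays.

    delays: arrival time of each propagation path (e.g. in seconds).
    powers: received power of each path in linear units (not dB).
    """
    total = sum(powers)
    if total <= 0:
        raise ValueError("total received power must be positive")
    mean_delay = sum(p * t for p, t in zip(powers, delays)) / total
    variance = sum(p * (t - mean_delay) ** 2 for p, t in zip(powers, delays)) / total
    return math.sqrt(variance)
```

For two equally strong paths arriving 2 time units apart, the mean delay lies halfway between them and the RMS delay spread is 1 time unit.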
Robust Tracking-by-Detection Using a Detector Confidence Particle Filter

We propose a novel approach for multi-person tracking-by-detection in a particle filtering framework. In addition to final high-confidence detections, our algorithm uses the continuous confidence of pedestrian detectors and online trained, instance-specific classifiers as a graded observation model. Thus, generic object category knowledge is complemented by instance-specific information. A main contribution of this paper is the exploration of how these unreliable information sources can be used for multi-person tracking. The resulting algorithm robustly tracks a large number of dynamically moving persons in complex scenes with occlusions, does not rely on background modeling, and operates entirely in 2D (requiring no camera or ground plane calibration). Our Markovian approach relies only on information from the past and is suitable for online applications. We evaluate the performance on a variety of datasets and show that it improves upon state-of-the-art methods.
Feature-Centric Efficient Subwindow Search

Many object detection systems rely on linear classifiers embedded in a sliding-window scheme. Such exhaustive search involves massive computation. Efficient Subwindow Search (ESS) [11] avoids this by means of branch and bound. However, ESS makes an unfavourable memory tradeoff. Memory usage scales with both image size and overall object model size. This risks becoming prohibitive in a multiclass system. In this paper, we make the connection between sliding-window and Hough-based object detection explicit. Then, we show that the feature-centric view of the latter also nicely fits with the branch and bound paradigm, while it avoids the ESS memory tradeoff. Moreover, on-line integral image calculations are not needed. Both theoretical and quantitative comparisons with the ESS bound are provided, showing that none of this comes at the expense of performance.
Markerless Reconstruction of Dynamic Facial Expressions

In this paper we combine methods from the field of computer vision with surface editing techniques to generate animated faces that are all in full correspondence with each other. The input to our system consists of synchronized video streams from multiple cameras. The system produces a sequence of triangle meshes with fixed connectivity, representing the dynamics of the captured face. Carefully taking all requirements and characteristics into account, we chose the following system design: we deform an initial face template using movements estimated from the video streams. To increase the robustness of the initial reconstruction, we use a morphable model as a shape prior. However, using an efficient Surfel Fitting technique, we are still able to precisely capture face shapes that are not part of the PCA model. In the deformation stage, we use a 2D mesh-based tracking approach to establish correspondences in time. We then reconstruct image samples in 3D using the same Surfel Fitting technique, and finally use the reconstructed points to robustly deform the initially reconstructed face.
Volume Conserving Simulation of Deformable Bodies

We present a new method for simulating volume-conserving deformable bodies using an impulse-based approach. In order to simulate a deformable body, a tetrahedral model is generated from an arbitrary triangle mesh. All resulting tetrahedra are assigned to volume constraints, which ensure the conservation of the total volume. To simulate such a constraint, impulses are computed and applied to the particles of the assigned tetrahedra. The algorithm is easy to implement and ensures exact volume conservation in each simulation step.
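The quantity that a volume constraint has to preserve can be computed directly from the tetrahedral model. The sketch below shows only this measurement, using the signed triple product, and does not reproduce the paper's impulse computation; the function names and the data layout (a vertex list plus index quadruples) are assumptions.

```python
def tet_volume(p0, p1, p2, p3):
    """Signed volume of a tetrahedron from the triple product of its edge vectors."""
    a = [p1[i] - p0[i] for i in range(3)]
    b = [p2[i] - p0[i] for i in range(3)]
    c = [p3[i] - p0[i] for i in range(3)]
    cross = [b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0]]
    return sum(a[i] * cross[i] for i in range(3)) / 6.0


def total_volume(vertices, tets):
    """Sum of signed volumes over all tetrahedra (index quadruples into vertices)."""
    return sum(tet_volume(*(vertices[i] for i in tet)) for tet in tets)


def relative_volume_error(vertices, tets, rest_volume):
    """Deviation of the current total volume from the rest volume, as a fraction."""
    return abs(total_volume(vertices, tets) - rest_volume) / abs(rest_volume)
```

A constraint solver would drive `relative_volume_error` toward zero at the end of every simulation step.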
@inproceedings{Diziol09,
author = {Raphael Diziol and Jan Bender and Daniel Bayer},
title = {Volume Conserving Simulation of Deformable Bodies},
booktitle = {Short Paper Proceedings of Eurographics},
year = {2009},
month = mar,
address = {Munich (Germany)}
}
Using Multi-View Recognition and Meta-data Annotation to Guide a Robot's Attention

In the transition from industrial to service robotics, robots will have to deal with increasingly unpredictable and variable environments. We present a system that is able to recognize objects of a certain class in an image and to identify their parts for potential interactions. The method can recognize objects from arbitrary viewpoints and generalizes to instances that have never been observed during training, even if they are partially occluded and appear against cluttered backgrounds. Our approach builds on the Implicit Shape Model of Leibe et al. (2008). We extend it to couple recognition to the provision of meta-data useful for a task and to the case of multiple viewpoints by integrating it with the dense multi-view correspondence finder of Ferrari et al. (2006). Meta-data can be part labels but also depth estimates, information on material types, or any other pixelwise annotation. We present experimental results on wheelchairs, cars, and motorbikes.
Robust Multi-Person Tracking from a Mobile Platform

In this paper, we address the problem of multi-person tracking in busy pedestrian zones using a stereo rig mounted on a mobile platform. The complexity of the problem calls for an integrated solution that extracts as much visual information as possible and combines it through cognitive feedback cycles. We propose such an approach, which jointly estimates camera position, stereo depth, object detection, and tracking. The interplay between those components is represented by a graphical model. Since the model has to incorporate object-object interactions and temporal links to past frames, direct inference is intractable. We therefore propose a two-stage procedure: for each frame we first solve a simplified version of the model (disregarding interactions and temporal continuity) to estimate the scene geometry and an overcomplete set of object detections. Conditioned on these results, we then address object interactions, tracking, and prediction in a second step. The approach is experimentally evaluated on several long and difficult video sequences from busy inner-city locations. Our results show that the proposed integration makes it possible to deliver robust tracking performance in scenes of realistic complexity.
An Intuitive Interface for Interactive High Quality Image-Based Modeling
(Proc. of Pacific Graphics 2009)

We present the design of an interactive image-based modeling tool that enables a user to quickly generate detailed, textured 3D models from a set of calibrated input images. Our main contribution is an intuitive user interface that is based entirely on simple 2D painting operations and requires neither technical expertise from the user nor difficult pre-processing of the input images. One central component of our tool is a GPU-based multi-view stereo reconstruction scheme, implemented as an incremental algorithm that runs in the background during user interaction so that the user does not notice any significant response delay.
GIzMOs: Genuine Image Mosaics with Adaptive Tiling

We present a method that splits an input image into a set of tiles. Each tile is then replaced by another image from a large database such that, when viewed from a distance, the original image is reproduced as well as possible. While the general concept of image mosaics is not new, we consider our results to be "genuine image mosaics" (GIzMOs for short) in the sense that the images from the database are not modified in any way. This is different from previous work, where the image tiles are usually color shifted or overlaid with the high-frequency content of the input image. In addition to a regular alignment of the tiles, we propose a greedy approach for adaptive tiling in which larger tiles are placed in homogeneous image regions. This avoids the visual periodicity induced by the equal spacing of the image tiles in the completely regular setting. Our overall system also addresses the cleaning of the image database by removing all unwanted images with no meaningful content. We apply image descriptors of varying sophistication to find the best matching image for each tile. For aesthetic and artistic reasons, we classify each tile as "feature" or "non-feature" and then apply a suitable image descriptor. In a user study we have verified that our descriptors lead to mosaics that are significantly more recognizable than those obtained with simple descriptors such as average color values.
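For reference, the simple average-color baseline that the user study compares against can be sketched as follows. This is not one of the paper's descriptors; the image representation (rows of RGB tuples) and the function names are assumptions made for the example.

```python
def average_color(image):
    """Mean RGB value of an image given as rows of (r, g, b) tuples."""
    pixels = [px for row in image for px in row]
    n = len(pixels)
    return tuple(sum(px[c] for px in pixels) / n for c in range(3))


def best_match(tile, database):
    """Index of the database image whose average color is closest to the tile's."""
    target = average_color(tile)

    def squared_distance(img):
        avg = average_color(img)
        return sum((a - b) ** 2 for a, b in zip(avg, target))

    return min(range(len(database)), key=lambda i: squared_distance(database[i]))
```

Because the database images are used unmodified, the quality of the mosaic depends entirely on how well this matching step selects them.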
A WebService employing this method is available.
Simulating Almost Incompressible Deformable Objects

We present a new method for simulating almost incompressible deformable objects. A tetrahedral model is used to represent and restore the volume during the simulation. The new constraint computes impulses in the one-ring of each vertex of the tetrahedral model in order to conserve the initial volume. With different parameters, the presented method can handle a large variety of deformation behaviors, ranging from stiff to large deformations and even plastic deformations. The algorithm is easy to implement and reduces the volume error to less than 1% in most situations, even when large deformations are applied.
@inproceedings{Diziol09,
author = {Raphael Diziol and Daniel Bayer and Jan Bender},
title = {Simulating Almost Incompressible Deformable Objects},
booktitle = {Virtual Reality Interactions and Physical Simulations (VRIPhys)},
year = {2009},
month = nov,
address = {Karlsruhe (Germany)},
pages = {31--37}
}
Optimized impulse-based dynamic simulation

The impulse-based dynamic simulation is a recent method for computing physically based simulations. It supports the simulation of rigid bodies and particles connected by all kinds of implicit constraints. In recent years the impulse-based dynamic simulation has increasingly been used to simulate deformable bodies as well. These simulations create new requirements for the runtime of the method, because very large systems of connected particles have to be simulated to obtain high-quality results. In this paper several runtime optimizations for the impulse-based dynamic simulation are presented. They allow the same simulations to be computed in a fraction of the time needed by the original method. As a result, larger systems or simulations with increased accuracy can be simulated in real time.
@inproceedings{Bayer09,
author = {Daniel Bayer and Raphael Diziol and Jan Bender},
title = {Optimized Impulse-Based Dynamic Simulation},
booktitle = {Virtual Reality Interactions and Physical Simulations (VRIPhys)},
year = {2009},
month = nov,
address = {Karlsruhe (Germany)},
pages = {125--133}
}
Dynamic simulation of inextensible cloth

In this paper an impulse-based method for cloth simulation is presented. The simulation of cloth is required in different application areas like computer animation, virtual reality or computer games. Simulation methods often assume that cloth is an elastic material. With this assumption the simulation can be performed very efficiently using spring forces. The problem is that many textiles cannot be stretched significantly. A realistic simulation of these textiles with spring forces leads to stiff differential equations which cause a deterioration of performance. The impulse-based method described in this paper solves this problem and allows the realistic simulation of inelastic textiles.
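To illustrate how an impulse can enforce inextensibility without a stiff spring, the sketch below corrects the velocities of two particles so that a single distance constraint holds at the end of the time step. It is a simplified illustration under stated assumptions (force-free motion during the step, one isolated constraint, a hypothetical function name), not the formulation used in the paper.

```python
import math


def distance_impulse(xa, xb, va, vb, ma, mb, rest_length, h):
    """Return corrected velocities so that |xb - xa| equals rest_length after time h.

    Assumption: both particles move with constant velocity during the step.
    """
    # Predicted relative position at the end of the step.
    diff = [(b + h * wb) - (a + h * wa) for a, b, wa, wb in zip(xa, xb, va, vb)]
    d = math.sqrt(sum(c * c for c in diff))
    if d == 0.0:
        return list(va), list(vb)  # direction undefined; leave velocities unchanged
    n = [c / d for c in diff]
    error = d - rest_length
    # Impulse magnitude that removes the predicted length error within one step.
    magnitude = (error / h) / (1.0 / ma + 1.0 / mb)
    new_va = [v + magnitude * c / ma for v, c in zip(va, n)]
    new_vb = [v - magnitude * c / mb for v, c in zip(vb, n)]
    return new_va, new_vb
```

In a cloth mesh many such constraints share particles, so a solver would typically iterate over all of them until the remaining errors fall below a tolerance.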
@article{Bender2009,
author = {Jan Bender and Daniel Bayer and Raphael Diziol},
title = {Dynamic simulation of inextensible cloth},
journal = {IADIS International Journal on Computer Science and Information Systems},
volume = {4},
number = {2},
year = {2009},
pages = {86--102}
}
PRISM: PRincipled Implicit Shape Model

This paper addresses the problem of object detection by means of the Generalised Hough transform paradigm. The Implicit Shape Model (ISM) is a well-known approach based on this idea. It made this paradigm popular and has been adopted many times. Although the algorithm exhibits robust detection performance, its description, i.e. its probabilistic model, involves arguments which are unsatisfactory from a probabilistic standpoint. We propose a framework which overcomes these problems and gives a sound justification to the voting procedure. Furthermore, our framework allows for a formal understanding of the heuristic of soft-matching commonly used in visual vocabulary systems. We show that it is sufficient to use soft-matching during learning only and to perform fast nearest neighbour matching at recognition time (where speed is of prime importance). Our implementation is based on Gaussian Mixture Models (instead of kernel density estimators as with ISM) which lead to a fast gradient-based object detector.
Shape-from-Recognition: Recognition Enables Meta-Data Transfer

Low-level cues in an image not only allow higher-level information, such as the presence of an object, to be inferred; the inverse is also true. Category-level object recognition has now reached a level of maturity and accuracy that allows its output to be fed back successfully to other processes. This is what we refer to as cognitive feedback. In this paper, we study one particular form of cognitive feedback, where the ability to recognize objects of a given category is exploited to infer different kinds of meta-data annotations for images of previously unseen object instances, in particular information on 3D shape. Meta-data can be discrete, real-valued, or vector-valued. Our approach builds on the Implicit Shape Model of Leibe and Schiele [1] and extends it to transfer annotations from training images to test images. We focus on the inference of approximate 3D shape information about objects in a single 2D image. In experiments, we illustrate how our method can infer depth maps, surface normals, and part labels for previously unseen object instances.
Moving Obstacle Detection in Highly Dynamic Scenes

We address the problem of vision-based multi-person tracking in busy pedestrian zones using a stereo rig mounted on a mobile platform. Specifically, we are interested in the application of such a system for supporting path planning algorithms in the avoidance of dynamic obstacles. The complexity of the problem calls for an integrated solution, which extracts as much visual information as possible and combines it through cognitive feedback. We propose such an approach, which jointly estimates camera position, stereo depth, object detections, and trajectories based only on visual information. The interplay between these components is represented in a graphical model. For each frame, we first estimate the ground surface together with a set of object detections. Based on these results, we then address object interactions and estimate trajectories. Finally, we employ the tracking results to predict future motion for dynamic objects and fuse this information with a static occupancy map estimated from dense stereo. The approach is experimentally evaluated on several long and challenging video sequences from busy inner-city locations recorded with different mobile setups. The results show that the proposed integration makes stable tracking and motion prediction possible, and thereby enables path planning in complex and highly dynamic scenes.
A Sketching Interface for Feature Curve Recovery of Free-Form Surfaces

In this paper, we present a semi-automatic approach to efficiently and robustly recover the characteristic feature curves of a given free-form surface. The technique supports a sketch-based interface where the user just has to roughly sketch the location of a feature by drawing a stroke directly on the input mesh. The system then snaps this initial curve to the correct position based on a graph-cut optimization scheme that takes various surface properties into account. Additional position constraints can be placed and modified manually which allows for an interactive feature curve editing functionality. We demonstrate the usefulness of our technique by applying it to a practical problem scenario in reverse engineering. Here, we consider the problem of generating a statistical (PCA) shape model for car bodies. The crucial step is to establish proper feature correspondences between a large number of input models. Due to the significant shape variation, fully automatic techniques are doomed to failure. With our simple and effective feature curve recovery tool, we can quickly sketch a set of characteristic features on each input model which establishes the correspondence to a pre-defined template mesh and thus allows us to generate the shape model. Finally, we can use the feature curves and the shape model to implement an intuitive modeling metaphor to explore the shape space spanned by the input models.
Impulse-based dynamic simulation on the GPU

In this paper a new, efficient method for dynamic simulation on the GPU is presented. The method is based on an impulse-based approach, which is an ideal candidate for simulation on limited hardware due to its simplicity. The proposed method shows how the impulse-based dynamic simulation can benefit from the highly parallel structure of the GPU without suffering too much from its limitations. This is achieved by a new way of solving constraints. Most parts of the actual computation can be done in parallel, using only a small number of operations. This allows the implementation to run on a wide range of graphics boards.
@inproceedings{Bayer09,
author = {Daniel Bayer and Jan Bender and Raphael Diziol},
title = {Impulse-based dynamic simulation on the GPU},
booktitle = {Computer Graphics and Visualization (CGV 2009) - IADIS Multi Conference on Computer Science and Information Systems},
year = {2009},
month = jun,
address = {Algarve (Portugal)}
}
In-hand Scanning with Online Loop Closure

We present a complete 3D in-hand scanning system that allows users to scan objects by simply turning them freely in front of a real-time 3D range scanner. The 3D object model is reconstructed online as a point cloud by registering and integrating the incoming 3D patches with the online 3D model. The accumulation of registration errors leads to the well-known loop closure problem. We address this issue already during the scanning session by distorting the object as rigidly as possible. Scanning errors are removed by explicitly handling outliers. As a result of our proposed online modeling and error handling procedure, the online model is of sufficiently high quality to serve as the final model. Thus, no additional post-processing is required which might lead to artifacts in the model reconstruction. We demonstrate our approach on several difficult real-world objects and quantitatively evaluate the resulting modeling accuracy.
Markovian Tracking-by-Detection from a Single, Uncalibrated Camera

We present an algorithm for multi-person tracking-by-detection in a particle filtering framework. To address the unreliability of current state-of-the-art object detectors, our algorithm tightly couples object detection, classification, and tracking components. Instead of relying only on the final, sparse output from a detector, we additionally employ its continuous intermediate output to impart our approach with more flexibility to handle difficult situations. The resulting algorithm robustly tracks a variable number of dynamically moving persons in complex scenes with occlusions. The approach does not rely on background modeling and is based only on 2D information from a single camera, not requiring any camera or ground plane calibration. We evaluate the algorithm on the PETS’09 tracking dataset and discuss the importance of the different algorithm components to robustly handle difficult situations.
Improved Multi-Person Tracking with Active Occlusion Handling

We address the problem of vision-based multi-person tracking in busy inner-city locations using a stereo rig mounted on a mobile platform. Specifically, we are interested in the application of such a system for autonomous navigation and path planning. In such a scenario, semantic information about the moving scene objects becomes important. In order to estimate this robustly, we combine classical geometric world mapping with multi-person detection and tracking. In this paper, we refine an approach presented in earlier work, which jointly estimates camera position, stereo depth, object detections, and trajectories based only on visual information. We analyze the influence of the trajectory generator, which forms part of any tracking-by-detection system, and propose a set of measures to improve its performance. The extensions are experimentally evaluated on challenging, realistic video sequences recorded at busy inner-city locations. The results show that the proposed extensions significantly improve overall system performance, making the resulting detection and tracking capabilities an interesting component of future navigation systems for highly dynamic scenes.
A Framework for Geometry Processing based on Hybrid Surface Representations
We present a framework that allows for the composition of custom-tailored data structures for hybrid representations of geometry and supports the development of associated geometry processing methods. Among other things, a novel hybrid approach for the evaluation of Boolean expressions on polygon meshes is elaborated in this context.