Former staff member of the institute
Chair of Photogrammetry
Nussallee 15
53115 Bonn


E-Mail:


Research Associate

Curriculum Vitae


Research Interests

  • Graphical Models
  • Image Segmentation and Classification
  • Probabilistic Discriminative Models
  • Supervised Learning

Project

Semi-automatic generation of highly detailed textured building models
Part of the Sino-German bundle project "Interoperation of 3D Urban Geoinformation" of the German Research Foundation (DFG)


Publications

2011

Michael Ying Yang and Wolfgang Förstner, "Feature Evaluation for Building Facade Images - An Empirical Study", International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXIX-B3, pp. 513-518. 2011.

The classification of building facade images is a challenging problem that receives a great deal of attention in the photogrammetry community. Image classification depends critically on the features. In this paper, we perform an empirical feature evaluation for building facade images. The feature sets we choose are basic features, color features, histogram features, Peucker features, texture features, and SIFT features. We present an approach for region-wise labeling using an efficient randomized decision forest classifier and local features. We conduct our experiments with building facade image classification on the eTRIMS dataset, where our focus is the object classes building, car, door, pavement, road, sky, vegetation, and window.

@article{Yang2011Feature,
  author = {Yang, Michael Ying and F\"orstner, Wolfgang},
  title = {Feature Evaluation for Building Facade Images - An Empirical Study},
  journal = {International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences},
  year = {2011},
  volume = {XXXIX-B3},
  pages = {513--518},
  doi = {10.5194/isprsarchives-XXXIX-B3-513-2012}
}

Michael Ying Yang and Wolfgang Förstner, "A Hierarchical Conditional Random Field Model for Labeling and Classifying Images of Man-made Scenes", In International Conference on Computer Vision, IEEE/ISPRS Workshop on Computer Vision for Remote Sensing of the Environment. 2011.

Semantic scene interpretation as a collection of meaningful regions in images is a fundamental problem in both photogrammetry and computer vision. Images of man-made scenes exhibit strong contextual dependencies in the form of spatial and hierarchical structures. In this paper, we introduce a hierarchical conditional random field to deal with the problem of image classification by modeling spatial and hierarchical structures. The probability outputs of an efficient randomized decision forest classifier are used as unary potentials. The spatial and hierarchical structures of the regions are integrated into pairwise potentials. The model is built on multi-scale image analysis in order to aggregate evidence from the local to the global level. Experimental results are provided to demonstrate the performance of the proposed method using images from the eTRIMS dataset, where our focus is the object classes building, car, door, pavement, road, sky, vegetation, and window.
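
As a rough illustration of the kind of model this abstract describes, an energy with unary, spatial pairwise, and hierarchical pairwise terms could be written as below. The symbols are assumptions made for this sketch and are not taken from the paper: V is the set of regions, N the set of spatially neighboring region pairs, H the set of parent-child pairs across scales, d the image data, and alpha, beta hypothetical weights.

```latex
% Generic sketch of a hierarchical CRF energy; notation is illustrative only.
P(\mathbf{x} \mid \mathbf{d}) = \frac{1}{Z(\mathbf{d})} \exp\bigl(-E(\mathbf{x} \mid \mathbf{d})\bigr),
\qquad
E(\mathbf{x} \mid \mathbf{d})
  = \sum_{i \in V} E_1(x_i \mid \mathbf{d})
  + \alpha \sum_{(i,j) \in N} E_2(x_i, x_j \mid \mathbf{d})
  + \beta \sum_{(i,k) \in H} E_3(x_i, x_k \mid \mathbf{d})

% One common choice for the unary term uses the classifier's class probabilities:
E_1(x_i \mid \mathbf{d}) = -\log P_{\mathrm{RDF}}(x_i \mid \mathbf{d}),
\qquad
\mathbf{x}^{*} = \arg\min_{\mathbf{x}} E(\mathbf{x} \mid \mathbf{d})
```

Under these assumptions, minimizing the energy is equivalent to finding the most probable labeling of all regions.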

@inproceedings{Yang2011Hierarchicala,
  author = {Yang, Michael Ying and F\"orstner, Wolfgang},
  title = {A Hierarchical Conditional Random Field Model for Labeling and Classifying Images of Man-made Scenes},
  booktitle = {International Conference on Computer Vision, IEEE/ISPRS Workshop on Computer Vision for Remote Sensing of the Environment},
  year = {2011},
  doi = {10.1109/ICCVW.2011.6130243}
}

Michael Ying Yang and Wolfgang Förstner, "Regionwise Classification of Building Facade Images", In Photogrammetric Image Analysis (PIA2011), pp. 209-220. Springer. 2011.

In recent years, the task of classifying building facade images has received a great deal of attention in the photogrammetry community. In this paper, we present an approach for regionwise classification using an efficient randomized decision forest classifier and local features. A conditional random field is then introduced to enforce spatial consistency between neighboring regions. Experimental results are provided to illustrate the performance of the proposed methods using images from the eTRIMS database, where our focus is the object classes building, car, door, pavement, road, sky, vegetation, and window.

@inproceedings{Yang2011Regionwise,
  author = {Yang, Michael Ying and F\"orstner, Wolfgang},
  title = {Regionwise Classification of Building Facade Images},
  booktitle = {Photogrammetric Image Analysis (PIA2011)},
  publisher = {Springer},
  year = {2011},
  pages = {209--220},
  note = {Stilla, Uwe / Rottensteiner, Franz / Mayer, H. / Jutzi, Boris / Butenuth, Matthias (eds.); Munich},
  doi = {10.1007/978-3-642-24393-6_18}
}

Michael Ying Yang, "Hierarchical and Spatial Structures for Interpreting Images of Man-made Scenes Using Graphical Models". Thesis at: Institute of Photogrammetry, University of Bonn. 2011.

Summary
The task of semantic scene interpretation is to label the regions of an image and their relations into meaningful classes. This task is a key ingredient of many computer vision applications, including object recognition, 3D reconstruction, and robotic perception. It is challenging partly because of the ambiguities inherent in the image data. Images of man-made scenes, e.g. building facade images, exhibit strong contextual dependencies in the form of spatial and hierarchical structures. Modelling these structures is central to this interpretation task. Graphical models provide a consistent framework for the statistical modelling. Bayesian networks and random fields are two popular types of graphical models, which are frequently used for capturing such contextual information. The motivation for our work comes from the belief that we can find a generic formulation for scene interpretation that has the benefits of both random fields and Bayesian networks. It should have clear semantic interpretability. Our key contribution is therefore the development of a generic statistical graphical model for scene interpretation, which seamlessly integrates different types of image features with the spatial and hierarchical structural information defined over the multi-scale image segmentation. It unifies the ideas of existing approaches, e.g. the conditional random field (CRF) and the Bayesian network (BN), and has a clear statistical interpretation as the maximum a posteriori (MAP) estimate of a multi-class labelling problem. Given the graphical model structure, we derive the probability distribution of the model based on the factorization property implied in the model structure. The statistical model leads to an energy function that can be optimized approximately by either loopy belief propagation or a graph-cut-based move-making algorithm.
The particular types of features, spatial structure, and hierarchical structure, however, are not prescribed. In the experiments, we concentrate on terrestrial man-made scenes as a particularly difficult problem. We demonstrate the application of the proposed graphical model to the task of multi-class classification of building facade image regions. By incorporating the spatial and hierarchical structures, the framework for scene interpretation allows for significantly better classification results on man-made scenes than the standard local classification approach. We investigate the performance of the algorithms on a public dataset to show the relative importance of the information from the spatial structure and the hierarchical structure. As a baseline for the region classification, we use an efficient randomized decision forest classifier. Two specific models are derived from the proposed graphical model, namely the hierarchical CRF and the hierarchical mixed graphical model. We show that these two models produce better classification results than both the baseline region classifier and the flat CRF.
Zusammenfassung (German summary, translated)
The goal of semantic image interpretation is to label image regions and their mutual relations and to assign them to meaningful classes. This is one of the main tasks in many areas of computer vision, such as object recognition, 3D reconstruction, and robotic perception. Images of man-made scenes in particular, e.g. facade images, are characterized by strong spatial and hierarchical structures. Modelling these structures is a central part of the interpretation, and graphical models are a suitable, consistent tool for their statistical modelling. Bayesian networks and random fields are two well-known and frequently used examples of graphical models for capturing context-dependent information. The motivation for this work lies in the conviction that we can find a generic formulation of image interpretation with a clear semantic meaning that combines the advantages of Bayesian networks and random fields. The main contribution of this thesis is therefore the development of a generic statistical graphical model for image interpretation, which integrates a wide variety of image feature types together with the spatial and hierarchical structural information via a multi-scale image segmentation. The model unifies the ideas underlying existing work, such as conditional random fields (CRF) and Bayesian networks (BN). It has a clear statistical interpretation as the maximum a posteriori (MAP) estimate of a multi-class labelling problem. Given the structure of the graphical model and the factorization properties it defines, we derive the probability distribution of the model. This leads to an energy function that can be optimized approximately.
The particular type of image features and the spatial and hierarchical structure are independent of this formulation. We show the application of the proposed graphical model to the multi-class labelling of image regions in facade images. We demonstrate that, by taking spatial and hierarchical structures into account, the proposed method for image interpretation yields significantly better classification results than classical local classification methods. The performance of the proposed method is evaluated on a publicly available dataset. To classify the image regions, we use a method based on an efficient random forest classifier. Two specific models are derived from the proposed general graphical model: a hierarchical conditional random field (hierarchical CRF) and a hierarchical mixed graphical model. We show that both models produce better classification results than the underlying local classifiers or the plain conditional random fields.

@phdthesis{Yang2011Hierarchical,
  author = {Yang, Michael Ying},
  title = {Hierarchical and Spatial Structures for Interpreting Images of Man-made Scenes Using Graphical Models},
  school = {Institute of Photogrammetry, University of Bonn},
  year = {2011}
}

2010

Michael Ying Yang and Wolfgang Förstner and Martin Drauschke, "Hierarchical Conditional Random Field for Multi-class Image Classification", In International Conference on Computer Vision Theory and Applications (VISAPP), pp. 464-469. 2010.

Multi-class image classification has made significant advances in recent years through the combination of local and global features. This paper proposes a novel approach called hierarchical conditional random field (HCRF) that explicitly models the region adjacency graph and the region hierarchy graph structure of an image. This makes it possible to set up a joint, hierarchical model of local and global discriminative methods that extends the conditional random field to a multi-layer model. The region hierarchy graph is based on a multi-scale watershed segmentation.

@inproceedings{Yang2010Hierarchical,
  author = {Yang, Michael Ying and F\"orstner, Wolfgang and Drauschke, Martin},
  title = {Hierarchical Conditional Random Field for Multi-class Image Classification},
  booktitle = {International Conference on Computer Vision Theory and Applications (VISAPP)},
  year = {2010},
  pages = {464--469}
}

Michael Ying Yang and Yanpeng Cao and Wolfgang Förstner and John McDonald, "Robust wide baseline scene alignment based on 3D viewpoint normalization", In International Conference on Advances in Visual Computing, pp. 654-665. Springer-Verlag. 2010.

This paper presents a novel scheme for automatically aligning two widely separated 3D scenes via the use of viewpoint invariant features. The key idea of the proposed method is as follows. First, a number of dominant planes are extracted in the SfM 3D point cloud using a novel method integrating RANSAC and MDL to describe the underlying 3D geometry in urban settings. With respect to the extracted 3D planes, the original camera viewing directions are rectified to form the front-parallel views of the scene. Viewpoint invariant features are extracted on the canonical views to provide a basis for further matching. Compared to conventional 2D feature detectors (e.g. SIFT, MSER), the resulting features have the following advantages: (1) they are very discriminative and robust to perspective distortions and viewpoint changes because they exploit scene structure; (2) the features contain useful local patch information, which allows for efficient feature matching. Using the novel viewpoint invariant features, wide-baseline 3D scenes are automatically aligned in terms of robust image matching. The performance of the proposed method is comprehensively evaluated in our experiments. It is demonstrated that 2D image feature matching can be significantly improved by considering 3D scene structure.

@inproceedings{Yang2010Robust,
  author = {Yang, Michael Ying and Cao, Yanpeng and F\"orstner, Wolfgang and McDonald, John},
  title = {Robust wide baseline scene alignment based on 3D viewpoint normalization},
  booktitle = {International Conference on Advances in Visual Computing},
  publisher = {Springer-Verlag},
  year = {2010},
  pages = {654--665},
  doi = {10.1007/978-3-642-17289-2_63}
}

Michael Ying Yang and Wolfgang Förstner, "Plane Detection in Point Cloud Data" (TR-IGG-P-2010-01). 2010.

Plane detection is a prerequisite to a wide variety of vision tasks. The RANdom SAmple Consensus (RANSAC) algorithm is widely used for plane detection in point cloud data. The minimum description length (MDL) principle is used to deal with several competing hypotheses. This paper presents a new approach to plane detection by integrating RANSAC and MDL. The method can avoid detecting wrong planes due to the complex geometry of the 3D data. The paper tests the performance of the proposed method on both synthetic and real data.
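
For readers unfamiliar with the first ingredient mentioned in this abstract, the following is a minimal sketch of plain RANSAC plane fitting in pure Python. It is not the method of the report: the MDL-based selection among competing hypotheses is not shown, and the function names, the distance threshold, and the iteration count are assumptions chosen for illustration.

```python
import random


def fit_plane(p, q, r):
    """Return (unit normal n, offset d) with n.x + d = 0, or None if collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # The cross product of two in-plane vectors gives the plane normal.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None  # degenerate sample: the three points are collinear
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Return the plane with the most inliers and the indices of those inliers."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same result
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = fit_plane(*rng.sample(points, 3))
        if model is None:
            continue
        n, d = model
        # A point is an inlier if its distance to the plane is within the threshold.
        inliers = [
            i for i, pt in enumerate(points)
            if abs(n[0] * pt[0] + n[1] * pt[1] + n[2] * pt[2] + d) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

To extract several planes with this sketch, one would typically remove the inliers of the best plane and repeat; deciding when to stop and which hypotheses to keep is where a criterion such as MDL comes in.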

@techreport{Yang2010Plane,
  author = {Yang, Michael Ying and F\"orstner, Wolfgang},
  title = {Plane Detection in Point Cloud Data},
  year = {2010},
  number = {TR-IGG-P-2010-01}
}

2009

Jörg Schmittwilken and Michael Ying Yang and Wolfgang Förstner and Lutz Plümer, "Integration of conditional random fields and attribute grammars for range data interpretation of man-made objects", Annals of GIS. Vol. 15(2), pp. 117-126. 2009.

A new concept for the integration of low- and high-level reasoning for the interpretation of images of man-made objects is described. The focus is on the 3D reconstruction of facades, especially the transition area between buildings and the surrounding ground. The aim is the identification of semantically meaningful objects such as stairs, entrances, and windows. A low-level module based on the random sample consensus (RANSAC) algorithm generates planar polygonal patches. Conditional random fields (CRFs) are used for their classification, based on local neighborhood and priors from the grammar. An attribute grammar is used to represent semantic knowledge including object partonomy and observable geometric constraints. The AND-OR tree-based parser uses the precision of the classified patches to control the reconstruction process and to optimize the sampling mechanism of RANSAC. Although CRFs are close to data, attribute grammars make the high-level structure of objects explicit and translate semantic knowledge in observable geometric constraints. Our approach combines top-down and bottom-up reasoning by integrating CRF and attribute grammars and thus exploits the complementary strengths of these methods.

@article{Schmittwilken2009Integration,
  author = {Schmittwilken, J\"org and Yang, Michael Ying and F\"orstner, Wolfgang and Pl\"umer, Lutz},
  title = {Integration of conditional random fields and attribute grammars for range data interpretation of man-made objects},
  journal = {Annals of GIS},
  year = {2009},
  volume = {15},
  number = {2},
  pages = {117--126},
  doi = {10.1080/19475680903464696}
}

Ying Yang, "Remote sensing image registration via active contour model", International Journal of Electronics and Communications. Vol. 65, pp. 227-234. 2009.

Image registration is the process by which we determine a transformation that provides the most accurate match between two images. The search for the matching transformation can be automated with the use of a suitable metric, but it can be very time-consuming and tedious. In this paper, we introduce a registration algorithm that combines active contour segmentation with mutual information. Our approach starts with a segmentation procedure. It is formed by a novel geometric active contour, which incorporates edge knowledge, namely Edgeflow, into the active contour model. Two edgemap images filled with closed contours are obtained. After ruling out mismatched curves, we use mutual information (MI) as a similarity measure to register the two edgemap images. Experimental results are provided to illustrate the performance of the proposed registration algorithm using both synthetic and multisensor images. Quantitative error analysis is also provided and several images are shown for subjective evaluation.
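
The similarity measure named in this abstract can be illustrated with a short sketch. The function below computes the mutual information, in bits, between two equal-length sequences of discrete pixel values (for example, two flattened binary edge maps). The function name and the flattened-list input format are assumptions made for this example; the paper's own implementation is not described here.

```python
from collections import Counter
from math import log2


def mutual_information(a, b):
    """Mutual information in bits between two equal-length discrete sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    joint = Counter(zip(a, b))  # counts of co-occurring value pairs
    count_a = Counter(a)        # marginal counts for the first image
    count_b = Counter(b)        # marginal counts for the second image
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_a[x] * count_b[y]).
        mi += (c / n) * log2(c * n / (count_a[x] * count_b[y]))
    return mi
```

In a registration loop, one would evaluate this measure for each candidate transformation of one edge map and keep the transformation with the highest value.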

@article{Yang2009Remote,
  author = {Yang, Ying},
  title = {Remote sensing image registration via active contour model},
  journal = {International Journal of Electronics and Communications},
  year = {2009},
  volume = {65},
  pages = {227--234},
  doi = {10.1016/j.aeue.2008.01.003}
}

Michael Ying Yang, "Multiregion Level-set Segmentation of Synthetic Aperture Radar Images", In IEEE International Conference on Image Processing. Cairo, pp. 1717-1720. 2009.

Due to the presence of speckle, segmentation of SAR images is generally acknowledged as a difficult problem. Considerable effort has been made to cope with the influence of speckle noise on image segmentation, for example through edge detection or direct global segmentation. Recent works address this problem by using statistical image representation and deformable models. We suggest a novel variational approach to SAR image segmentation, which consists of minimizing a functional containing an original observation term derived from the maximum a posteriori (MAP) estimation framework and a Gamma image representation. The minimization is carried out efficiently by a new multiregion method which embeds a simple partition assumption directly in curve evolution to guarantee a partition of the image domain from an arbitrary initial partition. Experiments on both synthetic and real images show the effectiveness of the proposed method.

@inproceedings{Yang2009Multiregion,
  author = {Yang, Michael Ying},
  title = {Multiregion Level-set Segmentation of Synthetic Aperture Radar Images},
  booktitle = {IEEE International Conference on Image Processing},
  year = {2009},
  pages = {1717--1720},
  doi = {10.1109/ICIP.2009.5413378}
}

Misc

  • December 2009: Participation in the Sino-German Workshop 2009 - “Dynamic Maps”, Wuhan, China.
  • October 2009: Participation in the Bonn Vision Workshop.
  • August 2009: One-week Vision and Sports Summer School, Zurich, Switzerland.
  • June 2009: Participation in the Sino-German Workshop, Hanover, Germany.
  • Since June 2009: Member of the Theodor-Brinkmann-Graduate School.