Luka Fuerst (2013) Graph Grammar Parsing and Induction. PhD thesis.
Abstract
Graph grammars are graph replacement systems and can therefore be regarded as a generalization of the well-known string grammars. In slightly simplified terms, a graph grammar is composed of a set of initial graphs (axioms) and a set of replacement rules (productions). Every production specifies a possible replacement of a graph, or a part of it, with another graph or graph part. Applying a production to a given graph means transforming the graph by the replacement that the production specifies. Just as a string grammar can be used to define the syntax of a formal string language (e.g., a programming language), a graph grammar can be employed to define the syntax of a graph language. The language determined by a given graph grammar is the set comprising the grammar's axioms and all graphs that can be obtained by applying the grammar's productions to the axioms and to other graphs from the language.
First contribution. A parser is an algorithm that determines whether a given graph G belongs to the language of a given graph grammar GG. If this is the case, then the parser also produces a derivation of the graph G in the grammar GG, i.e., a sequence of production applications that gradually transforms one of the grammar's axioms into the graph G. In contrast to string grammar parsing, graph grammar parsing is a relatively little-known and under-researched problem. Our first scientific contribution presented in this thesis is an improved version of the graph grammar parser proposed by J. Rekers and A. Schürr in the Journal of Visual Languages and Computing (1997, issue 1). We improved the parser in five ways: one improvement expands the class of grammars accepted by the parser, whereas the other four increase the parser's time and space efficiency. The improved parser is employed in our graph grammar induction method, which is also presented in this thesis.
Second contribution. A graph grammar may be viewed as a compact description of the set of graphs belonging to its language. Our second scientific contribution deals with the problem of graph grammar induction based on a given set of graphs, each of which may be labeled 'positive' or 'negative'. We propose an original algorithm for the construction of a graph grammar that meaningfully generalizes the given graph set. Our induction algorithm starts with a grammar whose language comprises exactly the positive input graphs and none of the negative ones. It then gradually builds sets of progressively more general grammars whose languages still do not contain any negative input graph. Guided by Ockham's razor, the algorithm outputs the smallest grammar created in the process. The consistency of individual grammars with the input graph set is verified using the graph grammar parser described in this thesis.
Third contribution. Our third scientific contribution pertains to the field of domain-specific modeling. The fundamental concepts in this field are the model, which represents a set of entities and their relations in a given modeling domain, and the metamodel, which specifies the set of valid models in the domain. Metamodels and models are often represented by UML class and object diagrams, respectively. Since UML object diagrams take the form of ordinary graphs, it can be said that a metamodel, like a graph grammar, describes a set of graphs. In contrast to a graph grammar, a metamodel describes its graph set declaratively rather than generatively: it defines the properties that all valid models must possess, but not the rules for the automatic generation of valid models. However, such generative rules can be obtained by converting a given metamodel into a graph grammar that defines the same graph set as the metamodel. In the thesis, we present an original metamodel-to-graph-grammar conversion procedure and show how the output graph grammar can be used for semantic analysis or transformation of valid models.
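To make the definition of a graph language concrete, the following minimal Python sketch enumerates a bounded fragment of the language of a toy graph grammar. It is an illustration only and is not the parser or the induction algorithm of the thesis: the graph encoding, the function names (`canonical`, `subdivide`, `language`) and the example grammar (one axiom consisting of a single edge, one production that subdivides an edge) are all assumptions made for this example. Isomorphic graphs are identified by a brute-force canonical form, so the sketch is only practical for very small graphs, and the size bound is a sound cutoff only because the assumed production never shrinks a graph.

```python
from itertools import permutations


def canonical(n, edges):
    """Return (n, smallest relabelled edge tuple) over all node permutations.

    Brute force, so suitable for very small graphs only.
    """
    best = None
    for perm in permutations(range(n)):
        relabelled = tuple(sorted(
            tuple(sorted((perm[u], perm[v]))) for u, v in edges
        ))
        if best is None or relabelled < best:
            best = relabelled
    return (n, best)


def subdivide(graph):
    """Hypothetical production: replace one edge u-v by a path u-w-v,
    where w is a fresh node. Yields one result per edge of the graph."""
    n, edges = graph
    for (u, v) in edges:
        rest = [e for e in edges if e != (u, v)]
        yield canonical(n + 1, rest + [(u, n), (n, v)])


def language(axioms, productions, max_nodes):
    """All graphs with at most max_nodes nodes that can be obtained from the
    axioms by repeatedly applying the productions (up to isomorphism)."""
    seen = {canonical(n, edges) for n, edges in axioms}
    frontier = list(seen)
    while frontier:
        graph = frontier.pop()
        for production in productions:
            for result in production(graph):
                if result[0] <= max_nodes and result not in seen:
                    seen.add(result)
                    frontier.append(result)
    return seen


# Toy grammar: the axiom is a single edge; the production subdivides an edge.
axioms = [(2, [(0, 1)])]
fragment = language(axioms, [subdivide], max_nodes=5)

# Membership test by exhaustive enumeration (not by parsing).
path_on_three_nodes = canonical(3, [(0, 1), (1, 2)])
triangle = canonical(3, [(0, 1), (1, 2), (0, 2)])
print(path_on_three_nodes in fragment, triangle in fragment)
```

Under these assumptions the toy language consists of simple paths, so a membership test reduces to a lookup in the enumerated set. A parser in the sense described above avoids such exhaustive enumeration and additionally returns a derivation of the input graph.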