Efficient Enumeration of Path Query Results for Graph Databases


The main objective of the proposed research project is the development and theoretical investigation of efficient evaluation (especially enumeration) algorithms for path queries for graph databases (which, in a formal setting, are finite, directed, edge-labelled graphs), both in the static setting (i.e., the database is interpreted as a static object) and the dynamic setting (i.e., the database can be updated by insertions and deletions). Our focus is on path query languages that use regular expressions (and variants thereof) as their main navigational feature. The quality of the algorithms (i.e., the asymptotic worst-case running-time) is measured in the preprocessing-time and the delay between the enumeration of two consecutive elements (and, in the dynamic case, also the required update-time), and with respect to the data complexity, i.e., only depending on the size of the database, whereas the size of the query is considered as constant. We aim at polynomial preprocessing-time with low-degree running-time polynomials, constant delay and polylogarithmic, sublinear, or linear update-times. If, for some classes of path queries, we are unable to achieve these strong requirements, we try to substantiate this with lower complexity-bounds (in terms of fine-grained complexity); in general, we concentrate on finding tractable subclasses of path queries and to prove respective dichotomy results.
A conceptional focus will be on the applicability of algorithmic techniques of formal language theory, automata theory, combinatorics on words, and string-algorithmics; in this regard, we stress the fact that, in our formalisation, graph databases coincide with nondeterministic finite automata without initial state or accepting states. The database theory literature already provides elementary automaton-based algorithmic approaches to the evaluation of (regular expression based) path queries, but a thorough and systematic investigation of this connection seems to be still in its beginnings. In particular, a study of the potential of the numerous algorithmic techniques, combinatorial results, and data structures, that are successfully applied in the field of string-algorithmics, is still in a very rudimentary state. Moreover, the dynamic case of path query evaluation naturally leads to questions about nondeterministic automata as dynamic data structures, which is an aspect that so far has found little attention in classical automata theory.

Principal investigators
Schmid, Markus Ph.D. (Details) (Theoretical Computer Science)

Participating organisational units of HU Berlin

Financer
DFG: Eigene Stelle (Sachbeihilfe)

Duration of project
Start date: 08/2019
End date: 01/2023

Research Areas
Theoretical Computer Science

Research Areas
Informatik

Last updated on 2023-20-12 at 05:30