Moshe Y. Vardi
University Professor and George Distinguished Service Professor in Computational Engineering at Rice University, Fellow for Science and Technology Policy at the Baker Institute for Public Policy, Houston, Texas, USA
Title: A Theory of Regular Queries
Abstract: A major theme in relational database theory is navigating the tradeoff between expressiveness and tractability for query languages, where the query-containment problem is considered a benchmark of tractability. The query class UCQ, consisting of unions of conjunctive queries, is a fragment of first-order logic that has a decidable query containment problem, but its expressiveness is limited. Extending UCQ with recursion yields Datalog, an expressive query language that has been studied extensively and has recently become popular in application areas such as declarative net working. Unfortunately, Datalog has an undecidable query containment problem. Identifying a fragment of Datalog that is expressive enough for applications but has a decid able query-containment problem was an open problem for several years.
In the area of graph databases, there has been a similar search for a query language that combines expressiveness and tractability. Because of the need to navigate along graph paths of unspecified length, transitive closure has been considered a fundamental operation. Query classes of increasing complexity using the operations of disjunction, conjunction, projection, and transitive closure have been studied, but the classes lacked natural closure properties. The class RQof regular queries emerged about a decade ago as a natural query class that is closed under all of its operations and has a decidable query-containment problem.
RQ turned out to be a fragment of Datalog where recursion can be used only to express transitive closure. Further more, it turns out that applying this idea to Datalog, that is, restricting recursion to the expression of transitive closure, does yield the long-sought goal an expressive fragment of Datalog with a decidable query-optimization problem.
Phokion G. Kolaitis
University of California Santa Cruz and IBM Research
Title: Local Consistency vs. Global Consistency on Annotated Data
Abstract: In several different settings, one comes across situations in which the objects of study are locally consistent but globally inconsistent. Earlier work in probability theory in the 1960s and in relational database theory in the 1980s produced characterizations of when local consistency implies global consistency. The aim of this talk is to present a unifying framework for investigating the interplay between local and global consistency by considering K-relations, i.e., relations in which every tuple is annotated with a value from some positive monoid K. The main result characterizes the positive monoids K for which every pairwise consistent collection of K-relations over an acyclic schema is globally consistent; special cases include the earlier results about local vs. global consistency for relational databases and more recent results about local vs. global consistency for bag databases.
This is joint work with Albert Atserias at UPC Barcelona.
Paolo Papotti
Associate Professor at EURECOM, Campus SophiaTech, Biot, France
Title: SQL and Large Language Models: A Marriage Made in Heaven?
Abstract: With the rise of pre-trained Large Language Models (LLMs), there is now an effective solution to store and use information extracted from massive corpora of documents. However, for data-intensive tasks over structured data, relational DBs and SQL queries are at the core of countless applications. While these two technologies may appear distant, in this talk we will see that they can interact effectively and with promising results. LLMs can help users express SQL queries (Semantic Parsing), but SQL queries can be used to evaluate LLMs (Benchmarking). Their combination can be further advanced, with opportunities to query with a unified SQL interface both LLMs and DBs. We present recent results on these topics and then conclude with an overview of the research challenges in effectively leveraging the combined power of SQL and LLMs.
SEBD 2025 - 33rd Symposium On Advanced Database Systems