From Joins to Aggregates and Optimization Problems
- Speaker(s)
- Dan Olteanu
- Affiliation
- University of Oxford
- Date
- Nov. 22, 2018, 2:15 p.m.
- Room
- room 5440
- Seminar
- PhD Open
The course introduces recent development on solving a host of computational problems in the database. The unifying theme underlying this development is the use of the structure of the problem to avoid redundancy in data representation and computation.
The first part overviews recent work on worst-case optimal join algorithms for two representation formalisms for relational data, the standard listing representation and the factorized representation.
The second part goes beyond join queries and looks at functional aggregate queries, which are ubiquitous across Computer Science and can express problems in logic, probabilistic graphical models, CSP, linear algebra, and databases. This part also sketches on-going development on the computation of queries under data updates, with a particular focus on counting triangles under updates in worst-case optimal time.
The third part shows how the development introduced in the first two parts can be applied to the in-database computation of optimization problems used in machine learning, such as learning regression and classification models. Time permitting, it also highlights on-going development on in-database computation for linear algebra such as the Gram-Schmidt QR decomposition and the low-rank (PCA) factorization of matrices defined by database joins.
Each of the three parts concludes with incomplete lists of exciting open research problems and of highly relevant work that builds on the concepts presented in this course, and a quiz.
While originally theoretical in nature, the ideas presented in this course have led to practical and very competitive tools reported by both academia and industry, such as the LeapFrog TrieJoin algorithm used by the LogicBlox commercial database system, the FDB algorithm for factorized query computation, and the AC/DC system for learning models over relational data.
See http://phdopen.mimuw.edu.pl/index.php?page=z18w4 for more information.