The seminar is devoted to the theory and practice of data management and knowledge representation. We are interested in challenges related to the processing of data, queries, and metadata (schemas, constraints, dependencies, ontologies), ranging from designing and analyzing abstract formalisms all the way to database systems architecture and distributed processing of big data. We like our data in all flavors: not only relational, but also semistructured (XML, JSON), graph (RDF, LPG), object, text, temporal, stream, GIS, and others.

The problems tackled can be theoretical, requiring tools from algorithmics, combinatorics, logic (e.g. finite model theory), and automata theory, as well as very practical, in the spirit of systems and software engineering. MSc theses written within our seminar may study decidability and complexity of abstract problems, design algorithms and heuristics, implement and experiment with existing theoretical solutions, or analyze, compare and extend existing systems.

We meet and hold discussions with experts in other disciplines, who sometimes supply ideas for MSc theses. We have cooperated or are currently cooperating with astronomers, chemists, and geographers. We are also open to other areas where databases can be applied.

Seminar presentations are usually based on recent papers presented at leading international conferences devoted to data management and knowledge representation, such as VLDB, PODS, SIGMOD, or KR.

Selected topics:

Data models, semantics, query languages
Data provenance
Databases for emerging hardware
Distributed and parallel databases
Graph data management, RDF, social networks, Semantic Web
Knowledge discovery, clustering, data mining
Machine learning for data management and vice versa
Model theory, logics, algebras, computational complexity
Ontology-based data access, data integration and exchange, metadata management
Ontology formalisms and models, description logics
Privacy, security, ethics
Query processing and optimization
Scientific databases
Semi-structured data
Small data, end-user programming
Storage, indexing, and physical database design
Streams, sensor networks, complex event processing
Transaction processing
Uncertainty, incompleteness, and inconsistency in data management

Organizers

dr hab. Filip Murlak, prof. ucz.
dr hab. Jacek Sroka
prof. dr hab. Krzysztof Stencel
prof. dr hab. Jerzy Tyszkiewicz

Information

Tuesdays, 10:15 a.m., room 4060

Home page

https://sites.google.com/view/sembdmimuw?pli=1&authuser=1

List of talks

April 29, 2025, 10:15 a.m.
Marta Jadwiga Burzańska (UMK)
Heuristic algorithm for periodic patterns discovery in a database workload reconstruction
Information about the existence of periodic patterns in a database workload can play a big part in the process of database tuning. However, full analysis of audit trails can be cumbersome and time-consuming. This talk …
April 15, 2025, 10:15 a.m.
Jakub Kłos (MIMUW)
Differentially Private Data Release over Multiple Tables
April 8, 2025, 10:15 a.m.
Katarzyna Mielnik (MIMUW)
Efficiently Processing Joins and Grouped Aggregations on GPUs
March 25, 2025, 10:15 a.m.
Marcin Mordecki (MIMUW)
Analysis of the impact of using SIMD instructions on processing performance
In the first semester we looked at the SIMD architecture and at the potential gains and pitfalls of vectorizing code with AVX instructions. Continuing this topic, this time we will examine what exactly is gained by using the newest set of these …
March 18, 2025, 10:15 a.m.
Michał J. Gajda (Well. co)
Turning tables into incremental event streams and vice versa
For performance reasons, large databases are often maintained by splitting them into event streams and "materialized" tables or views. Depending on the application, we would like to process either an incremental event stream or a table …
March 4, 2025, 10:15 a.m.
Krzysztof Żyndul (MIMUW)
ALEX: An Updatable Adaptive Learned Index
Feb. 25, 2025, 10:15 a.m.
Alexandra Rogova (MIMUW)
Dangers of List Processing in Querying Property Graphs
The focus of graph databases is graph-like data, i.e. data that represents heavily-linked information where the topology is an important aspect. The workhorse of graph query languages is pattern matching. The result of pattern matching …
Jan. 21, 2025, 10:15 a.m.
Damian Werpachowski (MIMUW)
Implementation of UDP network stack for Java using ef_vi
Jan. 14, 2025, 10:15 a.m.
Michał Molas (MIMUW)
LadderFilter: Filtering Infrequent Items with Small Memory and Time Overhead
Jan. 7, 2025, 10:15 a.m.
Katarzyna Mielnik (MIMUW)
Lemo: A Cache-Enhanced Learned Optimizer for Concurrent Queries
Executing many queries in a short time has broad practical applications. To achieve high performance, however, it is crucial to minimize repeated computation and to devise an efficient execution plan for concurrent queries. The Lemo method uses a trained network that …
Dec. 17, 2024, 10:15 a.m.
Zuzanna Surowiec (MIMUW)
Low-Latency Adaptive Distributed Stream Join System Based on a Flexible Join Model
In my talk I will introduce stream processing systems and present the problem of stream joins on arbitrary predicates. I will discuss existing stream join models, using MatrixModel and BicliqueModel as examples, and consider their advantages and disadvantages. During the talk, I will focus on …
Dec. 10, 2024, 10:15 a.m.
Agata Bielenica (MIMUW)
Computing the Shapley Value of Facts in Query Answering
In this talk I will address the problem of explaining why a given database query yields a particular result. The game-theoretic notion of the Shapley value will serve this purpose. Intuitively, the Shapley value for a given database fact, query, and tuple represents how much …
Dec. 3, 2024, 10:15 a.m.
Jakub Kłos (MIMUW)
Fast continuous subgraph matching over streaming graphs via backtracking reduction
Nov. 26, 2024, 10:15 a.m.
Michał Garbacz (MIMUW)
Continual release of differentially private synthetic data
Nov. 12, 2024, 10:15 a.m.
Maciej Herdon (MIMUW)
Supporting Descendants in SIMD-Accelerated JSONPath