Deriving Sparsest Approximate Bayesian Belief Networks from Data: NP-hardness and Beyond
- Speaker(s): Dominik Ślęzak
- Affiliation: MIMUW & QED Software
- Date: May 5, 2023, 4:15 p.m.
- Information about the event: 4060 & online: meet.google.com/jbj-tdsr-aop
- Seminar: Seminar Intelligent Systems
An alpha-Approximate Bayesian Belief Network (alpha-BBN) is a directed acyclic graph (DAG) whose entropy, understood as the sum of the entropies of the conditional distributions generated from a joint probability distribution along the DAG's structure, exceeds the entropy of that distribution by at most alpha. This formulation lets us explicitly control the amount of information we agree to lose while operating with graphical models of this kind. If we set alpha = 0, the corresponding DAGs must fully represent the underlying joint distributions in the classical way. If we allow alpha > 0, the corresponding DAGs become potentially sparser (in terms of the number of edges), but with the mathematical guarantee that any information inferred from their structures remains approximately accurate (subject to alpha).

For a fixed alpha, if we treat a relational data set as a generator of a joint probability distribution (whereby data columns correspond to discrete random variables and probabilities are computed directly from the data), we can state the optimization problem of deriving the sparsest possible alpha-BBN for that distribution, i.e., for that data set. It is known that this problem is NP-hard for alpha = 0. In this paper, we prove it for arbitrary non-negative alpha. For any fixed alpha, we show how to reduce the well-known minimum dominating set problem to our sparsest alpha-BBN problem, where any undirected graph serving as input to the former problem is polynomially encoded as a data set serving as input to the latter. Moreover, we show that for a wide range of fixed alphas the sparsest alpha-BBN problem is not only NP-hard but also inapproximable.
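For readers who prefer symbols, here is one way to formalize the definition above; the notation $H_G$ and $\mathrm{Pa}_G$ is introduced here for illustration and does not come from the announcement itself. For a joint distribution $P$ over discrete variables $X_1, \ldots, X_n$ and a DAG $G$ over those variables:

```latex
H_G(P) = \sum_{i=1}^{n} H\bigl(X_i \mid \mathrm{Pa}_G(X_i)\bigr),
\qquad
G \text{ is an } \alpha\text{-BBN for } P
\;\Longleftrightarrow\;
H_G(P) - H(P) \le \alpha .
```

Since conditioning never increases entropy, the chain rule gives $H_G(P) \ge H(P)$ for every DAG $G$, so the left-hand side of the condition is always non-negative; setting $\alpha = 0$ recovers the classical requirement that $G$ factorizes $P$ exactly.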
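As a concrete illustration of the empirical reading of this definition (data columns as discrete random variables, probabilities taken as relative frequencies), here is a minimal Python sketch for checking the alpha-BBN condition on a data set; the function names and the child-to-parents encoding of the DAG are our own choices for this sketch, not constructions from the paper.

```python
import math
from collections import Counter

def entropy(rows, cols):
    """Empirical Shannon entropy (in bits) of the joint distribution
    over the given columns, with probabilities computed as relative
    frequencies of value combinations in the data."""
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    n = len(rows)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())

def dag_entropy(rows, parents):
    """Entropy of the DAG model: the sum over variables X_i of the
    conditional entropies H(X_i | Pa(X_i)), each computed via the
    identity H(X | Y) = H(X, Y) - H(Y)."""
    return sum(entropy(rows, [v] + pa) - entropy(rows, pa)
               for v, pa in parents.items())

def is_alpha_bbn(rows, parents, alpha):
    """True iff the DAG (a child -> list-of-parents map covering all
    columns) is an alpha-BBN for the data's empirical distribution."""
    return dag_entropy(rows, parents) - entropy(rows, list(parents)) <= alpha

# Example: three binary columns where column 2 is the XOR of columns 0 and 1.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(is_alpha_bbn(rows, {0: [], 1: [], 2: [0, 1]}, alpha=0.0))  # True
print(is_alpha_bbn(rows, {0: [], 1: [], 2: []},     alpha=0.0))  # False: loses 1 bit
print(is_alpha_bbn(rows, {0: [], 1: [], 2: []},     alpha=1.0))  # True within alpha
```

The sparsest alpha-BBN problem then asks for a parents map with the fewest total edges that still passes this check; the results announced in the talk say that this search is NP-hard for every fixed non-negative alpha, and inapproximable for a wide range of alphas.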