Mikołaj Bojańczyk

Projects for polyregular functions


Here is a list of open problems and research projects about polyregular functions. Some are good for Masters’ theses, some for PhDs, and all are good for research. If you are interested about any of them, do not hesitate to contact me. I also give some estimated difficulties, but naturally these estimates could be wrong, since the problems have not been solved. Unknown difficulty does not mean that the problem is super hard, it just means that I have not been thinking about it enough to estimate the difficulty.

 


Variants of polyregular functions

 

Minimal primes

Difficulty: a safe Masters’ thesis topic, because there is always something that can be proved

Consider the linear list functions, or the polynomial list functions. Analyse which of the prime functions are really necessary. For example, what happens if we remove reverse from the prime functions?

 

For hierarchy

Difficulty: unknown

Give an algorithm that decides if a polyregular function can be implemented using a for transducer with at most nested loops. (No idea how to do this.)

 

Uniqueness of the model

Difficulty: unknown

Find some elegant axioms such that the polyregular functions are the unique class of functions that satisfy them. For example, continuity, closure under composition, and polynomial outputs. There have to be more axioms, since there are non-computable functions that are continuous and have polynomial size outputs.

 

Infinite words

Difficulty: unknown

Generalize the theory of polyregular functions to ω-words, or other kinds of infinite words, such as countable linear orders.

 


Algorithms

 

Random access in logarithmic time

Difficulty: should be a non-trivial, but doable Masters’ thesis topic

Fix a polyregular function f : \Sigma^* \to \Gamma^*. Find an algorithm which does the following for every input string w \in \Sigma^* of length n:

Linear time preprocessing. It prepares a data structure in linear time O(n).

Logarithmic time access. Using the data structure, it answers the following queries in logarithmic time O(\log n): what is the letter in the i-th position of the output string?

It would also be ok to have time that is polylogarithmic, e.g. \log^2(n).

 

Equal outputs

Difficulty: should be a non-trivial, but doable Masters’ thesis topic

Fix a polyregular function. Find an algorithm which inputs two input strings, and decides in linear time if the corresponding outputs are equal.

 

Polyregular compression

Difficulty: should be a non-trivial, but doable Masters’ thesis topic

A straight-line program is a sequence of instructions, where each new instruction creates a string that is either a single letter from the alphabet, or obtained by concatenating two strings that were created in previous instructions. The last line of a program represents some string, which can be exponential in the length of the program. Such programs are used to compress strings, e.g. in Lempel-Ziv compression. A compressed string can still be operated on, e.g. we can compute the letters at a given position, or search it for substrings etc. One idea would be to extend the syntax of compression with polyregular functions, i.e. a line of the program could be X := f(Y) where f is some polyregular function. The goal of this project is to show that under this more general compression, one can still implement operations.

 

Solve the equivalence problem

Difficulty: I’ve been working on this unsuccessfully for many years

Give an algorithm that decides if two polyregular functions are equivalent, i.e. have the same outputs.

 

Solve the equivalence problem, equationally

Difficulty: the version for linear regular would be good PhD topic, with some Masters’ warmup.

This problem makes sense already for transducer classes where the equivalence problem is decidable, e.g. Mealy machines or linear regular functions. The goal is to write a set of equations, possibly using the syntax of combinators and prime functions, such that if two transducers are equivalent, then the code of one can be transformed into the code of the other using the equations. An example of such an equation is that reversing twice is the identity.

 


Beyond strings

 

Height reduction

Difficulty: unknown

Prove or disprove: there is an MSO interpretation that inputs trees of arbitrary height, and outputs trees of height at constant height (say 3), and which is injective: non-isomorphic inputs yield non isomorphic outputs. The trees have unordered siblings. (When the trees have ordered siblings, then this can be done with output trees of height 3.)

 

Extend polyregular functions to trees and graphs

Difficulty: a good PhD topic, with some Masters’ warmup.

Extend the theory of polyregular functions to trees and graphs (possibly, only tree-like graphs).  This is a long theory-building project.

 

Deterministically definable tree decompositions

Difficulty: unknown

It is known that tree decompositions can be computed by nondeterministic MSO transductions (i.e transductions that are linear, but can guess a colouring of the input graph). Maybe at the cost of extending to interpretations that are polynomial (i.e. have dimension > 1), we can make them deterministic?