Linear-Time Algorithm for Long LCF with k Mismatches

Abstract

In the Longest Common Factor with k Mismatches (LCFk) problem, we are given two strings X and Y of total length n, and we are asked to find a pair of maximal-length factors, one of X and the other of Y, such that their Hamming distance is at most k. Thankachan et al. [Thankachan et al. 2016] show that this problem can be solved in O(nlogkn) time and O(n) space for constant k. We consider the LCFk(l) problem in which we assume that the sought factors have length at least l. We use difference covers to reduce the LCFk(l) problem with l=Ω(log2k+2n) to a task involving m=O(n/logk+1n) synchronized factors. The latter can be solved in O(mlogk+1m) time, which results in a linear-time algorithm for LCFk(l) with l=Ω(log2k+2n). In general, our solution to the LCFk(l) problem for arbitrary l takes O(n+nlogk+1n/l) time.

Publication
CPM pp:23:1-23:16
Date