New algorithms on compressed texts

Yuri Lifshits

Petersburg Department of Steklov Institute of Mathematics

Monday, 20 March 2006, 14:00 (note the unusual weekday)
Cybernetica Bldg (Akadeemia tee 21), room B101

Slides from the talk [pdf]

Abstract: There is a new topic in algorithm design: processing compressed objects without unpacking. For a number of problems, we now have a more efficient algorithms than just "unpack-and-solve". In the talk, I will survey the field and present a series of new algorithms.

The central result is an algorithm for fully compressed pattern matching. For given compressed strings P and T (Lempel-Ziv compression), we determine whether P is a substring in T. The algorithm works in cubic time from the COMPRESSED size of input. The previously known algorithm (1997) works in n^4. Then algorithms for finding periods, subsequences and covers will be presented. However, computing Hamming distance and longest common subsequence is NP-hard for compressed strings.

The talk will be concluded by a series of open problems in complexity theory and algorithms.

Tarmo Uustalu
Last update 27.2.2006