Faculty of Engineering and Natural Sciences
Periodicity in Data Streams
Simon Fraser University, Canada
In this work we study sublinear space algorithms for detecting periodicity over data streams. A sequence of length n is said to be periodic if it consists of repetitions of a block of length p for some p <= n/2. In the first part of this paper, we give a 1-pass randomized streaming algorithm that uses polylogarithmic space and reports the shortest period if the given stream is periodic. At the heart of this result is a 1-pass O(log n log m) space streaming pattern matching algorithm. This algorithm does not need an offline pre-processing stage. In the second part, we study distance to p-periodicity under the Hamming metric, where we estimate the minimum number of character substitutions needed to make a given sequence p-periodic. In streaming terminology, this problem can be described as computing the cascaded aggregate L1 . F res(1) 1 over a matrix A given in column ordering. For this problem, we present a randomized streaming algorithm with approximation factor 2 + eps that takes O(eps-2) space. We also show a 1+eps randomized streaming algorithm which uses O(eps-5.5p1/2) space. (joint work with Hossein Jowhari and Mert Saglam)
Short Bio: Funda Ergun got her BS from Bilkent University and PhD from Cornell University. After that she worked at Bell Labs and Case Western University until her move to Simon Fraser University, where she is an associate professor. Her areas of interest are randomized algorithms, algorithms on massive data sets, and network algorithms.
July 19, 2011, 15:40, FENS 2019