Faculty of Engineering and Natural Sciences
Sublinear Techniques for Discovering Trends in Long Sequences
Simon Fraser University, Vancouer, Canada
Algorithm design has traditionally involved carrying out a computation in the most efficient manner. The explosive increase in the size of inputs has made it necessary to ask, additionally, how we can carry out the computation given strictly bounded time and space. In this talk, we will discuss techniques for processing long data sequences using sublinear space and/or time, exploring poblems such as discovering whether a sequence is periodic and finding the period, testing whether a sequence has a repeating pattern and identifying such a pattern, and estimating the edit distance between two sequences.
The above problems require at least linear space and quadratic time; in this talk we present sublinear time (and/or space) algorithms to solve them. In particular, we can test for a repeating pattern and low edit distance in sublinear time, and check periodicity by looking at a sublinear number of characters from the input. For these tasks, since we do not have time to read the whole input, we have to resort to intelligent sampling, and even re-use the samples and sacrifice independence, to stay within stringent limits. This inevitably introduces some imprecision to the computation, and we explore the tradeoff between the use of resources and the precision of our computation. Since our sampling is oblivious, our techniques can also be viewed in the streaming framework.
Funda Ergun received her BS from Bilkent University, her MS from the Ohio State University and her Ph.D. from Cornell University, all in computer science. She is currently a faculty member at Simon Fraser University in Vancouver, Canada.
May 17, 2006, 13:40, FENS G032