Faculty of Engineering and Natural Sciences
"Biomolecular Network Motif Countingand Discovery by Color Coding"
Protein protein interaction (PPI) networks of many organisms share global topological features such as degree distribution, k-hop reachability, betweenness and closeness.
Yet some of these networks can differ significantly from the others in terms of local structures: e.g. the number of specific network motifs can vary significantly among PPI networks.
Counting the number of network motifs provides a major challenge to comparing biomolecular networks.
In this paper we show how to apply the ``color coding'' technique to counting non-induced occurrences of subgraph topologies in the form of trees and bounded treewidth subgraphs.
Our algorithm can count all occurrences of motif G' with k vertices in a network $G$ with $n$ vertices in time polynomial with n, provided k=O(log n). We use our algorithm to obtain ``treelet'' distributions for k<=11 of available PPI networks of unicellular organisms S.cerevisiae, E.coli and H.pylori, which are all quite similar, and a multicellular organism C.elegans which is significantly different. Furthermore the treelet distribution of the unicellular organisms are similar to that obtained by the ``duplication model'' but are quite
different from that of the ``preferential attachment model''. The treelet distribution is robust w.r.t. sparsification with bait/edge coverage of %70 but differences can be observed when bait/edge coverage drops to %50.
April 10, 2008, 13:40, FENS G029