## Abstract

Finding the origin of short phrases propagating through the web has been formalized by Leskovec et al. (2009) as DAG PARTITIONING: given an arc-weighted directed acyclic graph on n vertices and m arcs, delete arcs with total weight at most k such that each resulting weakly-connected component contains exactly one sink (a vertex without outgoing arcs). DAG PARTITIONING is NP-hard. We show an algorithm to solve DAG PARTITIONING in O(2^{k}⋅(n+m)) time, that is, in linear time for fixed k. We complement it with linear-time executable data reduction rules. Our experiments show that, in combination, they can optimally solve DAG PARTITIONING on simulated citation networks within five minutes for k≤190 and m being 10^{7} and larger. We use our obtained optimal solutions to evaluate the solution quality of Leskovec et al.’s heuristic. We show that Leskovec et al.’s heuristic works optimally on trees and generalize this result by showing that DAG PARTITIONING is solvable in 2^{O(t^{2})}⋅n time if a width-t tree decomposition of the input graph is given. Thus, we improve an algorithm and answer an open question of Alamdari and Mehrabian (2012). We complement our algorithms by lower bounds on the running time of exact algorithms and on the effectivity of data reduction.
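To make the problem definition concrete, the following is a minimal Python sketch of a feasibility check for a candidate arc deletion, together with an exponential brute-force search for the smallest feasible weight on tiny instances. It is not the O(2^{k}⋅(n+m)) algorithm from the paper, and the names `is_feasible` and `brute_force_min_weight` as well as the arc-list representation are assumptions made only for illustration.

```python
from itertools import combinations


def is_feasible(n, arcs, deleted, k):
    """Check a candidate solution of DAG PARTITIONING.

    n: number of vertices, labelled 0..n-1 (illustrative convention).
    arcs: list of (tail, head, weight) triples of a DAG.
    deleted: set of indices into `arcs` that are removed.
    k: budget on the total weight of deleted arcs.
    """
    if sum(arcs[i][2] for i in deleted) > k:
        return False

    # Union-find over vertices to obtain weakly connected components.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    has_out = [False] * n
    for i, (u, v, _w) in enumerate(arcs):
        if i in deleted:
            continue
        has_out[u] = True
        parent[find(u)] = find(v)

    # Count sinks (vertices without remaining outgoing arcs) per component.
    sinks = {}
    for v in range(n):
        root = find(v)
        sinks[root] = sinks.get(root, 0) + (0 if has_out[v] else 1)
    return all(count == 1 for count in sinks.values())


def brute_force_min_weight(n, arcs):
    """Try every arc subset; only sensible for very small inputs."""
    best = None
    for r in range(len(arcs) + 1):
        for subset in combinations(range(len(arcs)), r):
            weight = sum(arcs[i][2] for i in subset)
            if best is not None and weight >= best:
                continue
            if is_feasible(n, arcs, set(subset), weight):
                best = weight
    return best


# Hypothetical toy instance: vertices 1 and 2 are sinks, vertex 0 reaches both.
arcs = [(0, 1, 1), (0, 2, 3), (3, 0, 2)]
min_weight = brute_force_min_weight(4, arcs)
```

On this toy instance, deleting the arc from vertex 0 to vertex 1 leaves the components {1} and {0, 2, 3}, each with exactly one sink. Deleting every arc is always feasible, since each isolated vertex is then its own sink, so the brute-force search always returns a value.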

| Original language | English |
|---|---|
| Pages (from-to) | 134-160 |
| Number of pages | 27 |
| Journal | Discrete Applied Mathematics |
| Volume | 220 |
| DOIs | |
| Publication status | Published - 31 Mar 2017 |

## Keywords

- Algorithm engineering
- Evaluating heuristics
- Graph algorithms
- Linear-time algorithms
- Multiway cut
- NP-hard problem
- Polynomial-time data reduction
- Kernelization
- Multivariate algorithmics
- Complexity