Till innehåll på sidan

Ciaran McCreesh: Finding little graphs inside big graphs (in parallel)

Tid: Fr 2017-03-10 kl 12.00

Plats: Room 4523, Lindstedtsvägen 5, KTH CSC

Medverkande: Ciaran McCreesh, University of Glasgow

Exportera till kalender

PRACTICALITIES

Lunch is served at 12:00 noon (register at this doodle   at the latest by Wednesday March 8, but preferably earlier). The presentation starts at 12:10 pm and ends at 1 pm. Those of us who wish reconvene after a short break for 1½-2 hours of more technical discussions.

 

ABSTRACT

I'll give a high level overview of the best practical algorithms for three families of subgraph problems: maximum clique (finding the largest group of people, all of whom are friends with everyone else in the group), subgraph isomorphism (finding a pattern graph inside a target graph), and maximum common subgraph (finding what two graphs have in common). These problems are all NP-hard, but nonetheless we can solve some instances involving graphs with tens of thousands of vertices, and millions of edges.  I'll discuss the kinds of graphs that make subgraph problems easy to solve in practice, and then explain how to generate ``really hard'' instances (and why this matters).  Finally, I'll present an overview of my research on parallelising these algorithms to exploit multicore hardware: with the right work-splitting mechanisms, we can get strong, consistent, reproducible speedups up to at least the 64 core mark. This may come as a surprise to those familiar with parallel SAT, CP or MIP solvers, so I'll explain what makes these subgraph algorithms different.

After the break, we will have an in-depth look at a state-of-the-art maximum clique algorithm due to Tomita et al. This algorithm is easy to understand from a correctness perspective, but it features several design choices which are hard to justify theoretically. We'll discuss three questions, and I'll show some empirical work which suggests that they share a common answer:

  •  How exactly does this algorithm behave on random graphs, and why?
  • Why is it so much better for the main loop in the algorithm to iterate backwards rather than forwards?
  • Why does introducing parallelism via randomised work-stealing behave so erratically, and how can we do better?