Thursday, November 19, 2020

Finding cycles in a graph. Johnsons method in Python.

 Whilst investigating a networked cellular automata I thought it would be useful to know what all the loops of connected cells in the network, (or "graph"),were.

A little googling and I found mention of Johnsons algorithm as being the most efficient and  decided to implement it. I searched and found that the most readable description was in Johnsons original paper.

Strongly Connected Components

You first need to split any graph into Strongly Connected Components, (sub-graphs where all nodes are interconnected), then run the algorithm on each SCC in turn. I had already written a Python example on Rosetta Code that used tarjans algorithm to split a graph into SCC's and incorporated that in the code below.

 Johnson's pseudocode

This is on page 3 of his paper and because it was written in the seventies, it is a little dated in its use of labels and in how it scopes variables. My initial translation stuck to the nested gunctions syntax, and used more nonlocal statements for name access.

Testing

In my searches I found a Wolfram Demonstration web page that uses the algorithm to show the cycles found in graphs where the graph generated can be controlled by sliders. I scraped a few graph examples from the page with their results and wrote a short, external script that turned the text cut-n-pasted from that page into Python representations of the adjacency matrix and resultant cycles. I also generated an adjacency list from the adjacency matrix .

Given an adjacency matrix list-of-lists of either zero or one then trhe following code created the adjacency list:

    adj_lst = [[i for i, cell in enumerate(row) if cell] 
               for row in adj_mat]

The examples with the web-scraped cycles is imported in then the following code generates the graph cycles and compares the result with that scraped from the Wolfram site.

Visualisation

 Nothing like diagrams to show the interconnection of nodes with directed edges, so I use graphviz run under the Spyder IDE for output.
For each example the following is shown in the output:

  1. A list of lists where each sublist shows node numbers that traverse a cycle in the graph. (It is assumed that the last node is connected to the first of its cycle).
  2. A graphviz plot of the full graph. (Nodes are constrained to be positioned in numerical order on one line).

The Code

# -*- coding: utf-8 -*-
"""
Created on Sat Nov 14 14:00:33 2020

@author: Paddy3118
"""

from collections import defaultdict
from pprint import pprint as pp
from graphviz import Digraph as GV   # conda install python-graphviz


class Graph_scc:
    "Directed Graph_scc Tarjan's strongly connected components algorithm"

    def __init__(self, name, adj_lst):
        self.name = name
        self.adj_lst = adj_lst
        # map node vertex to direct connections
        self.graph = {n:c for n, c in enumerate(adj_lst)}
        self.tarjan_algo()
        self.scc = sorted(sorted(s) for s in self.scc)

    def _visitor(self, this, low, disc, stack):
        '''
        Recursive function that finds SCC's
        using DFS traversal of vertices.

        Arguments:
            this        --> Vertex to be visited in this call.
            disc{}      --> Discovery order of visited vertices.
            low{}       --> Connected vertex of earliest discovery order
            stack       --> Ancestor node stack during DFS.
        '''

        disc[this] = low[this] = self._order
        self._order += 1
        stack.append(this)

        for neighbr in self.graph[this]:
            if neighbr not in disc:
                # neighbour not visited so do DFS recurrence.
                self._visitor(neighbr, low, disc, stack)
                low[this] = min(low[this], low[neighbr])  # Prior connection?

            elif neighbr in stack:
                # Update low value of this only if neighbr in stack
                low[this] = min(low[this], disc[neighbr])

        if low[this] == disc[this]:
            # Head node found of SCC
            top, new = None, []
            while top != this:
                top = stack.pop()
                new.append(top)
            self.scc.append(new)

    def tarjan_algo(self):
        '''
        Recursive function that finds strongly connected components
        using the Tarjan Algorithm and function _visitor() to visit nodes.
        '''

        self._order = 0         # Visitation order counter
        disc, low = {}, {}
        stack = []

        self.scc = []           # SCC result accumulator
        for vertex in sorted(self.graph):
            if vertex not in disc:
                self._visitor(vertex, low, disc, stack)
        self._disc, self._low = disc, low


    
def simple_cycles(gr, _print=False):

    def circuit(v):
        "https://www.cs.tufts.edu/comp/150GA/homeworks/hw1/Johnson%2075.PDF"
        
        nonlocal all_found, A, a_len, B, blocked, stack, s
        
        def unblock(u,  B, blocked):
            #nonlocal B, blocked
            
            blocked[u] = False
            for w in B[u].copy():
                B[u].discard(w)
                if blocked[w]:
                    unblock(w,  B, blocked)
        
        f = False       # Found
        stack.append(v)
        blocked[v] = True
        #%% L1
        for w in A[v]:
            if w == s:
                found = stack  # + [s]
                if min(found) == found[0]:  # remove rotations
                    all_found.append(found[::])
                    if _print:
                        print(f"found {found}")
                f = True
            elif not blocked[w]:
                if circuit(w):
                    f = True
        #%% L2
        if f:
            unblock(v,  B, blocked)
        else:
            for w in A[v]:
                B[w].add(v)
        stack.remove(v)
        
        return f  
    # end def circuit ...
    
    all_found = []
    A = gr.adj_lst.copy()       # adjacency list
    a_len = len(A)
    stack = []

    for scc in gr.scc:
        for s in scc:           # Start vertex (lowest numbered)
            blocked = [False for _ in range(a_len)]
            B = defaultdict(set)        # Block map 
            circuit(s)
            
    return all_found

#%% Graph with cycles

class Graph():
    def __init__(self, name, adj_lst):
        self.name = name
        self.adj_lst = adj_lst
        gr = Graph_scc(name, adj_lst)
        self.scc = gr.scc       # strongly connected components
        self.cycles = simple_cycles(gr)

    def _XXto_gv(self):
        """
        Output via Graphviz
        displayable in Spyder and Jupyter IDE's.

        Ref: https://graphviz.readthedocs.io/en/stable/manual.html
        """
        g = GV(self.name, filename='_tarjan.gv')
        g.body.extend(["layout=circo"])
        g.graph_attr['label'] = self.name
        # Colour strongly connected components
        groupcount = len(self.scc)
        for i, gnodes in enumerate(self.scc, 1):
            for gn in gnodes:
                g.node(f"N{gn}", color=f"{i/groupcount} 1 1")
        #
        #g.graph_attr['splines'] = 'ortho'
        #g.graph_attr['nodesep'] = '0.8'
        g.node_attr['shape'] = 'circle'
        for n, conns in enumerate(self.adj_lst):
            for c in conns:
                g.edge(f"N{n}", f"N{c}")
            #else:
            #    g.node(f"N{n}")
        return g

    def _to_gv(self):
        """
        Output via Graphviz
        displayable in Spyder and Jupyter IDE's.

        Ref: https://graphviz.readthedocs.io/en/stable/manual.html
        """
        g = GV(self.name, filename='_tarjan.gv')
        g.graph_attr['label'] = self.name
        #g.graph_attr['rankdir'] = 'LR'
        g.graph_attr['ranksep'] = '1.5'
        g.graph_attr['splines'] = 'ortho'
        
        # All nodes in one line using fake edges.
        g.node_attr['shape'] = 'circle'
        node_count = len(self.adj_lst)
        for i in range(node_count - 1):
            g.edge(f"N{i}", f"N{i + 1}", style='invis')
        #
        g.edge_attr['stye'] = 'solid'
        g.edge_attr['constraint'] = 'false'
        
        # Colour strongly connected components
        groupcount = len(self.scc)
        for i, gnodes in enumerate(self.scc, 1):
            for gn in gnodes:
                g.node(f"N{gn}", color=f"{i/groupcount} 1 1",
                       shape='circle', width='0.3', height='0', 
                       fontsize='10', margin='0.02')
        #
        #g.graph_attr['nodesep'] = '0.8'
        #g.node_attr['shape'] = 'circle'
        for n, conns in enumerate(self.adj_lst):
            for c in conns:
                g.edge(f"N{n}", f"N{c}", 
                       style="solid", constraint="false")
            #else:
            #    g.node(f"N{n}")
        return g

        
        

if 0: # __name__ == '__main__':
    
    adj_lst = [
      [2, 3],
      [2, 3],
      [0, 1],
      [1],
     ]
    adj_lst = [
      [],
      [0, 5],
      [1, 7],
      [],
      [],
      [7],
      [],
      [2, 4],
     ]
    
    gr = Graph_scc('Ex', adj_lst)

    pp(gr.adj_lst)
    cycles = simple_cycles(gr)
    print(cycles)
    
if __name__ == '__main__':
    # Examples scraped from https://demonstrations.wolfram.com/EnumeratingCyclesOfADirectedGraph/
    from graph_cycles_test_gen import example
    
    for n, ex in enumerate(example):
        nodes, cycles, adj_mat, adj_lst = ex
        gr = Graph(f"Ex{n}", adj_lst)
        print(gr.name, gr.cycles)
        assert gr.cycles == cycles
        display(gr._to_gv())
        print('\n')

The Output

Ex0 [[0, 3, 2], [2, 4, 3]]


Ex0N0N1N3N2N4N5



Ex1 [[0, 2], [0, 3, 1, 2], [1, 2], [1, 3]]


Ex1N0N1N2N3



Ex2 [[0, 5, 2], [0, 5, 4, 2]]


Ex2N0N1N5N2N6N3N4



Ex3 [[0, 1, 6], [0, 1, 6, 5], [0, 1, 6, 5, 2], [0, 1, 6, 5, 4, 2], [0, 5], [0, 5, 2], [0, 5, 2, 6], [0, 5, 4, 1, 6], [0, 5, 4, 2], [0, 5, 4, 2, 6], [1, 6, 5, 4], [2, 6, 5], [2, 6, 5, 4]]


Ex3N0N1N5N2N3N6N4



Ex4 [[1, 5, 7, 2], [2, 7]]


Ex4N0N1N2N5N3N7N4N6




END.


No comments:

Post a Comment