Number of nodes reachable from each node in a directed graph

vaibnak7's blog

By vaibnak7, 4 years ago

Given a directed graph, suppose we want to find, for each node, the number of nodes reachable from it. What is the best way to solve this problem?

One obvious way is to run a DFS from every node of the graph and count how many nodes get visited, but the problem with this approach is that it is O(n^2), where n is the number of nodes in the graph (more precisely O(n · (n + m)) with m edges, so O(n^2) only when the graph is sparse).
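
For reference, the brute-force approach described above could look roughly like the sketch below. The function name `countReachableNaive` and the adjacency-list input are assumptions made for illustration, and the sketch counts each vertex as reachable from itself, which the post does not specify.

```cpp
#include <vector>

// Counts, for every vertex, how many vertices are reachable from it
// (the vertex itself included) by running one iterative DFS per vertex.
// adj[v] lists the heads of the edges leaving v.
std::vector<int> countReachableNaive(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<int> result(n, 0);
    // seenStamp[v] == s means v was already discovered in the DFS started at s,
    // so the array never has to be cleared between searches.
    std::vector<int> seenStamp(n, -1);
    std::vector<int> stack;
    for (int s = 0; s < n; ++s) {
        stack.assign(1, s);
        seenStamp[s] = s;
        while (!stack.empty()) {
            const int v = stack.back();
            stack.pop_back();
            ++result[s];  // every vertex is pushed, and therefore popped, once per search
            for (int u : adj[v]) {
                if (seenStamp[u] != s) {
                    seenStamp[u] = s;
                    stack.push_back(u);
                }
            }
        }
    }
    return result;
}
```

Each of the n searches touches every vertex and edge at most once, which is where the n · (n + m) bound comes from.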

Then I thought we could store at each node how many nodes are reachable from it and, when queried, compute the answer from the values of the neighbouring nodes, but this cannot handle overcounting, where the same node is reachable through more than one neighbour, as in the graph below.

So how can this be solved?

#graph, directed, #dfs

Comments (22)

tfg

4 years ago, # |

As far as I know the question of "is it possible to solve that faster than quadratic time" is an open problem.

yh11

11 months ago, # ^ |

If the graph is a DAG, you can just do a topological sort and then dp.
If not, use Kosaraju's algorithm to condense the SCCs, then do topological sort + dp.

This should be linear. Or am I missing something?

vgtcross

11 months ago, # ^ |

How would you solve the problem on a DAG using dp?

daCodda

11 months ago, # ^ |

This is a natural first instinct. However, this ends up overcounting.

tfg

11 months ago, # ^ |

Such dp counts the number of paths starting from some vertex, but sadly there might be more than one path with the same endpoints.
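
A small sketch may make the overcounting concrete. It assumes a hypothetical helper `naiveSumDp` and a DAG whose vertices are numbered so that every edge goes from a smaller index to a larger one, which keeps the example short by avoiding an explicit topological sort.

```cpp
#include <vector>

// The tempting recurrence: dp[v] = 1 + sum of dp[u] over all edges v -> u.
// Assumes every edge goes from a smaller index to a larger one, so that
// decreasing index order is a reverse topological order.
std::vector<long long> naiveSumDp(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<long long> dp(n, 1);  // the 1 stands for the vertex itself
    for (int v = n - 1; v >= 0; --v) {
        for (int u : adj[v]) {
            dp[v] += dp[u];  // adds everything below u, even if already added via another edge
        }
    }
    return dp;
}
```

On the diamond 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3 this gives dp[0] = 5, although only 4 vertices are reachable from vertex 0: vertex 3 is added once through 1 and once through 2. The value 5 is the number of paths starting at 0, including the empty path.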

yh11

11 months ago, # ^ |

I see. Thanks.

BohdanPastuschak

4 years ago, # |

You can optimize the $$$O(n^2)$$$ approach with bitsets. Done straightforwardly, this will probably give MLE (for n = $$$10^5$$$, storing one n-bit bitset per vertex takes about 1.25 GB, while ML = 256/512 MB), but you can divide the vertices into groups and, for each group G, run a separate DFS: for each vertex V, count how many vertices U from G are reachable from V.
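
One way to realise the grouping idea is sketched below, under assumptions not stated in the comment: the graph is a DAG (a general graph would first need its SCCs condensed), a topological order `topo` is supplied by the caller, and each group holds 64 target vertices so that one machine word per vertex is enough. It replaces the per-group DFS with a sweep in reverse topological order. The names `countReachableInGroups` and `popcount64` are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Portable population count (number of set bits).
static int popcount64(std::uint64_t x) {
    int c = 0;
    while (x) { x &= x - 1; ++c; }
    return c;
}

// For each vertex of a DAG, counts the vertices reachable from it (itself included).
// topo must list all vertices so that every edge goes from an earlier to a later entry.
std::vector<int> countReachableInGroups(const std::vector<std::vector<int>>& adj,
                                        const std::vector<int>& topo) {
    const int n = static_cast<int>(adj.size());
    std::vector<int> result(n, 0);
    std::vector<std::uint64_t> mask(n);
    for (int base = 0; base < n; base += 64) {
        // Bit i of mask[v] means: vertex base + i is reachable from v.
        for (int v = 0; v < n; ++v) {
            mask[v] = (v >= base && v < base + 64) ? (std::uint64_t{1} << (v - base)) : 0;
        }
        // Reverse topological order: all out-neighbours of v are final before v.
        for (int i = n - 1; i >= 0; --i) {
            const int v = topo[i];
            for (int u : adj[v]) mask[v] |= mask[u];
            result[v] += popcount64(mask[v]);
        }
    }
    return result;
}
```

Extra memory stays at one word per vertex, while the running time is roughly (n / 64) · (n + m), since every group requires a full pass over the graph.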

just_try_again

4 years ago, # |

SPOJ problem DAGCNT2 is similar to this.

vaibnak7

4 years ago, # ^ |

Can you also explain the correct approach to this problem?

just_try_again

4 years ago, # ^ |

Using bitsets prevents overcounting of nodes; the problem can then be solved with a topological sort.

Here is the code with explanation
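
The linked code is not reproduced here. As a stand-in, the following is a minimal sketch of the bitset + topological sort idea, not the commenter's actual solution. It assumes the input is a DAG with at most `MAXN` vertices; `MAXN` and `countReachableBitset` are names invented for this example, and an empty vector is returned if a cycle is detected.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

constexpr int MAXN = 1000;  // hypothetical compile-time bound on the number of vertices

// Reachable-vertex counts (each vertex counts itself) for a DAG with at most MAXN vertices.
// Returns an empty vector if the graph contains a cycle.
std::vector<int> countReachableBitset(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());

    // Kahn's algorithm; topo doubles as the processing queue.
    std::vector<int> indeg(n, 0);
    for (int v = 0; v < n; ++v)
        for (int u : adj[v]) ++indeg[u];
    std::vector<int> topo;
    topo.reserve(n);
    for (int v = 0; v < n; ++v)
        if (indeg[v] == 0) topo.push_back(v);
    for (std::size_t i = 0; i < topo.size(); ++i)
        for (int u : adj[topo[i]])
            if (--indeg[u] == 0) topo.push_back(u);
    if (static_cast<int>(topo.size()) != n) return {};  // cycle: not a DAG

    // reach[v] is the set of vertices reachable from v. Taking the union (OR)
    // of the neighbours' sets means a shared descendant is only counted once.
    std::vector<std::bitset<MAXN>> reach(n);
    std::vector<int> result(n, 0);
    for (int i = n - 1; i >= 0; --i) {
        const int v = topo[i];
        reach[v].set(v);
        for (int u : adj[v]) reach[v] |= reach[u];
        result[v] = static_cast<int>(reach[v].count());
    }
    return result;
}
```

Because sets are merged with OR rather than sizes being added, the diamond case that breaks the plain dp is handled correctly. The price is about n · MAXN / 8 bytes of memory and on the order of m · MAXN / 64 word operations.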

vaibnak7

4 years ago, # ^ |

Is topological sorting actually needed in this algorithm, or can the reachable set of every node also be maintained with a simple DFS?

horiacool

18 months ago, # |

This can be done by first finding the Strongly Connected Components (SCCs), which takes O(|V|+|E|). Then, build a new graph, G', where each SCC is a node and each node carries a value equal to the number of nodes in that SCC.

Given a graph G(V, E), we build G'(V', E') where:

V' = { U_1, U_2, ..., U_k | U_i is an SCC of the graph G }

E' = { (U, W) | there is node u in U and w in W such that (u, w) is in E }
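
The construction of G' defined above can be sketched with Kosaraju's algorithm. The `Condensation` struct and the `condense` function are hypothetical names for this illustration, and the recursive DFS is used for brevity; very deep graphs might need an iterative version.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

struct Condensation {
    std::vector<int> comp;              // comp[v] = index of the SCC containing v
    std::vector<int> size;              // size[c] = number of vertices in SCC c
    std::vector<std::vector<int>> dag;  // E': edges between distinct SCCs, duplicates removed
};

Condensation condense(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<std::vector<int>> radj(n);  // reversed graph
    for (int v = 0; v < n; ++v)
        for (int u : adj[v]) radj[u].push_back(v);

    // Pass 1: record the vertices in order of DFS finishing time.
    std::vector<char> seen(n, 0);
    std::vector<int> order;
    std::function<void(int)> dfs1 = [&](int v) {
        seen[v] = 1;
        for (int u : adj[v])
            if (!seen[u]) dfs1(u);
        order.push_back(v);
    };
    for (int v = 0; v < n; ++v)
        if (!seen[v]) dfs1(v);

    // Pass 2: DFS on the reversed graph in decreasing finishing time;
    // every search tree found here is exactly one SCC.
    Condensation c;
    c.comp.assign(n, -1);
    std::function<void(int, int)> dfs2 = [&](int v, int id) {
        c.comp[v] = id;
        ++c.size[id];
        for (int u : radj[v])
            if (c.comp[u] == -1) dfs2(u, id);
    };
    for (int i = n - 1; i >= 0; --i) {
        const int v = order[i];
        if (c.comp[v] == -1) {
            c.size.push_back(0);
            dfs2(v, static_cast<int>(c.size.size()) - 1);
        }
    }

    // Build E' by keeping only edges that cross between different SCCs.
    c.dag.resize(c.size.size());
    for (int v = 0; v < n; ++v)
        for (int u : adj[v])
            if (c.comp[v] != c.comp[u]) c.dag[c.comp[v]].push_back(c.comp[u]);
    for (auto& out : c.dag) {
        std::sort(out.begin(), out.end());
        out.erase(std::unique(out.begin(), out.end()), out.end());
    }
    return c;
}
```

The SCC sizes are kept separately in `size` so that whatever is computed on the condensed graph can be weighted by the number of original vertices in each component.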

This graph, G', is a DAG, and the question becomes similar to finding the number of nodes reachable from each node in a DAG, which can be done easily via DFS:

int DFS(node v) {
    vis[v] = true
    reachable[v] = v.scc_size() // nodes reachable from that SCC, including themselves

    for u in v.children() {
        // nodes already visited were added via previously visited nodes
        if (vis[u] == false) {
            reachable[v] += DFS(u)
        }
    }

    return reachable[v]
}

for v in V' {
    if (indegree(v) == 0) {
        DFS(v)
    }
}

For the original nodes of G, the number of reachable nodes then follows directly:

for v in V {
    reachable_G[v] = reachable[containing_scc(v)]
}

Thus the final complexity is linear, O(|V| + |E|).

lrvideckis

18 months ago, # ^ |

It double counts

horiacool

18 months ago, # ^ |

Oh, yeah, sorry about that, it seems it doesn't cover all the cases, my bad :///

lrvideckis

18 months ago, # ^ |

All good. Note that when G is a DAG, this code instead computes reachable[v] as the number of paths starting at node v, which can grow exponentially.

afylers

11 months ago, # |

Basically, for every node it will be n^2, so the total time complexity becomes n^3, right? @author

Abito

11 months ago, # ^ |

No, you are doing a DFS from each node, so it's $$$O(n^2)$$$.

afylers

11 months ago, # ^ |

A DFS from a single node takes n^2 in the worst case, where all nodes are connected to each other. So if we run a DFS from every node, doesn't that make it n * n^2 = n^3?
