beaunus/stanford-algs

Course 2, Week 1 (SCC) Test Cases are wrong

Closed this issue · 5 comments

The following files have wrong solutions.
input_mostlyCycles_14_64.txt
input_mostlyCycles_6_16.txt
input_mostlyCycles_40_3200.txt
input_mostlyCycles_3_8.txt
input_mostlyCycles_9_32.txt
input_mostlyCycles_11_32.txt
input_mostlyCycles_10_32.txt
input_mostlyCycles_22_200.txt
nput_mostlyCycles_5_16.txt
input_mostlyCycles_8_16.txt
input_mostlyCycles_4_8.txt

In some of them (though not all), I've identified the reason to be that your algorithm fails to identify unconnected nodes as SCCs, which they are by definition. See https://math.stackexchange.com/questions/148305/is-a-single-node-graph-a-strongly-connected-component.

For example, in 4_8.txt, node number 4 is isolated. Even though it does not appear in the list explicitly, the SCC algorithm assumes that all nodes 1 to n exist without gaps, which means that if one node does not appear in the list, it does not mean it doesn't exist, it only means it has no outgoing edges. 6_16.txt has the same problem, node 12 is isolated. These two examples were small enough for me to draw by hand.

An isolated node is a SCC by definition, and should appear as a '1' in the SCC output array. If a graph has more than 5 SCCs with 2 or more vertices, you would of course not see this in a top5. But this reveals a bug in your underlying SCC algorithm.

The others have a wrong answer, but I don't know what the reason for the error is.

@chrism216 Thank you for pointing this out. I will look into it this coming weekend.

@beaunus No problem!

I believe I may know the cause for the other problems. When I was first trying this assignment, my submission failed, but most of your tests were passing. But when I fixed my code and passed the assignment, suddenly your tests were failing. Maybe you did the exact same mistake I did:

  • In the SCC subroutine, when I iterated over the graph, I made the mistake of starting the iteration from the node with label len(graph), thinking that should be the value of the largest label.

  • However, since there are "implicit" nodes (as I pointed out in my first comment), the graph is actually larger than len(graph), because some nodes don't appear in the list. That means I was not actually starting the iteration from the node with the largest label. Since the ordering is crucial in the second pass (when you're iterating on nodes based on the finishing times of the first pass), I was getting wrong results.

When I changed my code to start the iteration from max(graph.keys()) instead (i.e. from the node with the largest label), I passed the assignment. Since I was passing your tests previous to fixing my bug but not after, you probably did the same mistake.

Here's a link to my repo (though implemented in Python): https://github.com/chrism216/Coursera_Stanford_Algorithms/blob/master/Course%202/Week%201/SCC.py

@chrism216
Thanks again for pointing this out. I have updated this batch in PR #50. Please let me know if it works as expected and I will merge to master.

Issue with isolated nodes is good now, according to my tests. The other ones aren't, though.