kaist-cp/cs500

Anyone help me to understand BFS cost

Closed this issue · 21 comments

image

As I understand that combine is somehow ~ (append & filter) thus W(combine) = W(append) + W(filter).
But I don't understand why at the end, W = O(|V|^2) and S = O(|V|.log(|V|))

I used the table in Chapter 15.
Thanks

Hi!
Actually, I am also really confused about this. I tried to spend some time thinking about this but I feel I don't understand it completely. I will write my attempt:

First, in each call of BFSInner we are exploring one level in a tree or one "level of frontier" in a regular graph. So if we have a tree, BFSInner would be called log |V| times.

As the professor said in class, the main cost of the function is on line 9. Here we have two things:

  • aux = N_{G} (F)
  • F' = aux - X

The graph is represented as List<List> for example:
< , <....>, ... > In other words, it is the adjacency matrix and its size is |V|*|V|.

The cost of aux = N_{G} (F) is |F|*|V| because we need to explore for each vertex in the frontier set its neighbors (we do need to check all of them, that's why *|V| ) because we are using asymptotic notation, I think we can say that the cost of this O(|V|^2) please note F is a subset of V

And here comes the other question I have: if I am right, then the total cost is O(log(|V|) * |V|^2) assuming the input is a tree. If it is a regular graph, I am not sure how many times BFSInner is called. In any case, this is not what is written in the lecture notes, so it is obvious I am missing something.

Any feedback is more than welcome :)

For the work of flatten, each invocation of flatten A causes O(sum(|A_i|)) amount of work. Because each vertex is in frontier set F exactly once, and while flattening, each vertex contributes a neighborhood set of size |V| (set representation of bool sequence), total work caused by flattening is O(|V|^2).
Other operations like is_empty, union, minus has O(|V|) work for each invocation. Because BFSInner can be invoked at most |V| times, total work of these operations is also O(|V|^2).
Span can be computed in the same way. It is dominated by is_empty and flatten.

Hi, thanks for your comment. I will try to give you constructive feedback about your argument while we wait for the professor.

First, I think we need to give at least a high-level explanation of the implementation of N_{G} (F) to state that it uses flatten (I know the lecture notes states that it needs faltten and minus) and its work is O(|V|^2) (at least in the exam)
For me, your proof of the work of line 9 is not enough.

  • "vertex is in frontier set F exactly once" that's right and I hope we don't need to prove this in the exam ;)

  • "flattening it contributes a neighborhood set of size |V|" that's right

  • "total work caused by flattening is O(|V|^2)" how do you infer this? you said before that flattening takes |V| but now it takes |V|^2. Why? The work of flatten is as follows, I don't see any ^2
    image

  • "BFSInner can be invoked at most |V| times" agree with this

  • "total work of these operations is also O(|V|^2)." I don't agree with this. You said that line 9 takes O(|V|^2) and it is executed O(|V|) times, so the total work of the whole algorithm should be O(|V|^3), right?

Thank you for your comment ~ It helped me to rethink this problem and ask myself to be more mathematical rigorous, important for the exam ;) Let me know if I misundertood something of your comment or if you don't agree with my arguments :)

By the total work I meant the sum of works over every BFSInner invocation. For i-th loop, flatten takes O(|F_i|×|V|) work. So, the total work of flatten over every step results O(sum(|F_i|×|V|)) = O(|V|^2). Sorry for the confusing terminology.

Hi again!
I am almost convinced now haha I mean I understand everything you said and makes sense to me but there is one detail I am still missing. For me, the work of line 9 is still not proven because we don't know the implementation of N_{G} (F). For example, why do we need flatten?

N_{G} (F) = U (for all u in F) N_{G} (u) -- by definition
so we need the function union A B provided in the lecture notes.
and actually, please note we have to do |Fi| unions in each iteration [ because of this U (for all u in F) ]

union needs the function combine and I think combine does not need flatten.

Thank you!

reduce operation is used to perform multiple unions efficiently in parallel. As you can find in the lecture note, flatten is a set operation that reduces given set of sets with union.
I think the name flatten is misleading because it has little relationship with flatten in sequence operation. Maybe both are similar in a way that they requires set of sets or sequence of sequences?

ahhhhhhhh I was confused about flatten. For some reason, I kept on thinking it was sequence flatten. Now I think I fully understand the work of the algorithm. Thank you so much!!

so I guess the Span is (|V|.log|V| ) because in line 9. The Span for each invocation is log|V| (because of flatten cost), am I right ?

I think span is dominated by isEmpty, because it has span of O(lg|V|) in each step. Set flatten is O(lg|F_i|), which is OK to say O(lg|V|) but slightly less than that.

Oh. but I think is_empty(F) = -(reduce v false F) (where F is subset of V) so S = O(log |V|). How do you know is_empty dominated ?

There are 4 operations in each BFSInner invocation: isEmpty, union, flatten and minus. Each has O(lg|V|), O(1), O(lg|F_i|), O(1) span. Having largest span, isEmpty surely dominates overall span in asymptotic sense.
isEmpty is invoked in every step, and max. # of steps is |V|. So the span of BFSInner is |V| × O(lg|V|) = O(VlgV)

Thanks.

anyway. I'm still unsure about how union, minus has O(1). Can you explain a little bit ?

For a boolean sequence representation of sets, union and minus can be done by applying boolean operation (e.g. boolean or for union) in element-wise manner. They can be all done in parallel (like sequence operation map) so it has a constant span.

Hi!
I don't understand how you prove those spans. I think the span of line 5 is O(log|Fi|) and the span of line 9 is O(log|N_{G} (Fi)). I would also say that both span can be upper-bounded by O(log|V|)
Thx

@HaritzPuerto yup, but I also want to know more about the union, the minus operation. Do you have a word to share ?

Hmmm union a b = combine "or" a b = tabulate (labmda i. ai bi) min (|a|, |b|) the span of tabulate is just 1 + the max span of f(i). f(i) here is the lambda function so I think the span is just 1
Minus it's the same I think
Please let me know if I am wrong. Thx

isEmpty(Fi) = not(reduce 'bool or' Fi), and Fi is a bool sequence of length |V|, so it has O(lgV) span.
N_G^+(Fi) = flatten(N_G(Fi)) = reduce union N_G(Fi), where N_G(Fi) is a sequence of neighborhoods of vertices in Fi (or, N_G(Fi) = map (lambda v. (Row of v in adjacent matrix)) Fi). N_G(Fi) has |Fi| elements and union has constant span so flatten has O(lg|Fi|) span.

Hmmm union a b = combine "or" a b = tabulate (labmda i. ai bi) min (|a|, |b|) the span of tabulate is just 1 + the max span of f(i). f(i) here is the lambda function so I think the span is just 1
Minus it's the same I think
Please let me know if I am wrong. Thx

Can you explain more clearly about tabulate (lambda i a_i b_i) min (|a|, |b|) ?
I do understand tabulate f n. I also understand |a| |b| is.
But how the function lambda i a_i b_i here work? what for ?
And as the whole, in this case, when a, b here is boolean set. I don't understand how it works ?
Thank you.

Hmmm union a b = combine "or" a b = tabulate (labmda i. ai bi) min (|a|, |b|) the span of tabulate is just 1 + the max span of f(i). f(i) here is the lambda function so I think the span is just 1
Minus it's the same I think
Please let me know if I am wrong. Thx

Can you explain more clearly about tabulate (lambda i a_i b_i) min (|a|, |b|) ?
I do understand tabulate f n. I also understand |a| |b| is.
But how the function lambda i a_i b_i here work? what for ?
And as the whole, in this case, when a, b here is boolean set. I don't understand how it works ?
Thank you.

combine "or" a b = tabulate (labmda i. ai "or" bi) min (|a|, |b|).
I think "or" is omitted.

Ok. I still don't understand. Let do something more clear. Let get an example:
(1) as the lecture note wrote.
A = {v1, v2, v3}
B = {v3, v4, v6}
so min (|a|, |b|) = 3.
lambda i. Ai OR Bi mean : v1 or v3 ??? what ? vi already= True right ?

Or (2)
They assume that there are |V| vertices in the full graph, a set A is a sequences <v0, v1, v2 ... v|V-1|> and v0 could be either 0 or 1.
And the meaning of set A is:
if vi = 1 so set A contain the vertex 'vi'.
if vi = 0 so set A doesn't contain the vertex 'vi'

If set is defined/structured as (2):
so: |A| = |B| ( = |V| )
and now, union work like this:
(let call C is the union of A and B)
For each vertex i:
Ci = Ai or Bi.
and thus, |C| = |A| = |B| (|V|) and the for-loop could run in fully parallel. so span is 1.
E.G: There are 5 nodes.
set A = <1,0,1,1,0> ;
set B = <0,1,1,0,0> ;
==> C = A union B = <1,1,1,1,0>
Am I wrong some thing ?