mhahsler/stream

Prevention of collapsing clusters (DBStream algorithm)

Dennis1989 opened this issue · 1 comments

In your implementation, it seems that micro-cluster centers are only moved if none of the neighbouring micro-cluster centers would end up closer than radius R:

if (check_dist(new_centers)) {
  for (std::size_t i = 0; i < new_centers.size(); i++)
    mcs[inside[i]].center = new_centers[i];

  if (debug)
    Rcpp::Rcout << "\tNO COLLISIONS - centers moved!" << std::endl;
}

However, this implies that if even one micro-cluster center violates this restriction, none of the others are moved either, although they could be, because they do not violate the radius restriction themselves.

As far as I can tell, this implementation differs from the original pseudocode.

Kind regards,

Dennis

I think you are correct. The code checks whether the update for a new point would make the centers collapse and, if so, blocks the move. If only two MCs are involved, the pseudocode in the paper always produces the same result as the code. However, if more MCs are involved, the pseudocode might still move one MC, while the code always blocks all of the involved movements. I need to run experiments to see how often that happens, but I think it will probably not have a noticeable impact. Sorry about this slight inconsistency.
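To make the difference concrete, here is a minimal sketch of the two strategies. It is not the package's code and it is not claimed to reproduce the paper's pseudocode exactly: the `Point` type, the helper names (`dist`, `collides`, `move_all_or_nothing`, `move_per_center`) and the collision rule (two proposed centers closer than `r`) are all assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2-D point type, used only for this illustration.
struct Point { double x, y; };

double dist(const Point& a, const Point& b) {
  return std::hypot(a.x - b.x, a.y - b.y);
}

// Assumed rule: center i collides if any other center in the same
// vector lies closer than r to it.
bool collides(const std::vector<Point>& centers, std::size_t i, double r) {
  for (std::size_t j = 0; j < centers.size(); j++)
    if (j != i && dist(centers[i], centers[j]) < r) return true;
  return false;
}

// All-or-nothing: a single collision among the proposed centers
// blocks every move.
std::vector<Point> move_all_or_nothing(const std::vector<Point>& current,
                                       const std::vector<Point>& proposed,
                                       double r) {
  for (std::size_t i = 0; i < proposed.size(); i++)
    if (collides(proposed, i, r)) return current;
  return proposed;
}

// Per-center: only the centers whose proposed position collides with
// another proposed position stay where they are; the rest move.
std::vector<Point> move_per_center(const std::vector<Point>& current,
                                   const std::vector<Point>& proposed,
                                   double r) {
  std::vector<Point> result = current;
  for (std::size_t i = 0; i < proposed.size(); i++)
    if (!collides(proposed, i, r)) result[i] = proposed[i];
  return result;
}
```

For example, with `r = 1`, current centers at x = 0, 2 and 10 (all with y = 0) and proposed centers at x = 0.8, 1.2 and 10.5, the first two proposals collide with each other. The all-or-nothing variant leaves all three centers in place, while the per-center variant keeps the first two in place and still moves the third to x = 10.5. With only two centers involved, both variants give the same result, since a collision always affects both. Note that this per-center sketch compares proposed positions only, so it does not check a moved center against the old position of a center that stayed put.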