elbamos/largeVis

largeVis not on CRAN any more

gdkrmr opened this issue · 23 comments

Yes, very unfortunate!

Guys, you can install it pretty easily directly from GitHub. That's actually better, because it gets compiled and optimized for your system.

The reason it's not on CRAN any longer is that the CRAN build and test system includes an old, commercial C++ compiler for the Solaris platform that has bugs related to C++11 and OpenMP. I can't debug the compilation problem without that compiler, which is not freely available.

Have you explained the problem to the CRAN maintainers? I don't think they will take down your package if you can't do anything about it.

Is it really a big deal whether it’s on CRAN at this point?

It's up to you if you want to deal with this but being on CRAN will give the package greater visibility. If I am looking for a package to do a certain task, the first thing I usually do is a text search on the CRAN page that lists all the packages.

The issue is rather dependencies, as I can't put any packages on Bioconductor or CRAN that depend on the GitHub version of largeVis. For example, I was working for a while on a package for custom distance metrics for largeVis (I have given up; it's probably impossible because of Rcpp linking issues across packages, headers, etc.). It would possibly be feasible as a pull request in largeVis, but if the package can only be installed from GitHub, it will be even less likely that I muster the energy to try this again.

I also wanted to include this as a method in https://github.com/gdkrmr/dimRed.

This is unfortunate, precisely because largeVis could easily be added as a dependency to other CRAN or Bioconductor packages if it was on CRAN.

Perhaps there's a resolution with the CRAN maintainers?

It feels like a shame the package is no longer available due to a rather small issue.

BTW, I'm happy to help out on this front, if it's useful.

If anyone has a Solaris machine, that would be helpful. The issue that arose was a problem during compilation with the commercial Solaris compiler. The package makes extensive use of C++11 and OpenMP, so I've always thought the problem was a bug in that compiler, which I couldn't diagnose because it was commercial.

Anyway, what I need to do at this point is to check how the submission requirements have changed, try to compile against them and see if we still have the issue.
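For readers wondering how a package can stay buildable on a toolchain with weak or missing OpenMP support, one common approach is to guard every OpenMP construct behind the standard `_OPENMP` macro so the code falls back to a serial loop. The block below is only a minimal sketch of that pattern; `availableThreads` and `sumOfSquares` are hypothetical names for illustration and are not taken from the largeVis sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>  // only included when the compiler enables OpenMP
#endif

// Hypothetical helper: how many threads this build can use.
// Returns 1 when the compiler was run without OpenMP support.
int availableThreads() {
#ifdef _OPENMP
  return omp_get_max_threads();
#else
  return 1;
#endif
}

// Hypothetical example kernel: the pragma is compiled only when
// OpenMP is available, otherwise the loop runs serially.
double sumOfSquares(const std::vector<double>& xs) {
  double total = 0.0;
  const long n = static_cast<long>(xs.size());
#ifdef _OPENMP
#pragma omp parallel for reduction(+ : total)
#endif
  for (long i = 0; i < n; ++i) {
    total += xs[i] * xs[i];
  }
  return total;
}
```

Compiled with `-fopenmp` the loop is parallel; compiled without it, the same source still builds and gives the same result, which is the property a restrictive check platform needs.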

> If anyone has a Solaris machine, that would be helpful.

I was thinking Docker images may help us here, but I'm not convinced that's available.

I'm surprised they don't use a different compiler, like a newer g++ or clang.
Update: https://blog.r-hub.io/2020/05/14/checking-your-r-package-on-solaris/#crans-solaris-builder

CRAN uses two sets of compilers for their R package checks. Oracle Developer Studio (ODS)
is a commercial product that supports Oracle Solaris 10. In addition, OpenCSW
packages GCC, version 5.5.0 currently.

CRAN compiles R packages with ODS by default. If ODS is not able to compile a package,
they use GCC instead. GCC should be able to compile most (all?) CRAN R packages. 
Notably, Rcpp and all packages linking to it are compiled with GCC.

Ok I'm getting into this now. If you're still interested in helping, let me know. It looks like the principal thing I need to do is learn what improvements have been made in the R CI testing world (e.g., rhub) since the last time I played with this, and to disentangle any remaining relationship with the dbscan package. I've made a backoncran branch to facilitate the work, which is currently empty.

@elbamos I'm interested, especially since I'd like to see what the cause of the issue was. I'm dealing with this on other fronts.

@evanbiederstedt If you want to look at the Travis configuration, that would be helpful. I can take care of the Solaris issue. (If you have a Windows machine, I've never tried to get it to compile on Windows, because of the C++11 and OpenMP dependencies. Taking a look at that would also be helpful.) Thanks for nudging me to get back to this.

SamGG commented

Hi,
Windows 10 x64, R 4.0.0

√  checking for file 'C:\Users\samgg\AppData\Local\Temp\RtmpUT09Qs\remotes685c1eb82df6\elbamos-largeVis-e51871e/DESCRIPTION'
-  preparing 'largeVis': (557ms)
√  checking DESCRIPTION meta-information ...
-  cleaning src
-  checking for LF line-endings in source and make files and shell scripts (343ms)
-  checking for empty or unneeded directories
-  building 'largeVis_0.2.2.tar.gz'
   
* installing *source* package 'largeVis' ...
** using staged installation
** libs
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c RcppExports.cpp -o RcppExports.o
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c checkfunctions.cpp -o checkfunctions.o
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c dbscan.cpp -o dbscan.o
dbscan.cpp: In member function 'std::__cxx11::list<long long int> DBSCAN::regionQuery(long long int&) const':
dbscan.cpp:37:13: warning: ignoring return value of 'arma::SpMat<eT>::const_iterator arma::SpMat<eT>::const_iterator::operator++(int) [with eT = double]', declared with attribute warn_unused_result [-Wunused-result]
           it++) {
             ^~
In file included from C:/opt/R/R-4.0.0/library/RcppArmadillo/include/armadillo:638,
                 from C:/opt/R/R-4.0.0/library/RcppArmadillo/include/RcppArmadilloForward.h:49,
                 from C:/opt/R/R-4.0.0/library/RcppArmadillo/include/RcppArmadillo.h:31,
                 from dbscan.cpp:1:
C:/opt/R/R-4.0.0/library/RcppArmadillo/include/armadillo_bits/SpMat_iterators_meat.hpp:176:1: note: declared here
 SpMat<eT>::const_iterator::operator++(int)
 ^~~~~~~~~
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c denseneighbors.cpp -o denseneighbors.o
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c distance.cpp -o distance.o
distance.cpp: In function 'arma::vec fastSparseDistance(const ivec&, const ivec&, const sp_mat&, const string&, bool)':
distance.cpp:81:9: warning: 'distanceFunction' may be used uninitialized in this function [-Wmaybe-uninitialized]
 #pragma omp parallel for shared (xs)
         ^~~
distance.cpp: In function 'arma::vec fastDistance(Rcpp::IntegerVector, Rcpp::IntegerVector, const mat&, const string&, Rcpp::Nullable<Rcpp::Vector<14, Rcpp::PreserveStorage> >, bool)':
distance.cpp:60:9: warning: 'distanceFunction' may be used uninitialized in this function [-Wmaybe-uninitialized]
 #pragma omp parallel for shared (xs)
         ^~~
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c edgeweights.cpp -o edgeweights.o
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c gradients.cpp -o gradients.o
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c hdbscan.cpp -o hdbscan.o
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c hdbscanobj.cpp -o hdbscanobj.o
hdbscanobj.cpp: In member function 'void HDBSCAN::buildHierarchy(const std::vector<std::pair<double, long long unsigned int> >&, const unsigned int&, const uword*)':
hdbscanobj.cpp:104:32: warning: comparison of integer expressions of different signedness: 'const uword' {aka 'const long long unsigned int'} and 'int' [-Wsign-compare]
   if (minimum_spanning_tree[n] == NA_INTEGER) continue;
In file included from hdbscanobj.cpp:3:
primsalgorithm.h: In instantiation of 'VIDX* PrimsAlgorithm<VIDX, D>::run(const sp_mat&, const IntegerMatrix&, Progress&, const VIDX&) [with VIDX = long long unsigned int; D = double; arma::sp_mat = arma::SpMat<double>; Rcpp::IntegerMatrix = Rcpp::Matrix<13>]':
hdbscanobj.cpp:146:76:   required from here
primsalgorithm.h:53:4: warning: ignoring return value of 'arma::SpMat<eT>::const_iterator arma::SpMat<eT>::const_iterator::operator++(int) [with eT = double]', declared with attribute warn_unused_result [-Wunused-result]
    for (auto it = edges.begin_col(v);
    ^~~
In file included from C:/opt/R/R-4.0.0/library/RcppArmadillo/include/armadillo:638,
                 from C:/opt/R/R-4.0.0/library/RcppArmadillo/include/RcppArmadilloForward.h:49,
                 from C:/opt/R/R-4.0.0/library/RcppArmadillo/include/RcppArmadillo.h:31,
                 from largeVis.h:12,
                 from hdbscanobj.cpp:1:
C:/opt/R/R-4.0.0/library/RcppArmadillo/include/armadillo_bits/SpMat_iterators_meat.hpp:176:1: note: declared here
 SpMat<eT>::const_iterator::operator++(int)
 ^~~~~~~~~
In file included from primsalgorithm.h:3,
                 from hdbscanobj.cpp:3:
minindexedpq.h: In instantiation of 'PairingHeap<V, D>::PairNode* PairingHeap<V, D>::combineSiblings(PairingHeap<V, D>::NodePointer) [with V = long long unsigned int; D = double; PairingHeap<V, D>::NodePointer = PairingHeap<long long unsigned int, double>::PairNode*]':
minindexedpq.h:74:15:   required from 'const V PairingHeap<V, D>::pop() [with V = long long unsigned int; D = double]'
primsalgorithm.h:44:9:   required from 'VIDX* PrimsAlgorithm<VIDX, D>::run(const sp_mat&, const IntegerMatrix&, Progress&, const VIDX&) [with VIDX = long long unsigned int; D = double; arma::sp_mat = arma::SpMat<double>; Rcpp::IntegerMatrix = Rcpp::Matrix<13>]'
hdbscanobj.cpp:146:76:   required from here
minindexedpq.h:61:9: warning: comparison of integer expressions of different signedness: 'int' and 'unsigned int' [-Wsign-compare]
   if (j == numSiblings - 3) compareAndLink (treeArray[j], treeArray[j + 2]);
       ~~^~~~~~~~~~~~~~~~~~
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c hdcluster.cpp -o hdcluster.o
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c largeVis.cpp -o largeVis.o
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c minpq.cpp -o minpq.o
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c neighbors.cpp -o neighbors.o
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c optics.cpp -o optics.o
optics.cpp: In member function 'void OPTICS::getNeighbors(const long long int&, PairingHeap<long long int, double>&)':
optics.cpp:50:28: warning: ignoring return value of 'arma::SpMat<eT>::const_iterator arma::SpMat<eT>::const_iterator::operator++(int) [with eT = double]', declared with attribute warn_unused_result [-Wunused-result]
                          it++) {
                            ^~
In file included from C:/opt/R/R-4.0.0/library/RcppArmadillo/include/armadillo:638,
                 from C:/opt/R/R-4.0.0/library/RcppArmadillo/include/RcppArmadilloForward.h:49,
                 from C:/opt/R/R-4.0.0/library/RcppArmadillo/include/RcppArmadillo.h:31,
                 from optics.cpp:1:
C:/opt/R/R-4.0.0/library/RcppArmadillo/include/armadillo_bits/SpMat_iterators_meat.hpp:176:1: note: declared here
 SpMat<eT>::const_iterator::operator++(int)
 ^~~~~~~~~
In file included from optics.cpp:2:
minindexedpq.h: In instantiation of 'PairingHeap<V, D>::PairNode* PairingHeap<V, D>::combineSiblings(PairingHeap<V, D>::NodePointer) [with V = long long int; D = double; PairingHeap<V, D>::NodePointer = PairingHeap<long long int, double>::PairNode*]':
minindexedpq.h:74:15:   required from 'const V PairingHeap<V, D>::pop() [with V = long long int; D = double]'
optics.cpp:108:28:   required from here
minindexedpq.h:61:9: warning: comparison of integer expressions of different signedness: 'int' and 'unsigned int' [-Wsign-compare]
   if (j == numSiblings - 3) compareAndLink (treeArray[j], treeArray[j + 2]);
       ~~^~~~~~~~~~~~~~~~~~
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c registration.cpp -o registration.o
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c sparse.cpp -o sparse.o
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c test-runner.cpp -o test-runner.o
In file included from C:/opt/R/R-4.0.0/library/testthat/include/testthat.h:1,
                 from test-runner.cpp:7:
C:/opt/R/R-4.0.0/library/testthat/include/testthat/testthat.h: In function 'std::ostream& Catch::cout()':
C:/opt/R/R-4.0.0/library/testthat/include/testthat/testthat.h:140:1: warning: visibility attribute not supported in this configuration; ignored [-Wattributes]
 }
 ^
C:/opt/R/R-4.0.0/library/testthat/include/testthat/testthat.h: In function 'std::ostream& Catch::cerr()':
C:/opt/R/R-4.0.0/library/testthat/include/testthat/testthat.h:147:1: warning: visibility attribute not supported in this configuration; ignored [-Wattributes]
 }
 ^
"C:/rtools40/mingw64/bin/"g++  -std=gnu++11 -I"C:/opt/R/R-40~1.0/include" -DNDEBUG  -I'C:/opt/R/R-4.0.0/library/Rcpp/include' -I'C:/opt/R/R-4.0.0/library/RcppProgress/include' -I'C:/opt/R/R-4.0.0/library/RcppArmadillo/include' -I'C:/opt/R/R-4.0.0/library/testthat/include'     -fopenmp -DARMA_64BIT_WORD -DNDEBUG   -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign -c testcfunctions.cpp -o testcfunctions.o
C:/rtools40/mingw64/bin/g++ -shared -s -static-libgcc -o largeVis.dll tmp.def RcppExports.o checkfunctions.o dbscan.o denseneighbors.o distance.o edgeweights.o gradients.o hdbscan.o hdbscanobj.o hdcluster.o largeVis.o minpq.o neighbors.o optics.o registration.o sparse.o test-runner.o testcfunctions.o -fopenmp -fopenmp -lgfortran -lm -lquadmath -LC:/opt/R/R-40~1.0/bin/x64 -lRlapack -LC:/opt/R/R-40~1.0/bin/x64 -lRblas -LC:/opt/R/R-40~1.0/bin/x64 -lR
installing to C:/opt/R/R-4.0.0/library/00LOCK-largeVis/00new/largeVis/libs/x64
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
  converting help for package 'largeVis'
    finding HTML links ... done
    as.dendrogram.hdbscan                   html  
Rd warning: C:/Users/samgg/AppData/Local/Temp/Rtmp4oaUCt/R.INSTALL644c59bb522b/largeVis/man/as.dendrogram.hdbscan.Rd:14: file link 'as.dendrogram' in package 'stats' does not exist and so has been treated as a topic
    buildEdgeMatrix                         html  
    buildWijMatrix                          html  
    distance                                html  
    ggManifoldMap                           html  
    gplot                                   html  
    hdbscan                                 html  
    largeVis-package                        html  
    largeVis                                html  
    lof                                     html  
    lv_dbscan                               html  
    lv_optics                               html  
    manifoldMap                             html  
    manifoldMapStretch                      html  
    neighborsToVectors                      html  
    projectKNNs                             html  
Rd warning: C:/Users/samgg/AppData/Local/Temp/Rtmp4oaUCt/R.INSTALL644c59bb522b/largeVis/man/projectKNNs.Rd:69: file link 'set.seed' in package 'base' does not exist and so has been treated as a topic
    randomProjectionTreeSearch              html  
    sgdBatches                              html  
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (largeVis)
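The install succeeds, and the compiler warnings in the log fall into three groups: `-Wunused-result` on `it++` in loop headers, `-Wsign-compare` on mixed signed/unsigned comparisons, and `-Wmaybe-uninitialized` on `distanceFunction`. Below is a minimal sketch of how each kind is usually silenced. It uses simplified stand-ins rather than the actual largeVis or Armadillo code: `CountingIterator`, `sumRange`, `isThirdFromLast`, `chooseDistance`, `absDiff` and `sqDiff` are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an iterator such as Armadillo's sparse
// const_iterator, whose postfix ++ returns a copy of the old value.
struct CountingIterator {
  int value;
  CountingIterator& operator++() { ++value; return *this; }  // prefix
  CountingIterator operator++(int) {                         // postfix
    CountingIterator old = *this;
    ++value;
    return old;
  }
  bool operator!=(const CountingIterator& other) const {
    return value != other.value;
  }
  int operator*() const { return value; }
};

// -Wunused-result: use prefix ++ in the loop header so no returned
// copy is discarded.
int sumRange(CountingIterator first, CountingIterator last) {
  int total = 0;
  for (CountingIterator it = first; it != last; ++it) total += *it;
  return total;
}

// -Wsign-compare: check the sign and size first, then compare values
// of the same unsigned type so the subtraction cannot wrap around.
bool isThirdFromLast(int j, std::size_t numSiblings) {
  return j >= 0 && numSiblings >= 3 &&
         static_cast<std::size_t>(j) == numSiblings - 3;
}

// -Wmaybe-uninitialized: give the function pointer a definite initial
// value and fail loudly when no branch assigns it.
typedef double (*DistanceFn)(double, double);
double absDiff(double a, double b) { return a < b ? b - a : a - b; }
double sqDiff(double a, double b) { return (a - b) * (a - b); }

DistanceFn chooseDistance(const std::string& method) {
  DistanceFn fn = nullptr;
  if (method == "abs") fn = absDiff;
  else if (method == "squared") fn = sqDiff;
  if (fn == nullptr) {
    throw std::invalid_argument("unknown distance method: " + method);
  }
  return fn;
}
```

None of these warnings stopped the build here; whether they matter for a CRAN submission is a separate question that the log does not answer.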

Hi @SamGG thanks! I will ask you to try it again when we get closer to a release.

SamGG commented

My pleasure :-)

I'm considering dropping DBSCAN and OPTICS from the package. Their only advantage over the implementations in the dbscan package is that they make use of the fast nearest neighbor search.

And it really complicates things, because they link against the dbscan package.

Does anyone think that there are any users of my dbscan or optics implementations?

Hi @elbamos

Thanks again for taking the time to look into this. I'm interested in finding out what's going on.

> @evanbiederstedt If you want to look at the Travis configuration, that would be helpful. I can take care of the Solaris issue. (If you have a Windows machine, I've never tried to get it to compile on Windows, because of the C++11 and OpenMP dependencies. Taking a look at that would also be helpful.) Thanks for nudging me to get back to this.

Sure, I'll make a PR :)

SamGG commented

Concerning DBSCAN and co., I think you made great improvements that are worth keeping. If you really want to externalize this code, either propose a PR to the dbscan package or start another package. It would be a pity if that code were not easily usable. Data are growing and fast approaches are needed.

Ok, I think I have this worked out now. Would you folks mind giving a try to the version currently in the /feature/backoncran branch? In particular, let me know if you see any performance impacts.

Hi @elbamos

Sorry, I got a bit distracted by life.

> Would you folks mind giving a try to the version currently in the /feature/backoncran branch? In particular, let me know if you see any performance impacts.

I was able to install this directly on macOS 10.15.5 with OpenMP configured, using /usr/local/opt/llvm/bin/clang++ -fopenmp -std=gnu++11.

I've been going through the vignettes, where I had no problems. I'm trying the following "larger example" from the vignettes, where projectKNNs() appears to be taking some time. Do you have a sense of how long this should take? (The threading feature is working, though.)

library(largeVis)
library(ggplot2)  # needed for the plot at the end

load(system.file(package = "largeVis", "vignettedata/vignettedata.Rda"))
pathToGraphFile <- "com-youtube.ungraph.txt.gz"
pathToCommunities <- "com-youtube.top5000.cmty.txt.gz"

# Read the edge list and build a symmetric sparse adjacency matrix
youtube <- readr::read_tsv(pathToGraphFile, skip = 4, col_names = FALSE)
youtube <- as.matrix(youtube)
youtube <- Matrix::sparseMatrix(i = youtube[, 1],
                                j = youtube[, 2],
                                x = rep(1, nrow(youtube)),
                                dims = c(max(youtube), max(youtube)))
youtube <- youtube + t(youtube)

# Assign each node the index of a community it belongs to (0 = none)
communities <- readr::read_lines(pathToCommunities)
communities <- lapply(communities,
                      FUN = function(x) as.numeric(unlist(strsplit(x, "\t"))))
community_assignments <- rep(0, nrow(youtube))
for (i in seq_along(communities)) community_assignments[communities[[i]]] <- i
wij <- buildWijMatrix(youtube)

# SLOW: youTube_coordinates <- projectKNNs(youtube)
youTube_coordinates <- projectKNNs(youtube, threads = 8)

# One row per node, scaled x/y coordinates plus community label
youTube_coordinates <- data.frame(scale(t(youTube_coordinates)))
colnames(youTube_coordinates) <- c("x", "y")
youTube_coordinates$community <- factor(community_assignments)

# Nodes without a community are drawn smaller and more transparent
youTube_coordinates$alpha <- factor(ifelse(youTube_coordinates$community == 0, 0.05, 0.2))
ggplot(youTube_coordinates, aes(x = x,
                                y = y,
                                color = community,
                                alpha = alpha,
                                size = alpha)) +
  geom_point() +
  scale_alpha_manual(values = c(0.005, 0.2), guide = "none") +
  scale_size_manual(values = c(0.03, 0.15), guide = "none") +
  scale_x_continuous("",
                     breaks = NULL, limits = c(-2.5, 2.5)) +
  scale_y_continuous("",
                     breaks = NULL, limits = c(-2.5, 2.5)) +
  ggtitle("YouTube Communities")

Another question: Have you tried submitting it to the CRAN crew yet?