akielaries/openGPMP

Updated TODO


Generic issue for keeping track of some TODOs in no specific order of importance

Important

I spent a lot of time getting boilerplate implementations to compile without warnings, which means some algorithms are not implemented entirely correctly and do not produce the expected results. To combat this, I will need to step through each module's components (classes and methods, functions, dependency chains, call trees, etc.) and modify what is needed.

  • Overall optimization. So far the linear algebra module's matrix-matrix operations contain an optimized kernel for SSE ISAs. I want to add some optimizations in other modules of this project

    • For matrix and vector operations add optimized assembly kernels for AVX / AVX2 ISAs and verify compatibility with SSE2 ISAs
    • Make use of the matrix / vector optimizations in the machine learning module
    • Beyond matrix / vector operations target optimization in the entire linear algebra module
    • Explore how to optimize the number theory module as well and what this could possibly look like
      • For this, dig into each module and identify individual units of classes that can be turned into standalone assembly kernel files. For a soft estimate and boilerplate, disassemble naive programs compiled with different optimization flags (-O2, -O3, -fopt-info-vec-optimized, -mavx, -march=, etc.) and make the resulting kernels callable from the C++ source interface
      • Instead of creating assembly kernels for every function, be meticulous: identify the parts of the code where most time is spent and optimize those portions
      • Verify, cleanup, and optimize the following (compare against Eigen, Armadillo, Numpy, Scipy, etc?) using the above approach:
        • linalg/eigen.cpp
        • linalg/svd.cpp
        • linalg/tensor.cpp TODO: this needs to be implemented correctly
        • nt/factorization.cpp
        • nt/prime_gen.cpp
        • nt/prime_test.cpp
        • nt/random.cpp
        • calculus/differential.cpp
        • stats/cdfs.cpp
        • stats/pdfs.cpp
        • stats/resampling.cpp
    • More areas but the above will suffice for now....
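As a rough illustration of the AVX-with-fallback idea above, here is a minimal sketch of a vector-add kernel written with intrinsics rather than standalone assembly. The name `vec_add_f64` is hypothetical and not part of the current openGPMP interface; the AVX path is only compiled when the compiler defines `__AVX__` (for example with -mavx), otherwise the scalar loop does all the work.

```cpp
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical kernel name, used only for this sketch.
// Computes c[i] = a[i] + b[i] for i in [0, n).
void vec_add_f64(const double *a, const double *b, double *c, std::size_t n) {
    std::size_t i = 0;
#if defined(__AVX__)
    // AVX path: 4 doubles per iteration, unaligned loads and stores.
    for (; i + 4 <= n; i += 4) {
        __m256d va = _mm256_loadu_pd(a + i);
        __m256d vb = _mm256_loadu_pd(b + i);
        _mm256_storeu_pd(c + i, _mm256_add_pd(va, vb));
    }
#endif
    // Scalar tail, and the complete fallback when AVX is not enabled.
    for (; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

Building the same file with and without -mavx and comparing the disassembly is one way to get the "boilerplate" assembly mentioned above before hand-tuning anything.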
  • Improving workflows

    • In general, get all workflows to pass. As of right now the README shows "no status" for workflow badges
      • EDIT: this was due to workflows being triggered from a 'master' file (.github/workflows/opengpmp.yml)
    • Edit the documentation building workflow to not run on every commit. Instead this should be triggered when there are changes to the docs/ directory and the source code sitting in the include/ and modules directories
    • There may be other workflows that need to be edited so they only run on specific triggers
  • tinygpmp implementation could ideally reuse the existing C++ code for ease of development, but most embedded platforms already use C, so a re-implementation may be worth it? I don't expect a large number of users to consume this, so C++ will be ideal for touching once and never again.

    • The issue with much of the C++ implementation is the use of large integer sizes, in many cases 64-bit as well as 32-bit integers. To target embedded platforms this should be made dynamic somehow??
    • Similar issue as above in regard to memory usage. The matrix operations, specifically DGEMM, allocate buffers in memory for matrices in advance. This assumes the machine has the resources to do so and is fairly lazy, but the speed of this implementation comes from that. To tackle this, memory should likely not be allocated and a different approach should be taken
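One possible shape for the two points above, sketched under assumptions: the integer width is picked at build time through a hypothetical macro `TINYGPMP_INT_BITS`, and matrices use fixed-size `std::array` storage so nothing is heap-allocated. `tiny_int`, `StaticMatrix`, and `matmul` are illustrative names only, not existing tinygpmp code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical build-time switch, e.g. -DTINYGPMP_INT_BITS=16.
#ifndef TINYGPMP_INT_BITS
#define TINYGPMP_INT_BITS 32
#endif

#if TINYGPMP_INT_BITS == 16
using tiny_int = std::int16_t;
#elif TINYGPMP_INT_BITS == 64
using tiny_int = std::int64_t;
#else
using tiny_int = std::int32_t;
#endif

// Fixed-size, row-major matrix; storage lives inside the object (no heap).
template <typename T, std::size_t R, std::size_t C>
struct StaticMatrix {
    std::array<T, R * C> data{};
    T &at(std::size_t r, std::size_t c) { return data[r * C + c]; }
    const T &at(std::size_t r, std::size_t c) const { return data[r * C + c]; }
};

// Naive out = a * b; dimensions are checked at compile time by the types.
template <typename T, std::size_t M, std::size_t K, std::size_t N>
void matmul(const StaticMatrix<T, M, K> &a, const StaticMatrix<T, K, N> &b,
            StaticMatrix<T, M, N> &out) {
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            T acc{};
            for (std::size_t k = 0; k < K; ++k) {
                acc = static_cast<T>(acc + a.at(i, k) * b.at(k, j));
            }
            out.at(i, j) = acc;
        }
    }
}
```

This trades the speed of the pre-allocated DGEMM buffers for a memory footprint that is known at compile time, which is usually the more important property on small targets.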
  • benchmarks which could probably make use of google/benchmark

    • Comparisons against BLAS and linalg libraries (BLIS, OpenBLAS, Eigen, armadillo, etc)
      • Matrix
      • Vector
      • Eigen
      • Linear Systems
      • Singular Value Decomposition
      • Tensor
    • Number theory libs like NTL
      • Prime generation and test
      • Arithmetic
      • More...
    • Eventually machine learning libraries as well when we get there...
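Until google/benchmark is wired in, a dependency-free timing helper can give first numbers for the comparisons listed above. This is only a stand-in sketch using `std::chrono`; `best_seconds` is a hypothetical helper name, and it reports the fastest of several repetitions to reduce noise.

```cpp
#include <chrono>
#include <cstddef>
#include <limits>

// Hypothetical helper: runs fn() `reps` times and returns the fastest
// wall-clock time in seconds (infinity if reps == 0).
template <typename F>
double best_seconds(F &&fn, std::size_t reps) {
    double best = std::numeric_limits<double>::infinity();
    for (std::size_t r = 0; r < reps; ++r) {
        auto start = std::chrono::steady_clock::now();
        fn();
        auto stop = std::chrono::steady_clock::now();
        std::chrono::duration<double> elapsed = stop - start;
        if (elapsed.count() < best) {
            best = elapsed.count();
        }
    }
    return best;
}
```

The same callable could later be moved into a google/benchmark fixture, which takes care of repetition counts and reporting on its own.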
  • Code duplication and cleanup

    • Clean up misuses of OOP and classes in the project. Change classes that are otherwise empty but have many methods to either not be classes or use static methods that can be called without object instantiation
    • Remove duplicated code in test suite to start
    • Remove/refactor linalg matrix operations. As of right now there are a bunch of array and std::vector specific files for specific ISAs. This should be cleaned up along with a formal interface for the DGEMM implementation
      • Remember the call tree for this: we want a main interface that calls the ASM kernels
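A minimal sketch of what such a call tree could look like, assuming free functions in a namespace instead of a stateless class. Every name here (`sketch`, `dgemm_kernel_ref`, `sketch_dgemm_kernel_asm`, `SKETCH_HAVE_ASM_KERNEL`) is hypothetical; the point is only that callers see one entry point and the kernel choice stays behind it.

```cpp
#include <cstddef>

#if defined(SKETCH_HAVE_ASM_KERNEL) // hypothetical macro set by the build system
// Hypothetical symbol that a standalone assembly file would provide.
extern "C" void sketch_dgemm_kernel_asm(const double *A, const double *B,
                                        double *C, std::size_t m,
                                        std::size_t n, std::size_t k);
#endif

namespace sketch {
namespace detail {

// Reference kernel: C += A * B, all matrices row-major.
// A is m x k, B is k x n, C is m x n.
inline void dgemm_kernel_ref(const double *A, const double *B, double *C,
                             std::size_t m, std::size_t n, std::size_t k) {
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t p = 0; p < k; ++p) {
            const double a_ip = A[i * k + p];
            for (std::size_t j = 0; j < n; ++j) {
                C[i * n + j] += a_ip * B[p * n + j];
            }
        }
    }
}

} // namespace detail

// Single public entry point; no object needs to be instantiated.
inline void dgemm(const double *A, const double *B, double *C,
                  std::size_t m, std::size_t n, std::size_t k) {
#if defined(SKETCH_HAVE_ASM_KERNEL)
    sketch_dgemm_kernel_asm(A, B, C, m, n, k);
#else
    detail::dgemm_kernel_ref(A, B, C, m, n, k);
#endif
}

} // namespace sketch
```

Array-based and std::vector-based callers can both pass raw pointers (for example via `data()`), which would remove the need for separate per-container files.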
  • language bindings

    • pygpmp is currently broken and needs more tailoring. In addition to SWIG, make use of Boost.Python for tailoring bindings correctly. Ideally we want as much of the wrapping and code generation as possible to be left to SWIG, using Boost.Python in places where more customization is needed.
      • Get each module working and wrapped correctly
      • Create and verify working samples for the wrapped code
    • gpmp.jl is in progress. I would like to use wrapit for automatic code generation for as much of the work as possible, with specific tailoring where needed.
      • Get each module working and wrapped correctly
      • Create and verify working samples for the wrapped code
    • These will be the only two languages I want to target for bindings