finos/git-proxy

Git Proxy Plugin: Detection of AI/ML usage (incl. weights, models etc.)


ABOUT

This plugin aims to detect AI/ML usage in a codebase using checks such as:

  • Files with extensions commonly associated with model weights, such as .h5, .pb, and .pt.
  • Files with extensions commonly used for large datasets, such as .csv and .xlsx.
  • Code that imports or requires AI/ML libraries such as TensorFlow, PyTorch, and Keras.
  • JSON/YAML files containing model configuration keys such as epochs and learning_rate.
  • Files containing common AI/ML function names such as tokenize, train_model, predict, evaluate, and transform.
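
As a rough illustration of how the checks above could fit together, here is a minimal sketch in TypeScript. Every name in it (`ChangedFile`, `detectAiMlUsage`, the finding labels, the regular expressions) is hypothetical and chosen for this example; none of it is git-proxy's plugin API, and the pattern lists only mirror the examples given in the bullets rather than a complete rule set.

```typescript
// Hypothetical sketch: none of these names come from git-proxy itself.
interface ChangedFile {
  path: string;
  content: string;
}

// Extension lists taken from the examples in the issue (not exhaustive).
const WEIGHT_EXTENSIONS = new Set<string>([".h5", ".pb", ".pt"]);
const DATASET_EXTENSIONS = new Set<string>([".csv", ".xlsx"]);
const CONFIG_EXTENSIONS = new Set<string>([".json", ".yaml", ".yml"]);

// Assumed heuristics: an import/require line naming a known ML library,
// a config key followed by a colon, and a known function name followed by "(".
const LIBRARY_PATTERN = /\b(?:import|from|require)\b[^\n]*\b(?:tensorflow|torch|keras)\b/i;
const CONFIG_KEY_PATTERN = /["']?\b(?:epochs|learning_rate)\b["']?\s*:/;
const FUNCTION_PATTERN = /\b(?:tokenize|train_model|predict|evaluate|transform)\s*\(/;

// Returns the lowercased extension of the file name, or "" if there is none.
function extensionOf(path: string): string {
  const name = path.slice(path.lastIndexOf("/") + 1);
  const dot = name.lastIndexOf(".");
  return dot === -1 ? "" : name.slice(dot).toLowerCase();
}

// Runs every check against one file and returns a label per matching check.
function detectAiMlUsage(file: ChangedFile): string[] {
  const findings: string[] = [];
  const ext = extensionOf(file.path);

  if (WEIGHT_EXTENSIONS.has(ext)) findings.push("model-weights-extension");
  if (DATASET_EXTENSIONS.has(ext)) findings.push("dataset-extension");
  if (LIBRARY_PATTERN.test(file.content)) findings.push("ml-library-import");
  if (CONFIG_EXTENSIONS.has(ext) && CONFIG_KEY_PATTERN.test(file.content)) {
    findings.push("ml-config-keys");
  }
  if (FUNCTION_PATTERN.test(file.content)) findings.push("ml-function-name");

  return findings;
}
```

Name-based heuristics like these will produce false positives (a `.csv` file is not necessarily a training dataset, and `transform(` appears in plenty of non-ML code), so the findings are probably best treated as signals for review rather than as a verdict.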

Users can customize the detection criteria by specifying which of these checks to run, based on their requirements.
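
One possible shape for that customization is a set of per-check toggles merged over defaults. The sketch below is only an assumption about how this might look: the `DetectionCriteria` fields, the all-enabled defaults, and the helper names are invented for illustration and are not an existing git-proxy configuration schema.

```typescript
// Hypothetical configuration shape: one boolean toggle per check.
interface DetectionCriteria {
  weightExtensions: boolean;
  datasetExtensions: boolean;
  libraryImports: boolean;
  configKeys: boolean;
  functionNames: boolean;
}

// Assumption: every check is enabled unless the user turns it off.
const DEFAULT_CRITERIA: DetectionCriteria = {
  weightExtensions: true,
  datasetExtensions: true,
  libraryImports: true,
  configKeys: true,
  functionNames: true,
};

// Merges user-supplied overrides over the defaults.
function resolveCriteria(overrides: Partial<DetectionCriteria> = {}): DetectionCriteria {
  return { ...DEFAULT_CRITERIA, ...overrides };
}

// Lists the names of the checks that remain enabled.
function enabledChecks(criteria: DetectionCriteria): string[] {
  const keys = Object.keys(criteria) as (keyof DetectionCriteria)[];
  return keys.filter((key) => criteria[key]);
}

// Example: a repository that legitimately stores CSV fixtures could
// disable only the dataset-extension check.
const criteria = resolveCriteria({ datasetExtensions: false });
const active = enabledChecks(criteria);
```

Keeping each check behind its own toggle would let a team disable a noisy heuristic without losing the others.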

Hey @JamieSlome, I am working on this and am in the process of implementing the checks mentioned above.
Please let me know if you’d like any adjustments to this approach, or if there are any additional checks you’d recommend including.