h1alexbel/samples-filter
Command-line filter for GitHub repositories that contain "samples" instead of a real project, framework, or library
Python, MIT
Issues
Collect 500k repositories dataset
#125 opened by h1alexbel - 1
Split collected dataset into train/evaluation
#128 opened by h1alexbel - 1
Analyze 100 random ratings from test set
#130 opened by h1alexbel - 0
what makes SR so specific among other repositories?
#157 opened by h1alexbel - 2
Dockerize models training
#143 opened by h1alexbel - 0
embeddings.py:42-46: We generate embeddings for each...
#153 opened by 0pdd - 5
Investigate models based on unsupervised learning
#129 opened by h1alexbel - 0
Update models README
#145 opened by h1alexbel - 0
full pipeline from repository to vector
#149 opened by h1alexbel - 0
preprocessing objects
#146 opened by h1alexbel - 1
Generate embeddings
#142 opened by h1alexbel - 0
Example of repository as encoded vector
#140 opened by h1alexbel - 1
Put total number of `commits` for each repo in `model/data/train.csv` and `pipeline/input.csv`
#114 opened by h1alexbel - 3
transformer.py:36-40: Compose train.csv data as input...
#104 opened by 0pdd - 11
compose.py:32-39: Split README.md into important...
#119 opened by 0pdd - 1
integrate `ghminer` into dataset build pipeline
#137 opened by h1alexbel - 0
Unsupervised
#126 opened by h1alexbel - 0
feed.py:33-36: Feed `readme`, `last_commit`,...
#113 opened by 0pdd - 7
Train classification model on hugging face
#30 opened by h1alexbel - 3
Integrate multiple models in CLI
#105 opened by h1alexbel - 1
Preprocess readme in 4 steps
#71 opened by h1alexbel - 4
Pick random pack of repos from all collected
#73 opened by h1alexbel - 2
Pipeline for dataset building
#41 opened by h1alexbel - 0
combine.py:34-37: Combine all the csv files into...
#44 opened by 0pdd - 0
Document filtering method
#66 opened by h1alexbel - 0
Setup codecov.io
#58 opened by h1alexbel - 0
predict.py:33-37: Fetch pretrained model saved in...
#46 opened by 0pdd - 1
train.py:35-38: Document model training in...
#35 opened by 0pdd - 0
Escape readme in csv
#25 opened by h1alexbel