speediedan/deep_classiflie
Deep Classiflie is a framework for developing ML models that bolster fact-checking efficiency. As a proof of concept, the initial alpha release of Deep Classiflie generates and analyzes a model that continuously classifies a single individual's statements (Donald Trump's) using a single ground-truth labeling source (The Washington Post). For the statements the model deems most likely to be labeled falsehoods, the @DeepClassiflie Twitter bot tweets out a statement analysis and model interpretation "report".
Issues
Create Deep Classiflie inference service that writes inference results to a distributed datastore (native IPFS or BigchainDB, TBD)
#78 opened by speediedan
Add source, statement timestamp, and inference scoring data, among other metadata, to the inference service log table
#80 opened by speediedan
Transition deep_classiflie_db to a submodule
#74 opened by speediedan
Use intermediate activation checkpointing in PyTorch to trade compute for increased batch size
#62 opened by speediedan
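A minimal sketch of what this could look like with torch.utils.checkpoint; the encoder stack, dimensions, and segment count below are illustrative placeholders, not Deep Classiflie's actual model code.

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

class CheckpointedEncoder(nn.Module):
    """Recomputes intermediate activations on the backward pass, trading extra
    compute for the memory headroom needed to raise batch size."""
    def __init__(self, hidden_dim: int = 768, num_layers: int = 12, segments: int = 4):
        super().__init__()
        self.layers = nn.Sequential(
            *[nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.GELU())
              for _ in range(num_layers)]
        )
        self.segments = segments

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only activations at segment boundaries are stored; the rest are
        # recomputed during backprop.
        return checkpoint_sequential(self.layers, self.segments, x)

if __name__ == "__main__":
    model = CheckpointedEncoder()
    x = torch.randn(64, 768, requires_grad=True)  # larger batch now fits in memory
    model(x).sum().backward()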
Examine the PMF with respect to words in uncertain or incorrect examples; include a separate learned temperature vector to adjust weights appropriately
#63 opened by speediedan
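One hedged interpretation of the learned temperature idea: a per-logit temperature vector fit on a calibration split (e.g., the uncertain/incorrect examples) while the classifier stays frozen. The dimensions, optimizer choice, and data below are placeholders, not the project's implementation.

import torch
import torch.nn as nn

class VectorTemperatureScaler(nn.Module):
    """Hypothetical per-logit learned temperature applied to frozen logits."""
    def __init__(self, num_logits: int):
        super().__init__()
        # log-parameterized so the effective temperature stays positive
        self.log_temperature = nn.Parameter(torch.zeros(num_logits))

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        return logits / self.log_temperature.exp()

# Usage sketch: fit only the temperature on a calibration split.
scaler = VectorTemperatureScaler(num_logits=2)
optimizer = torch.optim.LBFGS(scaler.parameters(), lr=0.1, max_iter=50)
criterion = nn.CrossEntropyLoss()
cal_logits = torch.randn(256, 2)          # placeholder calibration logits
cal_labels = torch.randint(0, 2, (256,))  # placeholder labels

def closure():
    optimizer.zero_grad()
    loss = criterion(scaler(cal_logits), cal_labels)
    loss.backward()
    return loss

optimizer.step(closure)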
Write core tests
#64 opened by speediedan
Determine alternate sentence classification models for comparative performance testing
#57 opened by speediedan
Perform global analysis (e.g., capturing all attributions for all word dimensions in all sentences (huge), or other less custom methods)
#58 opened by speediedan
TensorBoard sentence embedding visualization
#53 opened by speediedan
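A short sketch of how sentence embeddings could be surfaced in TensorBoard's Projector via SummaryWriter.add_embedding; the embedding tensor and labels below are placeholders, not the model's real outputs.

import torch
from torch.utils.tensorboard import SummaryWriter

# Placeholder sentence embeddings and labels; in practice these would come
# from the trained model's pooled sentence representations and the WaPo labels.
embeddings = torch.randn(500, 768)
labels = ["falsehood" if i % 2 else "non-falsehood" for i in range(500)]

writer = SummaryWriter(log_dir="runs/sentence_embeddings")
# add_embedding writes the data that TensorBoard's Projector tab reads.
writer.add_embedding(embeddings, metadata=labels, tag="sentence_embeddings")
writer.close()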
Create word-count-wise accuracy graph
#56 opened by speediedan
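A possible shape for this graph, assuming a per-statement results table with word_count and correct columns (a placeholder schema, not the actual prediction cache): bucket statements by word count and plot accuracy per bucket.

import pandas as pd

# Placeholder per-statement results.
df = pd.DataFrame({
    "word_count": [5, 12, 18, 25, 40, 60],
    "correct": [1, 1, 0, 1, 0, 1],
})

# Bucket statements by word count and compute accuracy per bucket.
bins = [0, 10, 20, 30, 50, 100]
df["wc_bucket"] = pd.cut(df["word_count"], bins=bins)
accuracy_by_bucket = df.groupby("wc_bucket", observed=True)["correct"].mean()

ax = accuracy_by_bucket.plot(kind="bar", xlabel="statement word count", ylabel="accuracy")
ax.figure.tight_layout()
ax.figure.savefig("word_count_accuracy.png")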
Build initial Jekyll site
#71 opened by speediedan
Add Bokeh visualizations to static Jekyll frontend
#66 opened by speediedan
Depending on efficacy of previous multi-input tests, integrate context into the FFN layers of ALBERT
#67 opened by speediedan
Depending on efficacy of previous multi-input tests, learn a simple paralinguistic contextual embedding with respect to Twitter
#68 opened by speediedan
Update primary dependencies (PyTorch, Captum, etc.) and validate functionality
#69 opened by speediedan
For global analysis, calculate average word representations on the trained corpus and determine which representations changed most from the input
#61 opened by speediedan
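A rough sketch of one way to do this with Hugging Face transformers: average each token's contextual representation over a corpus and rank tokens by L2 distance from their static input embedding. The model name and corpus are placeholders; note that ALBERT factorizes its embeddings (embedding size differs from hidden size), so a BERT-style model is used here to keep the comparison dimensionally direct.

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()
corpus = ["example statement one", "another example statement"]  # placeholder corpus

sums = torch.zeros(model.config.vocab_size, model.config.hidden_size)
counts = torch.zeros(model.config.vocab_size)

with torch.no_grad():
    for text in corpus:
        enc = tokenizer(text, return_tensors="pt")
        hidden = model(**enc).last_hidden_state[0]      # (seq_len, hidden_size)
        ids = enc["input_ids"][0]
        sums.index_add_(0, ids, hidden)
        counts.index_add_(0, ids, torch.ones(ids.shape[0]))

seen = counts > 0
avg_contextual = sums[seen] / counts[seen].unsqueeze(1)
static = model.get_input_embeddings().weight[seen]
drift = (avg_contextual - static).norm(dim=1)           # L2 shift per token
top = drift.topk(min(20, drift.numel()))
seen_ids = torch.nonzero(seen).squeeze(1)
top_tokens = tokenizer.convert_ids_to_tokens(seen_ids[top.indices].tolist())
print(list(zip(top_tokens, top.values.tolist())))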
Move issue management to GitHub
#48 opened by speediedan
Create/enable a flag for non-DB mode (for collaboration) by refactoring data prep/training/inference and leveraging an additional cache
#47 opened by speediedan
ALBERT-xxlarge depth-2 training without dropout
#45 opened by speediedan
Fix missing logo
#44 opened by speediedan
Debug confusion matrix TensorBoard rendering / 'test_metrics' key prefix nesting issue in the hparam section of trainer_pt
#42 opened by speediedan
Apply label smoothing
#38 opened by speediedan
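A minimal sketch of label smoothing for a binary head, assuming BCE-with-logits training (the epsilon value and tensors are placeholders); for a multi-class head, recent PyTorch versions expose a label_smoothing argument directly on CrossEntropyLoss.

import torch
import torch.nn as nn

def smooth_binary_targets(targets: torch.Tensor, epsilon: float = 0.1) -> torch.Tensor:
    """Pull hard 0/1 labels toward 0.5 by epsilon so the model is penalized
    less for residual noise in the ground-truth labels."""
    return targets * (1.0 - epsilon) + 0.5 * epsilon

criterion = nn.BCEWithLogitsLoss()
logits = torch.randn(8, 1)                         # placeholder model outputs
hard_labels = torch.randint(0, 2, (8, 1)).float()  # placeholder 0/1 labels
loss = criterion(logits, smooth_binary_targets(hard_labels))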
Expand cache and reporting to differentiate between Twitter and non-Twitter accuracy, then filter on that
#34 opened by speediedan
Retrain with temporal test split
#35 opened by speediedan
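A small sketch of what a temporal split could look like, assuming a statement table with a stated_at timestamp column (placeholder schema): sort chronologically and hold out the most recent fraction, so the test set contains only statements made after everything in the training set.

import pandas as pd

# Placeholder statement table; real columns would come from deep_classiflie_db.
df = pd.DataFrame({
    "statement": ["a", "b", "c", "d", "e"],
    "stated_at": pd.to_datetime(
        ["2019-01-05", "2019-06-10", "2020-01-15", "2020-05-20", "2020-07-01"]),
    "label": [0, 1, 0, 1, 1],
})

# Chronological sort, then split at a cutoff index instead of sampling randomly.
df = df.sort_values("stated_at").reset_index(drop=True)
test_frac = 0.2
split_idx = int(len(df) * (1 - test_frac))
train_df, test_df = df.iloc[:split_idx], df.iloc[split_idx:]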
Combine tweet HTML and graphs into a single image
#31 opened by speediedan
Calculate similarity by minimizing a linear function of (a) L2 pairwise distance between sentence embeddings and (b) pairwise sigmoid output delta
#30 opened by speediedan
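A hedged sketch of the similarity score described above: a weighted linear combination of L2 embedding distance and the absolute delta between sigmoid outputs, minimized over candidate statements. The weights, dimensions, and tensors are placeholders.

import torch

def combined_distance(query_emb, query_prob, cand_embs, cand_probs,
                      alpha: float = 0.7, beta: float = 0.3):
    """Blend of (a) L2 distance between sentence embeddings and (b) the
    absolute delta between sigmoid outputs; the most 'similar' candidate
    is the one minimizing this score."""
    l2 = torch.cdist(query_emb.unsqueeze(0), cand_embs).squeeze(0)   # (N,)
    prob_delta = (cand_probs - query_prob).abs()                     # (N,)
    return alpha * l2 + beta * prob_delta

# Usage sketch with placeholder tensors.
query_emb, query_prob = torch.randn(768), torch.tensor(0.83)
cand_embs, cand_probs = torch.randn(1000, 768), torch.rand(1000)
scores = combined_distance(query_emb, query_prob, cand_embs, cand_probs)
most_similar_idx = scores.argmin().item()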