speediedan/deep_classiflie
Deep Classiflie is a framework for developing ML models that bolster fact-checking efficiency. As a proof of concept, the initial alpha release of Deep Classiflie generates and analyzes a model that continuously classifies a single individual's statements (Donald Trump's) using a single ground-truth labeling source (The Washington Post). For the statements the model deems most likely to be labeled falsehoods, the @DeepClassiflie Twitter bot tweets out a statement analysis and model interpretation "report".
Issues
Create Deep Classiflie inference service that writes inference results to a distributed datastore (native IPFS or BigchainDB, TBD)
#78 opened by speediedan
Add source, statement timestamp, and inference scoring data, among other metadata, to the inference service log table
#80 opened by speediedan
Transition deep_classiflie_db to a submodule
#74 opened by speediedan
Use intermediate activation checkpointing in PyTorch to trade compute for increased batch size
#62 opened by speediedan
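A minimal sketch of what this could look like with torch.utils.checkpoint; the encoder stack, dimensions, and segment count below are illustrative placeholders, not Deep Classiflie's actual model code.

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

class CheckpointedEncoder(nn.Module):
    """Recomputes intermediate activations on the backward pass, trading extra
    compute for the memory headroom needed to raise batch size."""
    def __init__(self, hidden_dim: int = 768, num_layers: int = 12, segments: int = 4):
        super().__init__()
        self.layers = nn.Sequential(
            *[nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.GELU())
              for _ in range(num_layers)]
        )
        self.segments = segments

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only activations at segment boundaries are stored; the rest are
        # recomputed during backprop.
        return checkpoint_sequential(self.layers, self.segments, x)

if __name__ == "__main__":
    model = CheckpointedEncoder()
    x = torch.randn(64, 768, requires_grad=True)  # larger batch now fits in memory
    model(x).sum().backward()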
Examine the PMF with respect to words in uncertain or incorrect examples; include a separate learned temperature vector to adjust weights appropriately
#63 opened by speediedan
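One hedged interpretation of the learned temperature idea: a per-logit temperature vector fit on a calibration split (e.g., the uncertain/incorrect examples) while the classifier stays frozen. The dimensions, optimizer choice, and data below are placeholders, not the project's implementation.

import torch
import torch.nn as nn

class VectorTemperatureScaler(nn.Module):
    """Hypothetical per-logit learned temperature applied to frozen logits."""
    def __init__(self, num_logits: int):
        super().__init__()
        # log-parameterized so the effective temperature stays positive
        self.log_temperature = nn.Parameter(torch.zeros(num_logits))

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        return logits / self.log_temperature.exp()

# Usage sketch: fit only the temperature on a calibration split.
scaler = VectorTemperatureScaler(num_logits=2)
optimizer = torch.optim.LBFGS(scaler.parameters(), lr=0.1, max_iter=50)
criterion = nn.CrossEntropyLoss()
cal_logits = torch.randn(256, 2)          # placeholder calibration logits
cal_labels = torch.randint(0, 2, (256,))  # placeholder labels

def closure():
    optimizer.zero_grad()
    loss = criterion(scaler(cal_logits), cal_labels)
    loss.backward()
    return loss

optimizer.step(closure)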
Write core tests
#64 opened by speediedan
Determine alternate sentence classification models for comparative performance testing
#57 opened by speediedan
Perform global analysis (e.g., capturing all attributions for all word dimensions in all sentences (huge), or other less custom methods)
#58 opened by speediedan
TensorBoard sentence embedding visualization
#53 opened by speediedan
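A short sketch of how sentence embeddings could be surfaced in TensorBoard's Projector via SummaryWriter.add_embedding; the embedding tensor and labels below are placeholders, not the model's real outputs.

import torch
from torch.utils.tensorboard import SummaryWriter

# Placeholder sentence embeddings and labels; in practice these would come
# from the trained model's pooled sentence representations and the WaPo labels.
embeddings = torch.randn(500, 768)
labels = ["falsehood" if i % 2 else "non-falsehood" for i in range(500)]

writer = SummaryWriter(log_dir="runs/sentence_embeddings")
# add_embedding writes the data that TensorBoard's Projector tab reads.
writer.add_embedding(embeddings, metadata=labels, tag="sentence_embeddings")
writer.close()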
Create word-count-wise accuracy graph
#56 opened by speediedan
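A possible shape for this graph, assuming a per-statement results table with word_count and correct columns (a placeholder schema, not the actual prediction cache): bucket statements by word count and plot accuracy per bucket.

import pandas as pd

# Placeholder per-statement results.
df = pd.DataFrame({
    "word_count": [5, 12, 18, 25, 40, 60],
    "correct": [1, 1, 0, 1, 0, 1],
})

# Bucket statements by word count and compute accuracy per bucket.
bins = [0, 10, 20, 30, 50, 100]
df["wc_bucket"] = pd.cut(df["word_count"], bins=bins)
accuracy_by_bucket = df.groupby("wc_bucket", observed=True)["correct"].mean()

ax = accuracy_by_bucket.plot(kind="bar", xlabel="statement word count", ylabel="accuracy")
ax.figure.tight_layout()
ax.figure.savefig("word_count_accuracy.png")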
Build initial Jekyll site
#71 opened by speediedan
Add Bokeh visualizations to static Jekyll frontend
#66 opened by speediedan
Depending on efficacy of previous multi-input tests, integrate context into the FFN layers of ALBERT
#67 opened by speediedan
Depending on efficacy of previous multi-input tests, learn a simple paralinguistic contextual embedding with respect to Twitter
#68 opened by speediedan
Update primary dependencies (PyTorch, Captum, etc.) and validate functionality
#69 opened by speediedan
For global analysis, calculate average word representations on the trained corpus and determine which representations changed most from the input
#61 opened by speediedan
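A rough sketch of one way to do this with Hugging Face transformers: average each token's contextual representation over a corpus and rank tokens by L2 distance from their static input embedding. The model name and corpus are placeholders; note that ALBERT factorizes its embeddings (embedding size differs from hidden size), so a BERT-style model is used here to keep the comparison dimensionally direct.

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()
corpus = ["example statement one", "another example statement"]  # placeholder corpus

sums = torch.zeros(model.config.vocab_size, model.config.hidden_size)
counts = torch.zeros(model.config.vocab_size)

with torch.no_grad():
    for text in corpus:
        enc = tokenizer(text, return_tensors="pt")
        hidden = model(**enc).last_hidden_state[0]      # (seq_len, hidden_size)
        ids = enc["input_ids"][0]
        sums.index_add_(0, ids, hidden)
        counts.index_add_(0, ids, torch.ones(ids.shape[0]))

seen = counts > 0
avg_contextual = sums[seen] / counts[seen].unsqueeze(1)
static = model.get_input_embeddings().weight[seen]
drift = (avg_contextual - static).norm(dim=1)           # L2 shift per token
top = drift.topk(min(20, drift.numel()))
seen_ids = torch.nonzero(seen).squeeze(1)
top_tokens = tokenizer.convert_ids_to_tokens(seen_ids[top.indices].tolist())
print(list(zip(top_tokens, top.values.tolist())))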
Move issue management to GitHub
#48 opened by speediedan
Create/enable a flag for non-DB mode (for collaboration) by refactoring data prep/training/inference and leveraging an additional cache
#47 opened by speediedan
ALBERT-xxlarge depth-2 training without dropout
#45 opened by speediedan
Fix missing logo
#44 opened by speediedan
Debug confusion matrix TensorBoard rendering / 'test_metrics' key prefix nesting issue in the hparam section of trainer_pt
#42 opened by speediedan
Apply label smoothing
#38 opened by speediedan
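A minimal sketch of label smoothing for a binary head, assuming BCE-with-logits training (the epsilon value and tensors are placeholders); for a multi-class head, recent PyTorch versions expose a label_smoothing argument directly on CrossEntropyLoss.

import torch
import torch.nn as nn

def smooth_binary_targets(targets: torch.Tensor, epsilon: float = 0.1) -> torch.Tensor:
    """Pull hard 0/1 labels toward 0.5 by epsilon so the model is penalized
    less for residual noise in the ground-truth labels."""
    return targets * (1.0 - epsilon) + 0.5 * epsilon

criterion = nn.BCEWithLogitsLoss()
logits = torch.randn(8, 1)                         # placeholder model outputs
hard_labels = torch.randint(0, 2, (8, 1)).float()  # placeholder 0/1 labels
loss = criterion(logits, smooth_binary_targets(hard_labels))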
Expand cache and reporting to differentiate between Twitter and non-Twitter accuracy, then filter on that
#34 opened by speediedan
Retrain with temporal test split
#35 opened by speediedan
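A small sketch of what a temporal split could look like, assuming a statement table with a stated_at timestamp column (placeholder schema): sort chronologically and hold out the most recent fraction, so the test set contains only statements made after everything in the training set.

import pandas as pd

# Placeholder statement table; real columns would come from deep_classiflie_db.
df = pd.DataFrame({
    "statement": ["a", "b", "c", "d", "e"],
    "stated_at": pd.to_datetime(
        ["2019-01-05", "2019-06-10", "2020-01-15", "2020-05-20", "2020-07-01"]),
    "label": [0, 1, 0, 1, 1],
})

# Chronological sort, then split at a cutoff index instead of sampling randomly.
df = df.sort_values("stated_at").reset_index(drop=True)
test_frac = 0.2
split_idx = int(len(df) * (1 - test_frac))
train_df, test_df = df.iloc[:split_idx], df.iloc[split_idx:]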
Combine tweet HTML and graphs into a single image
#31 opened by speediedan
Calculate similarity by minimizing a linear function of (a) L2 pairwise distance between sentence embeddings and (b) pairwise sigmoid output delta
#30 opened by speediedan
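A hedged sketch of the similarity score described above: a weighted linear combination of L2 embedding distance and the absolute delta between sigmoid outputs, minimized over candidate statements. The weights, dimensions, and tensors are placeholders.

import torch

def combined_distance(query_emb, query_prob, cand_embs, cand_probs,
                      alpha: float = 0.7, beta: float = 0.3):
    """Blend of (a) L2 distance between sentence embeddings and (b) the
    absolute delta between sigmoid outputs; the most 'similar' candidate
    is the one minimizing this score."""
    l2 = torch.cdist(query_emb.unsqueeze(0), cand_embs).squeeze(0)   # (N,)
    prob_delta = (cand_probs - query_prob).abs()                     # (N,)
    return alpha * l2 + beta * prob_delta

# Usage sketch with placeholder tensors.
query_emb, query_prob = torch.randn(768), torch.tensor(0.83)
cand_embs, cand_probs = torch.randn(1000, 768), torch.rand(1000)
scores = combined_distance(query_emb, query_prob, cand_embs, cand_probs)
most_similar_idx = scores.argmin().item()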