Pinned Repositories
cca
Canonical correlation analysis: routines and demos
elfcb
Empirical Likelihood for Contextual Bandits
fastapprox
Approximate and vectorized versions of common mathematical functions
hashpca
Scalable PCA via Hashing
ldlmd2016
Slides and other artifacts from http://letsdiscussnips2016.weebly.com/
linrepcb
SpannerIGW for linearly representable infinite action contextual bandits
memoryrl
Combining memory and RL
randembed
Randomized embeddings for extreme learning
smoothcb
Smoothed IGW for infinite action contextual bandits
xlst
eXtreme Learning Spectral Trees
pmineiro's Repositories
pmineiro/randembed
Randomized embeddings for extreme learning
pmineiro/elfcb
Empirical Likelihood for Contextual Bandits
pmineiro/fastapprox
Approximate and vectorized versions of common mathematical functions
pmineiro/hashpca
Scalable PCA via Hashing
pmineiro/cca
Canonical correlation analysis: routines and demos
pmineiro/xlst
eXtreme Learning Spectral Trees
pmineiro/ldlmd2016
Slides and other artifacts from http://letsdiscussnips2016.weebly.com/
pmineiro/linrepcb
SpannerIGW for linearly representable infinite action contextual bandits
pmineiro/memoryrl
Combining memory and RL
pmineiro/smoothcb
Smoothed IGW for infinite action contextual bandits
pmineiro/bearsmovie
Someplace to put my Xtranormal video
pmineiro/cb_bakeoff
Scripts for evaluation of contextual bandit algorithms
pmineiro/csrobust
Robust confidence sequences
pmineiro/vowpal_wabbit
John Langford's original release of Vowpal Wabbit -- a fast online learning algorithm
pmineiro/aums
Alternative Universe Show
pmineiro/batch_rl
Offline Reinforcement Learning (aka Batch Reinforcement Learning) on Atari 2600 games
pmineiro/CNTK
Computational Network Toolkit (CNTK)
pmineiro/coba
Contextual bandit benchmarking
pmineiro/dopamine
Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
pmineiro/DrQA
Reading Wikipedia to Answer Open-Domain Questions
pmineiro/estimators
Estimators to perform off-policy evaluation
pmineiro/grlcaffe
Caffe: a fast open framework for deep learning.
pmineiro/lampstuff
pmineiro/LLF-Bench
A benchmark for evaluating learning agents based on just language feedback
pmineiro/mwt-ds
Umbrella repository for projects related to the MWT Decision Service
pmineiro/mycaffe
Me messing around with Caffe
pmineiro/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
pmineiro/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
pmineiro/trajectory-transformer
Code for the paper "Offline Reinforcement Learning as One Big Sequence Modeling Problem"
pmineiro/ubuntu-ranking-dataset-creator
A script that creates train, valid and test datasets for the ranking task from Ubuntu corpus dialogs.