Pinned Repositories
anchor
Code for "High-Precision Model-Agnostic Explanations" paper
ExplanationRoles
Code for paper "When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data"
ExplanationSearch
Code for paper "Search Methods for Sufficient, Socially-Aligned Feature Importance Explanations with In-Distribution Counterfactuals"
interpretable-image
Code for "Interpretable Image Recognition with Hierarchical Prototypes"
InterpretableNLP-ACL2020
Code for "Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior?"
LAS-NL-Explanations
Code for paper "Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?"
LLM-belief-revision
mechanistic-interpretability
poetry-generation
Code for "Shall I Compare Thee to a Machine-Written Sonnet? An Algorithmic Approach to Sonnet Generation", available at https://arxiv.org/abs/1811.05067
SLAG-Belief-Updating
Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"
peterbhase's Repositories
peterbhase/InterpretableNLP-ACL2020
Code for "Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior?"
peterbhase/SLAG-Belief-Updating
Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"
peterbhase/LAS-NL-Explanations
Code for paper "Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?"
peterbhase/interpretable-image
Code for "Interpretable Image Recognition with Hierarchical Prototypes"
peterbhase/ExplanationSearch
Code for paper "Search Methods for Sufficient, Socially-Aligned Feature Importance Explanations with In-Distribution Counterfactuals"
peterbhase/ExplanationRoles
Code for paper "When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data"
peterbhase/poetry-generation
Code for "Shall I Compare Thee to a Machine-Written Sonnet? An Algorithmic Approach to Sonnet Generation", available at https://arxiv.org/abs/1811.05067
peterbhase/LLM-belief-revision
peterbhase/mechanistic-interpretability
peterbhase/anchor
Code for "High-Precision Model-Agnostic Explanations" paper
peterbhase/evolution-strategies-exploration
Contains an implementation of Tim Salimans et al., “Evolution Strategies as a Scalable Alternative to Reinforcement Learning”, arXiv, https://arxiv.org/pdf/1703.03864.pdf.
peterbhase/peterbhase.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
peterbhase/rome
Rank-One Model Editing for Locating and Editing Factual Knowledge in GPT
peterbhase/tennis_wta
WTA Tennis Rankings, Results, and Stats
peterbhase/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.