Pinned Repositories
auto_dataset_card
Wouldn't it be nice to generate parts of our dataset card automagically?
awesome-synthetic-datasets
awesome synthetic (text) datasets
blog
data-for-fine-tuning-llms
fastai4GLAMS
A study group for v4 of the fastai introduction to deep learning course with a focus on applications in GLAM settings
flyswot
Command Line Interface for running 🤗 Transformers Image Classification locally
haiku-dpo
Using open source LLMs to build synthetic datasets for direct preference optimization
huggingface-tldr
Experimental tl;dr summaries for datasets on the Hugging Face Hub!
LLM-pubmed-query-generation-evaluation
LLM PubMed Query Generation Evaluation
data-is-better-together
Let's build better datasets, together!
davanstrien's Repositories
davanstrien/davanstrien.github.com
Daniel van Strien's personal blog
davanstrien/uk-web-archive-open-data-wellcome-project
Experimenting with web archive data
davanstrien/2019-06-25-UCD
davanstrien/course-v3
The 3rd edition of course.fast.ai
davanstrien/disease-ner-moh
davanstrien/DMPonline-ucl
davanstrien/fastai2
Temporary home for fastai v2 while it's being developed
davanstrien/fastpages
An easy-to-use blogging platform, with enhanced support for Jupyter Notebooks.
davanstrien/image_slicer
Split images into tiles. Join the tiles back together.
davanstrien/Introduction-to-Digital-Scholarship-and-Open-Research
Workshop materials for a course introducing digital scholarship and open research
davanstrien/lc-shell
Library Carpentry: The UNIX Shell
davanstrien/lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks
Repository for code underlying the paper 'Assessing the Impact of OCR Quality on Downstream NLP Tasks'
davanstrien/nbdev
Create delightful Python projects using Jupyter Notebooks
davanstrien/nlpwithpytorchbook
NLP with PyTorch notes/notebooks
davanstrien/notebook
Jupyter Interactive Notebook
davanstrien/open-access-funding-policies
davanstrien/pandoc-letter
Pandoc template for writing letters in markdown
davanstrien/pandoc-templates
Templates for pandoc
davanstrien/pyrdm
PyRDM is a Python-based library for research data management (RDM). It facilitates the automated publication of scientific software and associated input and output data.
davanstrien/python-jumpstart-course-demos
Contains all the "handout" materials for my Python Jumpstart by Building 10 Apps course. This includes try it yourself and finished versions of the 10 apps.
davanstrien/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
davanstrien/PyTorchNLPBook
Code and data accompanying Natural Language Processing with PyTorch published by O'Reilly Media https://nlproc.info
davanstrien/readthedocs.org
source code to readthedocs.org
davanstrien/recipes
A place for all my recipes to live. Mostly vegan. Usually tasty. Creative Commons licensed.
davanstrien/roadmap
DCC/UC3 collaboration for a data management planning tool
davanstrien/study-group-orientation
Gitbook for onboarding lessons, materials, and activities
davanstrien/TidyTextMining-Python
Code snippets from "Text Mining with R: A Tidy Approach" translated into Python
davanstrien/toolz
A functional standard library for Python.
davanstrien/ucl-research-data-management
davanstrien/warcbase
Warcbase is an open-source platform for managing and analyzing web archives