Pinned Repositories
ai-msgbot
Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations.
BoulderAreaDetector
An app that uses a CNN to classify whether a satellite image shows an area that would be a good rock climbing spot or not. On Streamlit.
confectionary
a tool to quickly create sweet PDF files from text files :cupcake:
textsum
CLI & Python API to easily summarize text-based files with transformers
vid2cleantxt
Python API & command-line tool to easily transcribe speech-based video files into clean text
pszemraj's Repositories
pszemraj/vid2cleantxt
Python API & command-line tool to easily transcribe speech-based video files into clean text
pszemraj/textsum
CLI & Python API to easily summarize text-based files with transformers
pszemraj/ai-msgbot
Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations.
pszemraj/BoulderAreaDetector
An app that uses a CNN to classify whether a satellite image shows an area that would be a good rock climbing spot or not. On Streamlit.
pszemraj/confectionary
a tool to quickly create sweet PDF files from text files :cupcake:
pszemraj/lm-api
Efficiently query multiple prompts with ease: a command-line tool for batch querying large language models.
pszemraj/ml4hc-s22-project01
An investigation into tabular classification with deep NNs for ETHZ Machine Learning for Healthcare on the MIT-BIH arrhythmia dataset.
pszemraj/scrape-viz-jobs
A tool for scraping and clustering job postings from ch.indeed.com; visualization uses various clustering and dimensionality reduction techniques.
pszemraj/PruneMe
Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models
pszemraj/pubmed-text-classification
ETHZ Machine Learning for Healthcare Problem 2: classification of pubmed paper sentences or text into document sections.
pszemraj/rpunct-cpu
📝 An easy-to-use package to restore punctuation of text + CPU
pszemraj/Slack-Export-JSON-to-CSV
Convert Slack messages exported in their complicated JSON format to simple CSV format, by channel or entire exported workspace
pszemraj/SummComparer
compiles and parses the summarization gauntlet and results from various models into a dataset-like format
pszemraj/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
pszemraj/AutoGPTQ
An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.
pszemraj/autolabel
Label, clean and enrich text datasets with LLMs.
pszemraj/DailyDialogue-Parser
Parser for DailyDialogue Dataset, updated with some conventions and additional cleaning for text-generation
pszemraj/deepcluster
Custom PyTorch model (VGG-16 Auto-Encoder) and custom criterion (Local Aggregation) for image clustering. The repo contains elaborated creation of fungi image data using factory method.
pszemraj/fine-tune-fuyu
pszemraj/inbox_cleaner
A python script to help manage a Gmail inbox by filtering out promotional emails using GPT-3 or GPT-4.
pszemraj/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
pszemraj/lm-evaluation-harness
A framework for few-shot evaluation of language models.
pszemraj/megalodon
Reference implementation of Megalodon 7B model
pszemraj/mteb
MTEB: Massive Text Embedding Benchmark
pszemraj/nanoT5
Fast & Simple repository for pre-training and fine-tuning T5-style models
pszemraj/nGPT-pytorch
Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI
pszemraj/optimum
🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
pszemraj/samba-pytorch
Minimal implementation of Samba by Microsoft in PyTorch
pszemraj/T2I_CL
pszemraj/unlimiformer
Public repo for the preprint "Unlimiformer: Long-Range Transformers with Unlimited Length Input"