techandy42

🎓 CS @ U of Waterloo | 🤖 AI Student Researcher @ WAT.ai x hamming.ai | 🏆 4x Hackathon Winner | LLM Enjoyer

Waterloo Ontario, Canada

techandy42's Stars

QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by the Qwen team, Alibaba Cloud.
Language: Shell 8.6k 47 819542
Devinterview-io/llms-interview-questions
🟣 LLMs interview questions and answers to help you prepare for your next machine learning and data science interview in 2024.
152 2 020
booydar/babilong
BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.
Language: Jupyter Notebook 141 5 316
waterhorse1/ChessGPT
(NeurIPS 2023) ChessGPT - Bridging Policy Learning and Language Modeling
Language: Python 96 4 47
Arize-ai/LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy
Language: Python 95 2 01
infinigence/LVEval
Repository of LV-Eval Benchmark
Language: Python 44 2 34
HammingHQ/bug-in-the-code-stack
A new benchmark for measuring LLMs' capability to detect bugs in large codebases.
Language: Jupyter Notebook 232
techandy42/FinancialBERT
A stock price prediction model built using BERT and a regression model trained on textual financial news data.
Language: Jupyter Notebook 10 1 01
HammingHQ/hamming-examples
Various examples on how to use Hamming for evals + observability
Language: Python 6 2 1
nonsequitoria/simplekit
SimpleKit
Language: TypeScript 3 1 04
techandy42/awesome-llm-metrics
An open-source framework that makes evaluating LLMs and prompt engineering 10x easier!
Language: Python 3 1 00
techandy42/bug_in_the_code_stack
A new benchmark for measuring LLMs' capability to detect bugs in large codebases.
Language: Jupyter Notebook 3 1 03
techandy42/CrafterGPT
Leveraging Language Models to Play Procedurally-Generated Survival Games.
Language: Jupyter Notebook 3 1 00
techandy42/ExchangeAgent
Training a stock exchange agent with Reinforcement Learning algorithms and Decision Transformer.
Language: Jupyter Notebook 3 1 01
techandy42/GreenTechGuardians
A Circular Economy business idea evaluator tool built using Gen-AI.
Language: Jupyter Notebook 3 1 03
genai-genesis-2024/web-agent
Language: Python 2 0 00
bing1100/hamming_m3
m3 dataset with hamming
Language: TypeScript 1 1 00
paulpark6/WildFire
Language: Jupyter Notebook 1 2 01
SYS-NG/Goose_Guru_HTN2024
Language: TypeScript 1
techandy42/babilong
BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.
Language: Jupyter Notebook 1 0 0
techandy42/bug_in_the_code_stack_v2
Can LLMs find bugs that compilers can't?: A benchmark for measuring LLMs' capabilities in debugging large source code.
Language: Jupyter Notebook 1 1 0
techandy42/Codegen_Challenge_Submission
A Python import visualization program.
Language: Jupyter Notebook 1
techandy42/crafter
Benchmarking the Spectrum of Agent Capabilities
Language: Python 1 0 0
techandy42/debugger_llm
Open-source datasets & models for LLM Judges to find and describe bugs in LLM-generated code.
Language: Jupyter Notebook 1
techandy42/eccc-hail-forecasting-project
Open-source ECCC repository for notebooks and documentation for the Hail Forecasting project by Hokyung (Andy) Lee.
Language: Jupyter Notebook 1 1 0
techandy42/eccc-webcam-project
Open-source ECCC repository for notebooks and documentation for the Webcam project by Hokyung (Andy) Lee.
Language: Jupyter Notebook 1 1 0
techandy42/LVEval
Repository of LV-Eval Benchmark
Language: Jupyter Notebook 1 0 0
techandy42/racecar_gym
A gym environment for a miniature racecar using the pybullet physics engine.
Language: Python 1 0 0
techandy42/RagTagTeam
Startup co-founder matching platform built using Cohere for the WAT.AI RAG Challenge hackathon.
Language: Jupyter Notebook 1 1 0
techandy42/rank_llm
Repository for prompt-decoding using LLMs (GPT3.5, GPT4, and Vicuna)
Language: Python 1 0 0
