Pinned Repositories
3d_masks
3D masks using three.js and the TensorFlow.js facemesh model
AI-QMIX
Code for "AI-QMIX: Attention and Imagination for Dynamic Multi-Agent Reinforcement Learning"
awesome-llm-rl-agents
List of sources related to LLMs, transformers, and reinforcement learning agents
awesome-ml-cybersecurity
reasoning-lib
researchim
rllib.js
Reinforcement learning library in JavaScript.
tfjs-gans
A collection of GANs built with tfjs and THREE.js
VK_NEXT_CHAT
3D Web chat, using Three.js + Peer.js + Node.js
webrtc_chat_3d_engine
An engine for creating 3D web chats, built with WebRTC (Peer.js) and WebGL (Three.js).
tokarev-i-v's Repositories
tokarev-i-v/researchim
tokarev-i-v/awesome-ml-cybersecurity
tokarev-i-v/algo
tokarev-i-v/cultural-accumulation
tokarev-i-v/dart-math
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
tokarev-i-v/DeepCubeAI
Learning Discrete World Models for Heuristic Search
tokarev-i-v/grokfast
Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients"
tokarev-i-v/GrokkedTransformer
Code for the paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'
tokarev-i-v/Grounding_LLMs_with_online_RL
We perform functional grounding of LLMs' knowledge in BabyAI-Text
tokarev-i-v/LAPO
Code for the ICLR 2024 spotlight paper: "Learning to Act without Actions" (introducing Latent Action Policies)
tokarev-i-v/LightZero
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios
tokarev-i-v/LLaMA-O1
Large Reasoning Models
tokarev-i-v/llm.c
LLM training in simple, raw C/CUDA
tokarev-i-v/loss-of-plasticity
Demonstrations of Loss of Plasticity and Implementation of Continual Backpropagation
tokarev-i-v/MacroHFT
tokarev-i-v/MathBlackBox
tokarev-i-v/mctslib
tokarev-i-v/models-at-home
tokarev-i-v/open-oasis
Inference script for Oasis 500M
tokarev-i-v/openr
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
tokarev-i-v/quiet-star
Code for Quiet-STaR
tokarev-i-v/RethinkMCTS
tokarev-i-v/rStar
tokarev-i-v/ScaleQuest
We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.
tokarev-i-v/search-agents
Code for the paper 🌳 Tree Search for Language Model Agents
tokarev-i-v/SelfCorrectionLanguageModelTraining
tokarev-i-v/Super_MARIO
tokarev-i-v/TheArtofHPC_pdfs
All pdfs of Victor Eijkhout's Art of HPC books and courses
tokarev-i-v/torax
TORAX: Tokamak transport simulation in JAX
tokarev-i-v/uvadlc_notebooks
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2022/Spring 2022