holarissun's Stars
floodsung/LLM-with-RL-papers
A collection of LLM with RL papers
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
wordweb/Tiger-qq-bot
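A local knowledge-base Q&A bot for QQ groups, built on langchain-chatglm-and-tigerbot and mirai. Knowledge bases can be uploaded by posting the files directly to the QQ group, and can be switched on or off (or deleted) with commands. The result is a portable, QQ-based local knowledge-base Q&A bot.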
wordweb/langchain-ChatGLM-and-TigerBot
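A local knowledge-base Q&A application, modified from langchain-ChatGLM, that can load TigerBot models. The goal is a knowledge-base Q&A solution that supports Chinese-language scenarios and open-source models well and can run offline.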
liaokongVFX/LangChain-Chinese-Getting-Started-Guide
A Chinese-language getting-started tutorial for LangChain
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
Alvin9999/new-pac
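Firewall circumvention: open internet access, free circumvention tools, YouTube, fanqiang, software, VPN, one-click circumvention browsers, one-click VPS scripts/tutorials for setting up a circumvention server, free shadowsocks/ss/ssr/v2ray/goflyway accounts/nodes, circumvention proxies, circumvention for PC, mobile, iOS, Android, Windows, Mac, Linux, and routers, YouTube video downloads, shared US-region Apple ID accounts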
vanderschaarlab/clairvoyance
Clairvoyance: a Unified, End-to-End AutoML Pipeline for Medical Time Series
jxx123/simglucose
A Type-1 Diabetes simulator implemented in Python for Reinforcement Learning purposes
dickreuter/neuron_poker
Texas Hold'em OpenAI Gym poker environment with reinforcement learning based on keras-rl. Includes virtual rendering and Monte Carlo equity calculation.
denisyarats/exorl
ExORL: Exploratory Data for Offline Reinforcement Learning
lizhuo-1994/NECSA
Official implementation of Neural Episodic Control with State Abstraction
rll-research/url_benchmark
cloneofsimo/lora
Using Low-rank adaptation to quickly fine-tune diffusion models.
tinkoff-ai/CORL
High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC
holarissun/DOMIAS
vanderschaarlab/synthetic-data-lab
A repository containing the materials required to complete the "AAAI Lab for Innovative Uses of Synthetic Data". This includes tutorials on how to use the library "Synthcity" for improving the fairness and privacy of a dataset as well as for augmenting a small dataset using some other similar datasets.
Trinkle23897/tuixue.online-visa
https://tuixue.online/visa/ A website displaying real-time U.S. visa appointment status, backed by a crawler that finds the earliest available appointment time at each U.S. visa office
alihanhyk/invconban
Inverse Contextual Bandits: Learning How Behavior Evolves over Time
HITFRobot/happy-spiders
🔧 🔩 🔨 A curated collection of web-crawler resources: tools, simulated-login techniques, proxy IPs, scrapy template code, and more.
AminHP/gym-mtsim
A general-purpose, flexible, and easy-to-use simulator alongside an OpenAI Gym trading environment for the MetaTrader 5 trading platform (Approved by OpenAI Gym)
sail-sg/envpool
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
banditml/offline-policy-evaluation
Implementations and examples of common offline policy evaluation methods in Python.
holarissun/RewardShifting
Code for NeurIPS 2022 paper Exploiting Reward Shifting in Value-Based Deep RL
google-research/deep_ope
clvoloshin/COBS
OPE tools based on the paper "Empirical Study of Off-Policy Policy Evaluation".
st-tech/zr-obp
Open Bandit Pipeline: a python library for bandit algorithms and off-policy evaluation
holarissun/MOPA
academicpages/academicpages.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
metadriverse/metadrive
MetaDrive: Open-source driving simulator