reward
There are 115 repositories under the reward topic.
tigerneil/awesome-deep-rl
For deep RL and the future of AI.
aleju/mario-ai
Playing Mario with Deep Reinforcement Learning
yanm1ng/hexo-theme-vexo
🍟 Vexo is a Hexo theme inspired by Vue's official website.
henry-fun/hanshan-lottery
An amazing lottery app created for the world
drallgood/jpasskit
jPasskit is a Java™ implementation of the Apple™ PassKit Web Service.
ecency/ecency-mobile
Ecency Mobile - reimagined social blogging, contribute and get rewarded (for Android and iOS)
Prem-ium/BingRewards
🤖 Automate Bing Searches 🔍, Quizzes 🧪, Polls 📝, & more for Bing Rewards. 💸
Miraclemarvel55/ChatGLM-RLHF
Apply RLHF directly to ChatGLM to raise or lower the probability of target outputs | Modify ChatGLM output with only RLHF
alison-carrera/mabalgs
👤 Multi-Armed Bandit Algorithms Library (MAB) 👮
WFCD/warframe-drop-data
💰 Warframe Drop Data in an easier to parse format.
NiuTrans/Vision-LLM-Alignment
This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vision models.
bulwark-crypto/Bulwark
The primary development repository for the Bulwark project
iGoodie/TwitchSpawn
👾 TwitchSpawn is a Minecraft mod, which is designed for Twitch streamers using 3rd party streaming tools! (comes with its own language!)
powerpool-finance/powerindex
📈📉 Power Index is an ecosystem product of PowerPool. Its main feature is the ability to create special pools with unique governance and design.
ihoey/Playing-reward
A great-looking tipping (reward) feature~ Demo link
khinthandarkyaw98/Optimizing-UAV-trajectory-for-maximum-data-rate-via-Q-Learning
During our participation in the Internship Exchange Program, my friend and I collaborated under the guidance of our esteemed supervisor from NTHU.
anarkrypto/P2PoW
A P2P Delegated Proof of Work solution for Nano cryptocurrency
ssbuild/chatglm_rlhf
chatglm_rlhf_finetuning
ssbuild/llm_rlhf
Reinforcement learning training for LLMs such as GPT-2, LLaMA, BLOOM, and others
dp770/aws_deepracer_worksheet
Worksheet and Utilities for AWS DeepRacer, one of the most exciting ways to build strong reinforcement learning skills through a hands-on approach. This repository offers: 1) A functionally rich and flexible reward function 2) Utilities with Jupyter notebooks for racing line calculation and track visualisation 3) Scripts to parse RoboMaker training and evaluation logs into a CSV file 4) A sample Excel file for analysing car behaviour and for designing and planning new reward curves 5) Coordinates and images of AWS DeepRacer tracks.
Miraclemarvel55/LLaMA-MOSS-RLHF-LoRA
Training LLaMA or MOSS with RLHF, optionally with LoRA
netcoinfoundation/netcoin
Netcoin - Digital currency with personal interest rate and fair weight stake mining
piconnectdev/wepi
🐀 Building a federated alternative to Reddit in Rust
2008Choco/DragonEggDrop
Spigot plugin. Overhaul the dragons summoned in The End. Configurable templates, loot and particles. (Modern fork of PixelStix's DragonEggDrop)
aaksham/frozenlake
Value and policy iteration for OpenAI's FrozenLake environment
HANDZCZ/genshin-stats
Repository that hosts code to show my Genshin stats. Claims daily rewards and active primo codes.
winston1017/Platformer
2D platformer game for Android with an automatic random level generator! Currently set at 10 levels, but with 2 more lines of code you will have another.
Arcadier/Discount-Coupon-Generator
🎁 Create and reward your consumers with discount coupons to boost sales and build a loyal user base.
corbosiny/AIVO-StreetFigherReinforcementLearning
Creating an environment to quickly train a variety of Deep Reinforcement Learning algorithms on Street Fighter 2 using tournaments between learning agents
denvash/jesta-android-app
Jesta 💎 is a social app where people can do favors for each other, in exchange for rewards. 🤝
mjwpl/LoopLords
Loop Lords is an application designed to help users manage their recurring tasks efficiently. It aims to remind users of cyclical tasks before the deadline, reward them for completing tasks within the cycle, prioritize tasks based on their last completion date (e.g., diet), and help users break habits (e.g., computer gaming addiction).
my-cloud/ruthenium
Golang implementation of the Ruthenium protocol
NPW-Project/NewPowerCoin
New Power Coin - A new masternode-enabled cryptocurrency that drives online traffic into a new era of decentralization.
CarsonScott/Dual-Process-Reinforcement
An intelligent agent that adaptively changes its thought processes to maximize cumulative reward
citizenweb3/staking
Non custodial staking service for web3