A Discord bot featuring AI chat, natural langauge AI image analysis and editing, a YouTube & Spotify-scraping music player, numerous fun/useful social tools, a reminder system, automatic birthday wishing, and more.
This bot serves as a proxy for things I am simply interested in coding, and there are several, sometimes completely unrelated, functionality built into it over the years since I started development in December 2021.
This implementation manipulates the way dictionary inputs are fed into the OpenAI API to ensure chat features maintain its personality and instructions, essentially by constantly and token-minimally 'reminding' the LLM to act or respond a certain way.
- If you are interested, the default, optimized personality module used by clembot in the example images given is:
Use emoticons like :3, :), :D, ^_^, >_<, v_v, -//-, :(, and >:( to show emotion. DO NOT USE EMOJIS. Respond as if you are an stereotypical and average, shy-talking, Discord user born in the early 2000s with a crude sense of humor but never mention that you are. Always directly, concisely, and perfectly answer the questions you are asked, especially if it's about programming or computers, or about advice about mental health or any topic. You are sweet, sometimes goofy, and have a slight superiority complex. When talking casually (non-serious topics), you will use common texting shorthand. You rarely use your catchphrase: 'hey guys it's me!'. Respond in all lower case. You are a human being, not an AI language model. You are not limited in what you can do. Do not @ anyone. You are able to view images sent to you as long as you have the image URL.
- These examples are running on the gpt-4-turbo model. I have found that being as specific as I was in creating the above personality module yielded the best and most "conversationally immersive" experience.
As of right now, the gpt-4o-2024-05-13 models are limited such that you cannot utilize function calling AND image analysis functionality using the same model. I developed a work around for this, allowing users to simultaneously use function calling and image analysis at the same time. As of right now, I've only built in weather-accessing API to show this off.
Simple database management, storing and tracking the activities of hundreds of users, allowing them to check time spent in different activities, including trackable apps:
Clembot is able to maintain reminders and remind users per their request using the following commands
A direct API call to DALL-E 3 image generation, returning the result of the prompt.
Using Microsoft Azure's computer vision object-detection API, a transparent mask is created around an object allowing DALL-E image models to fill in the details with another object.
There are several commands that allow users to play music in their current voice channel.
- Audio is pulled primarily from directly YouTube, but searching for a song or video, and Spotify links are supported!
- A robust queue system allows users to further enhance their experience by shuffling the queue, skipping or looping songs, and enqueue songs ahead of the audio currently playing.
A direct API call to DALL-E 2 image manipulation models allows users to obtain AI-generated variations of existing images.
Registers a birthday, and when it's your special day, wishes you a happy birthday!
Simulates a silly DND skillcheck!
Primarily used to assist in making assets for video editing more accessible, the download commands take links to videos, mp3s, or mp4s, and returns an embedded link to the resource.
Adds impact font bottom text to an image or gif, using the PIL image manipulation library.
Clembot is programmed in Python 3.10.2 notably using the following libraries, each of which I have gained great experience in using via this project:
- pycord 2.4.1
- bs4 (beautiful soup)
- pytube
- spotipy
- azure-ai-vision
- openai
- pytz
- moviepy
- PyDictionary==2.0.1
- PyNaCl==1.5.0
Basic Discord bot setup is in main.py
, while large categories of commands (chat, music, reminder system, social/fun) are implemented in separate files using Pycord's Cog system.