Group 19 Backend Documentation!

automatic evaluation (milestone three work) -

Every file outside the DiscordBot folder is code for fine-tuning our custom Gemini model. The jsonl files are our datasets after preprocessing, and most of the Python files are the scripts that process the raw CSV files we collected into that dataset. They're a bit messy because we wanted certain examples (the ones we pulled straight from academic papers) to be guaranteed a place in the test set. The finetune.py file was an attempt at fine-tuning through the Vertex AI API, but in the end we used Vertex's AutoML feature to train straight from the model playground, which was much easier.
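A minimal sketch of the kind of preprocessing described above, not a copy of our actual scripts. It assumes a CSV with hypothetical `text`, `label`, and `source` columns, and treats rows whose `source` is `paper` as the ones that must land in the test set. The column names, the `paper` marker, and the 20% split are all illustrative assumptions.

```python
import csv
import io
import json
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Split rows into (train, test), forcing pinned rows into the test set."""
    pinned = [r for r in rows if r["source"] == "paper"]  # hypothetical marker
    rest = [r for r in rows if r["source"] != "paper"]
    random.Random(seed).shuffle(rest)
    # Top up the test set with random rows until it reaches the target size.
    target = max(len(pinned), round(len(rows) * test_fraction))
    extra = target - len(pinned)
    return rest[extra:], pinned + rest[:extra]


def to_jsonl(rows):
    """Serialize rows as JSON Lines: one JSON object per line."""
    return "".join(
        json.dumps({"text": r["text"], "label": r["label"]}) + "\n" for r in rows
    )


# Tiny in-memory CSV standing in for a raw data file.
raw_csv = (
    "text,label,source\n"
    "hello,0,scraped\n"
    "bad thing,1,paper\n"
    "hi there,0,scraped\n"
    "worse,1,scraped\n"
    "fine,0,scraped\n"
)
rows = list(csv.DictReader(io.StringIO(raw_csv)))
train, test = split_rows(rows)
```

Pinning rows before shuffling keeps the hand-picked examples out of the training data regardless of the random seed.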

In DiscordBot, gemini_testing.py is the script we used to run base Gemini on our test set and collect its results, which we then evaluated by hand. We tested the fine-tuned Bison model the same way, except that its API was frustrating to use, so we ended up running a script similar to gemini_testing.py from the cloud terminal in GCP instead ... so, that was super fun. The structure of bot.py for the automated detection is fairly straightforward: send every message to Perspective and forward it to the mod channel based on the severity of the results, then send every message to base Gemini (not the fine-tuned model, since performance was roughly equal and base Gemini was cheaper) and forward it to the mod channel based on that result.
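The severity-based routing can be sketched as a pure function, shown here under stated assumptions rather than as the logic in bot.py. It assumes the per-attribute scores have already been extracted into a dict of values between 0 and 1; the function names and the 0.9 / 0.7 thresholds are hypothetical.

```python
def triage(scores, high=0.9, low=0.7):
    """Map per-attribute scores (0..1) to a moderation priority."""
    if not scores:
        return ("ignore", None)
    worst_attr = max(scores, key=scores.get)
    worst = scores[worst_attr]
    if worst >= high:
        return ("urgent", worst_attr)
    if worst >= low:
        return ("review", worst_attr)
    return ("ignore", None)


def mod_message(author, content, scores):
    """Build the text to post in the mod channel, or None if nothing is flagged."""
    priority, attr = triage(scores)
    if priority == "ignore":
        return None
    return f'[{priority}] {author}: "{content}" (highest score: {attr}={scores[attr]:.2f})'
```

Keeping the decision separate from the Discord and API calls makes the thresholds easy to tune and test without running the bot.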

user reporting flow (milestone two work) -

This is basically all from milestone two, which is already graded, but we have since added some more detailed extensions to the flow in the same style as before.

CS 152 - Trust and Safety Engineering

Discord Bot Framework Code

This is the base framework for students to complete Milestone 2 of the CS 152 final project. Please follow the instructions to fork this repository into your repository and make all of your additions there.

Discord Bot Setup Guide

For this milestone, your group will be making your very own Discord bot. Discord bots are implemented in Python (or Javascript) - don’t stress if you haven’t written Python before! It’s a pretty readable language, so you should be able to pick it up as you go, and the TAs are always here to help.

If you’re not familiar with Discord, that’s okay! Check out this short video which overviews Discord’s features and quirks.

Joining your group channels

First, every member of the team should join the Discord server using this invite link:

https://discord.gg/K4XmF7Yr

Discord can be used in your web browser, although most people prefer the thick client apps.

For the next two milestones, you and your group will have two channels to test and develop your bot in:

group-# and group-#-mod

where # is your group’s number. We will give you and your bot a special role such that only you and the staff can see those channels; that way, every group will have its own small workspace.

To get the role for your group, click on the TA Bot user to bring up this window.

Type in: .join # where # is replaced by your group number.

If all goes according to plan, you should receive a message back saying that you have been given a role corresponding to your group number and you should see a new role on your user in the server:

Additionally, you should be able to see two new channels under one of the “Group Channels” categories:

If you accidentally join the wrong group, just message the TA Bot .leave # to have the role removed and leave those channels.

Please let Anthony Mensah (admensah@stanford.edu or admensah on Discord) know if something goes awry in this process!

[One student per group] Setting up your bot

Note: only ONE student per group should follow the rest of these steps.

Download files

Fork and clone this GitHub repository. For instructions on how to fork a GitHub repo, see this article. For your group to be able to collaborate effectively on this project, we recommend you create a shared GitHub repository; when you do, make sure you use the .gitignore file included in the starter code so that you don’t accidentally upload your tokens to GitHub. Our GitHub repository already lists tokens.json in its .gitignore file, so when you clone your project from there, you will have to create your own tokens.json file in the same folder as your bot.py file. The tokens.json file should look like this, with “your key here” replaced by your key. The sections below explain how to obtain Discord keys.

{
	"discord": "your key here"
}
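For reference, reading this file from Python might look like the sketch below. The helper name `load_discord_token` and its error messages are illustrative assumptions, not necessarily what the starter bot.py does.

```python
import json
import os


def load_discord_token(path="tokens.json"):
    """Read the Discord token from tokens.json, failing loudly if it is missing."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{path} not found; create it next to bot.py")
    with open(path) as f:
        tokens = json.load(f)
    token = tokens.get("discord", "")
    # Catch the common mistake of leaving the placeholder in place.
    if not token or token == "your key here":
        raise ValueError(f'{path} has no real "discord" token yet')
    return token
```

Because the path is relative, this only works when the script is run from the folder that contains tokens.json.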

Making the bot

The first thing you’ll want to do is make the bot. To do that, log in to https://discord.com/developers and click “New Application” in the top right corner.

Name your application Group # Bot, where # is replaced with your group number. So, for instance, Group 0 would name their bot like so:

It is very important that you name your bot exactly following this scheme; some parts of the bot’s code rely on this format.
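To see why the exact name matters, here is a sketch of how code could derive the group's channel names from the bot's name. The function `group_channels` and the regular expression are assumptions for illustration; the real logic in bot.py may differ.

```python
import re


def group_channels(bot_name):
    """Derive the group's channel names from a bot named "Group # Bot"."""
    match = re.search(r"[gG]roup (\d+) [bB]ot", bot_name)
    if match is None:
        raise ValueError('Bot name should follow the format "Group # Bot"')
    num = match.group(1)
    return f"group-{num}", f"group-{num}-mod"


main_channel, mod_channel = group_channels("Group 0 Bot")
```

A name such as "Group0Bot" or "Team 0 Bot" would fail the match, and the bot would have no way of finding its channels.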

Next, you’ll want to click on the tab labeled “Bot” under “Settings.”
Click “Copy” to copy the bot’s token. If you don’t see “Copy”, hit “Reset Token” and copy the token that appears (make sure you’re the first team member to go through these steps!)
Open tokens.json and paste the token between the quotes on the line labeled “discord”.
Scroll down to a region called “Privileged Gateway Intents”
Tick the options for “Presence Intent”, “Server Members Intent”, and “Message Content Intent”, and save your changes. See the image for what it should look like.

An aside: It’s unsafe to embed API keys in your code directly. If you put that code on GitHub, then anyone could find and use that key! (GitHub actually tries to detect code like this and forbids programmers from uploading it.) That’s why we’re storing them in a separate file which can be ignored by version control software.

Next, we’ll add the bot to the 152 Discord server! You’ll need to generate a link that the teaching team can use to invite your bot.

Click on the tab labeled “OAuth2” under “Settings”
Click the tab labeled “URL Generator” under “OAuth2”.
Check the box labeled “bot”. Once you do that, another area with a bunch of options should appear lower down on the page.
Check these permissions, then copy the link that’s generated.

Send that link to any of the TAs via Discord (or by email) - they will use it to add your bot to the server. Once they do, your bot will appear in the #general channel and will be a part of the server!

Note that these permissions are just a starting point for your bot. We think they’ll cover most cases, but you may run into cases where you want to be able to do more. If you do, you’re welcome to send updated links to the teaching team to re-invite your bot with new permissions.

Setting up the starter code

First things first, the starter code is written in Python. You’ll want to make sure that you have Python 3 installed on your machine; if you don’t, follow these instructions to install PyCharm, the Stanford-recommended Python editor. Alternatively, you can use a text editor of your choice.

Once you’ve done that, open a terminal in the same folder as your bot.py file. (If you haven’t used your terminal before, check out this guide!)

You’ll need to install some libraries if you don’t have them already, namely:

# python3 -m pip install requests
# python3 -m pip install discord.py

[Optional] Setting up your own server

If you want to test out additional permissions/channels/features without having to wait for the TAs to make changes for you, you are welcome to create your own Discord server and invite your bot there instead! The starter code should support having the bot on multiple servers at once. If you do make your server, make sure to add a group-# and group-#-mod channel, as the bot’s code relies on having those channels for it to work properly. Just know that you’ll eventually need to move back into the 152 server.

Guide To The Starter Code

Next up, let’s take a look at what bot.py already does. To do this, run bot.py and leave it running in your terminal. Next, go into your team’s private group-# channel and try typing any message. You should see something like this pop up in the group-#-mod channel:

The default behavior of the bot is that any time it sees a message from a user, it sends that message to the moderator channel with no possible actions. This is not the final behavior you’ll want for your bot - you should update this to match your report flow. However, the infrastructure is there for your bot to automatically flag messages and (potentially) moderate them somehow.

Next up, click on your app in the right sidebar under “Online” to begin direct messaging it (or click on its name). First of all, try sending “help”. You should see a response like this (but with your group number instead of Group 0):

Try following its instructions from there by reporting a message from one of the channels to get a sense for the reporting flow that’s already built out for you. (Make sure to only report messages from channels that the bot is also in.)

If you look through the starter code, you’ll see the beginnings of the reporting flow that are already there. It will be up to you to build that out in whatever way your group decides is best. You’re welcome to edit any part of the starter code you’d like if you want to change what’s already there - we encourage it! This is just meant to be a starting point that you can pattern match off of.
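One way to think about a reporting flow is as a small state machine driven by the user's direct messages. The sketch below is a simplified illustration under assumed names (`State`, `Report`, `handle`), not the starter code itself. It relies on Discord message links ending in three numeric IDs (server, channel, message).

```python
import re
from enum import Enum, auto


class State(Enum):
    START = auto()
    AWAITING_LINK = auto()
    COMPLETE = auto()


class Report:
    """Tiny DM-driven report flow: start -> message link -> done."""

    LINK = re.compile(r"/(\d+)/(\d+)/(\d+)")  # server / channel / message IDs

    def __init__(self):
        self.state = State.START
        self.target = None

    def handle(self, text):
        """Advance the flow on one user message and return the bot's reply."""
        if text.strip().lower() == "cancel":
            self.state = State.COMPLETE
            return "Report cancelled."
        if self.state is State.START:
            self.state = State.AWAITING_LINK
            return "Please paste the link to the message you want to report."
        if self.state is State.AWAITING_LINK:
            m = self.LINK.search(text)
            if m is None:
                return "I couldn't read that link. Please try again or say `cancel`."
            self.target = tuple(int(g) for g in m.groups())
            self.state = State.COMPLETE
            return "Thanks, the moderators have been notified."
        return "This report is already complete."
```

Extending the flow then means adding states (for example, one per follow-up question) and a branch in `handle` for each.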

If you’re not familiar with Python and asynchronous programming, please come to a section for an introduction. The TAs are happy to walk you through the starter code and explain anything that’s unclear.

Troubleshooting

`Exception: tokens.json not found!`

If you’re seeing this error, it probably means that your terminal is not open in the right folder. Make sure that it is open inside the folder that contains bot.py and tokens.json. You can check this by typing in ls and verifying that the output looks something like this:

	# ls
	bot.py 	tokens.json

`SSL: CERTIFICATE_VERIFY_FAILED error`

Discord has a slight incompatibility with Python3 on Mac. To solve this, navigate to your /Applications/Python 3.6/ folder and double click the Install Certificates.command. Try running the bot again; it should be able to connect now.

If you’re still having trouble, try running a different version of Python (i.e. use the command python3.7 or python3.8) instead. If that doesn’t work, come to section and we’ll be happy to help!

`intents has no attribute message_content error`

This is an issue with the version of Discord API that is installed. Try the following steps:

Run python3 -m pip install --upgrade discord.py in the terminal, from the folder in the project that contains this file.
If that does not work, try changing the line in bot.py that says intents.message_content = True to intents.messages = True.
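The second workaround can also be written so that it adapts to whichever library version is installed. This is a sketch with a hypothetical helper name, `enable_message_reading`; it only assumes that the intents object either exposes a `message_content` attribute or does not.

```python
def enable_message_reading(intents):
    """Turn on message_content where it exists, else fall back to messages."""
    if hasattr(intents, "message_content"):
        intents.message_content = True
        return "message_content"
    # Older library versions have no message_content flag at all.
    intents.messages = True
    return "messages"


# In bot.py this might be called on the intents object before creating the
# client, for example: enable_message_reading(discord.Intents.default())
```

Upgrading the library remains the better fix, since the fallback only hides the version mismatch.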

Resources

Below are some resources we think might be useful to you for this part of the milestone.

Here is the documentation for discord.py, a Python package for writing Discord bots. It’s very thorough and fairly readable; this plus Google (in addition to the TAs) should be able to answer all of your functionality questions!

Discord bots frequently use emoji reactions as a quick way to offer users a few choices - this is especially convenient in a setting like moderation when mods may have to make potentially many consecutive choices. Check out on_raw_reaction_add() for documentation about how to do this with your bot. You also might want to look into on_raw_message_edit() to notice users editing old messages.
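As an illustration of reaction-driven moderation, the lookup below maps an emoji to an action. The emoji choices, action names, and the `action_for` helper are hypothetical. Inside on_raw_reaction_add(payload), the emoji would typically come from payload.emoji.name and the reacting user from payload.user_id.

```python
# Hypothetical emoji-to-action table; pick whatever reactions suit your flow.
ACTIONS = {
    "✅": "dismiss report",
    "❌": "delete message",
    "🔨": "ban user",
}


def action_for(emoji_name, reactor_is_bot=False):
    """Return the moderation action for a reaction, or None to ignore it."""
    if reactor_is_bot:
        return None  # skip reactions the bot added itself as menu options
    return ACTIONS.get(emoji_name)
```

Ignoring the bot's own reactions matters when the bot pre-adds the emoji to a message so that mods can click them.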

Discord offers “embeds” as a way of getting a little more control over message formatting. Read more about them in this article or in the official documentation.

unidecode and uni2ascii-janin are two packages that can help with translating Unicode characters to their ASCII equivalents.
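If you would rather not add a dependency, a rougher version of the same idea is possible with the standard library alone. Note that this is a swapped-in technique (Unicode NFKD normalization via `unicodedata`), not what the packages above do internally, and the helper name `to_ascii` is made up for this sketch.

```python
import unicodedata


def to_ascii(text):
    """Strip accents and drop characters with no ASCII decomposition."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


folded = to_ascii("café")  # "cafe"
```

This handles accents and fullwidth letters, but characters with no ASCII decomposition (for example, look-alike letters from other alphabets) are silently dropped rather than transliterated, which is where the dedicated packages help.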

devansoliman/cs152bots-spr24-group19