RTutor

Chat with your data via AI. https://RTutor.ai

RTutor.ai - Talk to your data via AI

Hosted at RTutor.ai. Contact Steven Ge on Twitter

RTutor is an AI-based app that can quickly generate and test R code. Powered by API calls to OpenAI's Davinci (ChatGPT's sibling), RTutor translates natural language into R scripts, which are then executed within the Shiny platform. It can also generate an R Markdown source file and an HTML report.
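
RTutor's internals are not shown in this README. As a rough illustration only, the sketch below shows one way base R can run code that arrives as text, which is the general idea behind executing a generated script. The `generated_code` string and the `sandbox` environment are hypothetical stand-ins, not actual RTutor output or RTutor code.

```r
# Hypothetical stand-in for R code returned by a language model.
generated_code <- "x <- c(2, 4, 6)\nmean(x)"

# Evaluate in a separate environment so the script's variables
# do not overwrite objects in the global workspace.
sandbox <- new.env()

# parse() turns the text into expressions; eval() runs them in order
# and returns the value of the last one.
result <- eval(parse(text = generated_code), envir = sandbox)
print(result)
```

Evaluating in a dedicated environment is a design choice for this sketch: it keeps variables created by the script (here `x`) apart from your own objects.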

Installation

This repository is updated frequently, sometimes a few times a day. We suggest that you reinstall the package every time before using it, so that you always have the most recent version.

  1. Update R and RStudio to their most recent versions.
  2. Install the RTutor package:
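# Install the remotes package if it is not already available
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
library(remotes)
# Voice input package heyshiny
install_github("jcrodriguez1989/heyshiny", dependencies = TRUE)
# RTutor itself
install_github("gexijin/RTutor")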
  3. Install other R packages. If you want to use additional R packages for analyzing your data, you should install them on your computer too.
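
The last step can be scripted. The following is a minimal sketch, assuming `tidyverse` and `corrplot` are the extra packages you need (they are the ones used in the examples further down). The helper `missing_packages()` is a hypothetical name introduced here, not part of RTutor.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Packages used by the examples in this README; adjust to your own needs.
to_install <- missing_packages(c("tidyverse", "corrplot"))

# Install only in an interactive session, and only what is missing.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Checking first avoids reinstalling packages that are already present, which matters if you reinstall RTutor often.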

Obtain an API key from OpenAI

  1. Create a personal account at OpenAI.
  2. After logging in, click on Personal from top left.
  3. Click Manage Account and then Billing, where you can add Payment methods and set Usage limits. $3-$5 per month is more than enough for most people.
  4. Click on API keys to create a new key, which can be copied.

Use the API key with RTutor

There are several ways to do this.

  • After the app is started, you can click on Settings and paste the API key.
  • You can also save this key as a text file called api_key.txt in the working directory.
  • Finally, you can create an environment variable called OPEN_API_KEY. Instructions for Windows, Mac, and Linux.
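
For the environment-variable option, a minimal sketch from within R could look like this. The key value is a placeholder, and the variable `my_key` is introduced only for illustration. How RTutor reads the variable internally is not shown in this README.

```r
# Placeholder only; substitute the key copied from your OpenAI account.
my_key <- "sk-your-key-here"

# Set the variable for the current R session (it is lost when R restarts).
Sys.setenv(OPEN_API_KEY = my_key)

# Confirm the variable is visible to R before starting the app.
key_is_set <- nzchar(Sys.getenv("OPEN_API_KEY"))
print(key_is_set)
```

To keep the variable across sessions, one common approach is to add a line of the form `OPEN_API_KEY=your-key` to the `.Renviron` file in your home directory, which R reads at startup.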

To start RTutor

library(RTutor)
run_app()

License

(CC BY-NC 3.0) Non-commercial use.

Examples

See this report generated by RTutor in a typical session.

"Use the mpg data frame. Use ggplot2 to create a boxplot of hwy vs. class. Color by class."

library(ggplot2)
ggplot(mpg, aes(x=class, y=hwy, fill=class)) +
  geom_boxplot()

"Use the mpg data frame. Conduct ANOVA of log-transformed hwy by class and drv."

#Load necessary libraries
library(tidyverse)
#Create log-transformed hwy variable
data <- mpg %>% 
  mutate(hwyLog = log(hwy))
#Perform ANOVA
modelResults <- aov(hwyLog ~ class * drv, data = data)
#View output
summary(modelResults)

"Use the mpg data frame. Create a correlation map of all the columns that contain numbers."

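# tidyverse provides the mpg data (ggplot2) and select_if (dplyr)
library(tidyverse)
library(corrplot)
# Correlation matrix of the numeric columns
cor_data <- cor(select_if(mpg, is.numeric))
corrplot(cor_data)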

"Use the mpg data frame. hwy and cty represent miles per gallon (MPG) on the highway and in the city, respectively. Only keep cars more efficient than 15 MPG, but less than 40, on the highway. Add 0.5 to city MPG for correction. Perform log transformation on city MPG. Raise highway MPG to the second power. Calculate correlation coefficient of the two transformed variables."

# tidyverse provides the mpg data, the pipe, and filter()
library(tidyverse)
# filter for cars more efficient than 15 on the highway, but less than 40
mpg_filtered <- mpg %>%
    filter(hwy > 15 & hwy < 40)
# add 0.5 to city MPG
mpg_filtered$cty <- mpg_filtered$cty + 0.5
# perform log transformation on city MPG
mpg_filtered$cty <- log(mpg_filtered$cty)
# raise highway MPG to the second power
mpg_filtered$hwy2 <- mpg_filtered$hwy^2
# calculate correlation coefficient
cor(mpg_filtered$hwy2, mpg_filtered$cty)

Alternative solution:

library(tidyverse)
mpg %>%
  filter(hwy > 15 & hwy < 40) %>%
  mutate(cty = cty + 0.5,
         cty = log(cty),
         hwy = hwy^2) %>%
  summarise(corr = cor(cty, hwy))
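
For comparison, the same steps can be written in base R. This is a sketch added for illustration, not output from RTutor. It assumes the ggplot2 package is installed, since it is used only as the source of the `mpg` data set. The names `kept`, `cty_log`, `hwy_sq`, and `r` are introduced here.

```r
# Assumes ggplot2 is installed; it is used only for the mpg data set.
mpg <- ggplot2::mpg

# Keep cars with highway MPG strictly between 15 and 40.
kept <- mpg[mpg$hwy > 15 & mpg$hwy < 40, ]

# Add 0.5 to city MPG, then log-transform it.
cty_log <- log(kept$cty + 0.5)

# Raise highway MPG to the second power.
hwy_sq <- kept$hwy^2

# Pearson correlation coefficient of the two transformed variables.
r <- cor(cty_log, hwy_sq)
print(r)
```

Because the transformed values are stored in new vectors rather than written back into the data frame, the original `cty` and `hwy` columns stay unchanged.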