Chicago/RSocrata

return: text/plain not a supported file format, I am not sure if there is syntax error or what

LoveYukee opened this issue · 9 comments

library(dplyr)
#library(ggplot2)
#library(leaflet)
library(soql)
library(RSocrata)
library(magrittr)

api_url <- "https://openpaymentsdata.cms.gov/resource/tvyk-kca8.csv"

query <- soql() %>%
    soql_add_endpoint(api_url) %>%
    soql_limit(2000) %>%
    soql_select(paste(
        "teaching_hospital_name",
        "physician_primary_type",
        "physician_specialty",
        "recipient_zip_code",
        "total_amount_of_payment_usdollars",
        "submitting_applicable_manufacturer_or_applicable_gpo_name",
        "date_of_payment",
        "recipient_state",
        sep = ","
    )) %>%
    soql_order("date_of_payment", desc=TRUE)

api_url2004 <- "https://openpaymentsdata.cms.gov/resource/gysc-m9qm.csv$$app_token = Wukqis8FG8q4nscOFapsQrm1V"
query1 <- soql() %>%
    soql_add_endpoint(api_url2004) %>%
    soql_limit(2000)
data <- read.socrata(query1,"Wukqis8FG8q4nscOFapsQrm1V")

stringQuery = "physician_last_name LIKE 'NAZARIAN'"
api_url <- "https://openpaymentsdata.cms.gov/resource/ak56-dpcz.csv$$Wukqis8FG8q4nscOFapsQrm1V"
stringQuery = "physician_last_name LIKE 'Nazarian' AND UPPER(physician_first_name) LIKE 'SAMAN'"
stringQuery1 = "upper(physician_last_name) ='BERGER' AND upper(physician_first_name) ='RONALD'
AND starts_with(UPPER(physician_middle_name),'D')"
query1 = soql() %>%
    soql_add_endpoint(api_url) %>%
    soql_select("Physician_Profile_ID") %>%
    soql_where(stringQuery1) %>%
    soql_order("date_of_payment", desc=TRUE) %>%
    as.character()
data <- read.socrata(query1,"Wukqis8FG8q4nscOFapsQrm1V")

Hi, thanks for your suggestion, but this is the first time I have used this package. Could you please explain a little more? Thank you so much.

Thank you, I will try

Not sure what all is going on with your code...

First, it appears that query is not used anywhere, so the reproducible part of your code is just this:

library(dplyr)
library(soql)
library(RSocrata)
api_url2004 <- "https://openpaymentsdata.cms.gov/resource/gysc-m9qm.csv$$app_token = Wukqis8FG8q4nscOFapsQrm1V"
query1 <- soql() %>%
    soql_add_endpoint(api_url2004) %>%
    soql_limit(2000)
data <- read.socrata(query1,"Wukqis8FG8q4nscOFapsQrm1V")

If you look at query1, I think it's malformed: the ? that should start the query string is missing, and a ? ends up in front of $limit instead. I think this:
https://openpaymentsdata.cms.gov/resource/gysc-m9qm.csv$$app_token = Wukqis8FG8q4nscOFapsQrm1V?$limit=2000
should be this:
https://openpaymentsdata.cms.gov/resource/gysc-m9qm.csv?$$app_token = Wukqis8FG8q4nscOFapsQrm1V&$limit=2000
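As a rough illustration of that URL shape, here is a minimal base R sketch. The helper name build_query_url and the placeholder YOUR_APP_TOKEN are made up for this example and are not part of RSocrata or soql. It also leaves out the spaces around =, on the assumption that they are not meant to be part of the parameter name.

```r
# Hypothetical helper: join an endpoint and named query parameters into one URL.
# The first parameter is introduced by "?", later ones are separated by "&".
build_query_url <- function(endpoint, params) {
  pairs <- paste0(names(params), "=", unlist(params))
  paste0(endpoint, "?", paste(pairs, collapse = "&"))
}

endpoint <- "https://openpaymentsdata.cms.gov/resource/gysc-m9qm.csv"

# YOUR_APP_TOKEN is a placeholder, not a real token.
url <- build_query_url(
  endpoint,
  list("$$app_token" = "YOUR_APP_TOKEN", "$limit" = 2000)
)
print(url)
```

Printing the result before passing it to read.socrata makes it easier to spot a missing ? or a misplaced &.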

However, I can't test it because I can't access the resource without authentication.

I would take out the token, and the limit and do this:

data <- read.socrata(url = "https://openpaymentsdata.cms.gov/resource/gysc-m9qm.csv",
                     email = "you@example.com",
                     password = "yourpassword",
                     app_token = NULL,
                     stringsAsFactors = FALSE)

You don't need the last two arguments (app_token and stringsAsFactors), but when I'm troubleshooting I like to put everything in the function call, named, just to be careful.

If you don't want to put your password in plain text in your script, I would suggest using something like this:

data <- read.socrata(url = "https://openpaymentsdata.cms.gov/resource/gysc-m9qm.csv",
                     email = yaml::read_yaml("credentials.yaml")$email,
                     password = yaml::read_yaml("credentials.yaml")$password,
                     app_token = NULL,
                     stringsAsFactors = FALSE)

Where the credential file has your email and password. There are probably fancier solutions, but at least you don't need to commit your credentials.
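One alternative, sketched here under assumptions, is to keep the credentials in environment variables and read them with base R's Sys.getenv(). The variable names SOCRATA_EMAIL and SOCRATA_PASSWORD and the helper get_credential are hypothetical; RSocrata does not define them.

```r
# Hypothetical helper: read a required setting from an environment variable
# and stop with a clear message if it is missing or empty.
get_credential <- function(name) {
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable '", name, "' is not set", call. = FALSE)
  }
  value
}

# Intended use (not run here, because it needs real credentials):
# data <- read.socrata(url = "https://openpaymentsdata.cms.gov/resource/gysc-m9qm.csv",
#                      email = get_credential("SOCRATA_EMAIL"),
#                      password = get_credential("SOCRATA_PASSWORD"))
```

The variables could be set in a per-user .Renviron file, so nothing sensitive has to live in the script or the repository.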

I thought you were experiencing a different issue. If you are doing an aggregation then read.socrata can have a problem. I just documented that with #173

Hi, actually I tried both ways, but both failed. The error in the title did not occur, but I got a 403 error instead:
api_url2004 <- "https://openpaymentsdata.cms.gov/resource/gysc-m9qm.csv?$$app_token = qkq8v5ZBNzxC5IjatPbYSeCm1"
query <- soql() %>%
    soql_add_endpoint(api_url2004) %>%
    soql_limit(2000)
token <- "qkq8v5ZBNzxC5IjatPbYSeCm1"
data <- read.socrata(query, token)
2019-05-27 18:15:59.204 getResponse: Error in httr GET: 403 https://openpaymentsdata.cms.gov/resource/gysc-m9qm.csv?%24%24app_token%20=%20qkq8v5ZBNzxC5IjatPbYSeCm1%3F%24limit%3D2000&%24%24app_token=qkq8v5ZBNzxC5IjatPbYSeCm1&$order=:id You must be logged in to access this resource
Error in getResponse(validUrl, email, password) : Forbidden (HTTP 403).
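One way to see what was actually sent is to decode the URL from the log above. This is only a sketch: the token is replaced by the placeholder TOKEN, and the variable names are made up for the example.

```r
# The URL from the log, with the real token replaced by the placeholder TOKEN.
logged <- paste0(
  "https://openpaymentsdata.cms.gov/resource/gysc-m9qm.csv",
  "?%24%24app_token%20=%20TOKEN%3F%24limit%3D2000",
  "&%24%24app_token=TOKEN&$order=:id"
)

# Undo the percent-encoding so the query string is readable.
decoded <- utils::URLdecode(logged)
print(decoded)

# Count how many times the token parameter appears in the decoded URL.
hits <- gregexpr("$$app_token", decoded, fixed = TRUE)[[1]]
token_count <- sum(hits > 0)
print(token_count)
```

In the decoded form the token parameter shows up twice, presumably once from the endpoint string and once from the token passed to read.socrata, and the $limit clause follows a second ? instead of an &. That suggests leaving the token out of the endpoint string and passing it only as an argument, although the 403 message itself says a login is required.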

I am not sure whether the app token I created for RSocrata is valid right now, because when I copy and paste this URL I cannot access the data either.

I am pretty sure that the query needs to be included

Sorry, the last line of my code should be data <- read.socrata(query, app_token = token), but it still returns the same error (403 Forbidden).

@LoveYukee - I think this may be the same issue, but will refer you over to my StackOverflow response since others may also find it.

Going to close the issue for now since it seems to be related to a private data set. Feel free to respond over in StackOverflow. We can reopen this issue if there is a bug with the package.

Thanks!