ropensci-archive/AntWeb

Sometimes it looks for 'acd'

zagloj opened this issue · 15 comments

For example, with the code below. The original error is in Spanish, but it only says that the object 'acd' was not found. It also happens with other genera (cataglyphis, polyergus...):

anochetus_df <- aw_data(genus="anochetus")
aw_map(anochetus_df)
Error en which(names(acd) == "decimal_latitude") :
objeto 'acd' no encontrado

At first glance I can see where acd is used, but I can't see where it is created in the function.

Here is my sessionInfo, just in case:
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C
[3] LC_TIME=es_ES.UTF-8 LC_COLLATE=C
[5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=es_ES.UTF-8
[7] LC_PAPER=es_ES.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] AntWeb_0.5.3

loaded via a namespace (and not attached):
[1] RCurl_1.95-4.1 assertthat_0.1 digest_0.6.4 httr_0.2 leafletR_0.1-1
[6] plyr_1.8 rjson_0.2.13 stringr_0.6.2 tools_3.0.2

You're absolutely right @zagloj
I've just fixed the bug. Please install again (now 0.5.4) and it should work.

To reinstall

library(devtools)
install_github("ropensci/AntWeb")
anochetus_df <- aw_data(genus="anochetus")
aw_map(anochetus_df)

Keep in mind, though, that this is a large request and it takes some time to write out the geoJSON. The output file is 18 MB and will take a long time to render in the browser. Hopefully with API improvements we can speed this up considerably.

Looking into more improvements now. I should be able to write out a leaner dataset so the map can be rendered faster. Will update in a few hours.

@zagloj
I just noticed another issue that would be hard for me to accommodate in the map as is.

 unique(species_data$scientific_name)
  [1] "anochetus (indet)"          
  [2] "anochetus afr01"            
  [3] "anochetus afr02"            
  [4] "anochetus afr03"            
  [5] "anochetus afr04"            
  [6] "anochetus afr05"            
  [7] "anochetus afr06"            
  [8] "anochetus afr07"            
  [9] "anochetus afr08"            
 [10] "anochetus afr09"            
 [11] "anochetus afrc-tz-01"       
 [12] "anochetus afrc-tz-02"       
 [13] "anochetus afrc-tz-03"       
 [14] "anochetus afrc-tz-04"       
 [15] "anochetus afrc-tz-05"       
 [16] "anochetus afrc-tz-06"       
 [17] "anochetus africanus"        
 [18] "anochetus agilis"           
 [19] "anochetus alae"             
 [20] "anochetus altisquamis"      
 [21] "anochetus anja_sp1"         
 [22] "anochetus armstrongi"       
 [23] "anochetus avius"            
 [24] "anochetus bequaerti"        
 [25] "anochetus bispinosus"       
 [26] "anochetus boltoni"          
 [27] "anochetus bp01"             
 [28] "anochetus bp02"             
 [29] "anochetus cato"             
 [30] "anochetus chirichinii"      
 [31] "anochetus cm02"             
 [32] "anochetus cryptus"          
 [33] "anochetus diegensis"        
 [34] "anochetus emarginatus"      
 [35] "anochetus emarginatus_group"
 [36] "anochetus faurei"           
 [37] "anochetus fricatus"         
 [38] "anochetus fuliginosus"      
 [39] "anochetus ga01"             
 [40] "anochetus ga09"             
 [41] "anochetus ghilianii"        
 [42] "anochetus ghilianii_group"  
 [43] "anochetus goodmani"         
 [44] "anochetus graeffei"         
 [45] "anochetus graeffei_nr"      
 [46] "anochetus grandidieri"      
 [47] "anochetus horridus"         
 [48] "anochetus incultus"         
 [49] "anochetus indet"            
 [50] "anochetus inermis"          
 [51] "anochetus isolatus"         
 [52] "anochetus jcm01"            
 [53] "anochetus jcm02"            
 [54] "anochetus jlrl-nard"        
 [55] "anochetus jtl001"           
 [56] "anochetus jtl002"           
 [57] "anochetus jtlborneo01"      
 [58] "anochetus jtlborneo02"      
 [59] "anochetus jtlborneo03"      
 [60] "anochetus katonae"          
 [61] "anochetus levaillanti"      
 [62] "anochetus longifossatus"    
 [63] "anochetus madagascarensis"  
 [64] "anochetus madaraszi"        
 [65] "anochetus maynei"           
 [66] "anochetus mayri"            
 [67] "anochetus mgm01"            
 [68] "anochetus mgm02"            
 [69] "anochetus mgm03"            
 [70] "anochetus mgm04"            
 [71] "anochetus mgm05"            
 [72] "anochetus micans"           
 [73] "anochetus minans"           
 [74] "anochetus mk02_pubescens_nr"
 [75] "anochetus modicus"          
 [76] "anochetus my01"             
 [77] "anochetus my02"             
 [78] "anochetus my03"             
 [79] "anochetus my04"             
 [80] "anochetus my05"             
 [81] "anochetus myops"            
 [82] "anochetus natalensis"       
 [83] "anochetus neglectus"        
 [84] "anochetus nietneri"         
 [85] "anochetus obscuratus"       
 [86] "anochetus orchidicola"      
 [87] "anochetus paripungens"      
 [88] "anochetus pattersoni"       
 [89] "anochetus pellucidus"       
 [90] "anochetus peracer"          
 [91] "anochetus ph01"             
 [92] "anochetus ph02"             
 [93] "anochetus ph03"             
 [94] "anochetus ph04"             
 [95] "anochetus ph_m01"           
 [96] "anochetus princeps"         
 [97] "anochetus pubescens"        
 [98] "anochetus punctaticeps"     
 [99] "anochetus rectangularis"    
[100] "anochetus renatae"          
[101] "anochetus risii"            
[102] "anochetus rufolatus"        
[103] "anochetus rufostenus"       
[104] "anochetus rugosus"          
[105] "anochetus sc01"             
[106] "anochetus sedilloti"        
[107] "anochetus simoni"           
[108] "anochetus splendidulus"     
[109] "anochetus sp_nr_traegaordhi"
[110] "anochetus striatulus"       
[111] "anochetus targionii"        
[112] "anochetus torotorofotsy01"  
[113] "anochetus traegaordhi"      
[114] "anochetus tua"              
[115] "anochetus turneri"          
[116] "anochetus validus"          
[117] "anochetus variegatus"       
[118] "anochetus veronicae"        
[119] "anochetus victoriae"        
[120] "anochetus wiesiae"          
[121] "anochetus yerburyi"         
[122] "anochetus yt01"             
[123] "anochetus za01"

So as you can see, it's impossible to generate a legend for that many species. You'll have to recode the data after the retrieval step and reduce this number before passing it to the map function. I'll add a small check that stops map generation when there are more than a dozen or so species.
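The recoding step described above could look something like the following sketch in base R. It assumes the retrieved records can be treated as a plain data frame with a scientific_name column (the column printed above). The helper name keep_top_species and the toy data are made up for illustration and are not part of AntWeb.

```r
# Hypothetical helper (not part of AntWeb): keep only the n most
# frequently recorded species in a data frame of occurrences.
keep_top_species <- function(df, n = 10) {
  counts <- sort(table(df$scientific_name), decreasing = TRUE)
  top <- names(counts)[seq_len(min(n, length(counts)))]
  df[df$scientific_name %in% top, , drop = FALSE]
}

# Toy stand-in for the retrieved records (invented values)
toy <- data.frame(
  scientific_name = c(rep("anochetus africanus", 5),
                      rep("anochetus agilis", 3),
                      "anochetus alae",
                      "anochetus afr01"),
  decimal_latitude = seq(-20, -11, length.out = 10),
  decimal_longitude = seq(40, 49, length.out = 10),
  stringsAsFactors = FALSE
)

# Keep the two best-represented species before mapping
subset_df <- keep_top_species(toy, n = 2)
unique(subset_df$scientific_name)
```

Filtering by frequency is only one possible choice. Dropping the placeholder names (afr01, indet, and so on) or selecting species by hand would work just as well.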

ok, I just pushed another patch (now version is 0.5.5).

If you run the call again, you'll see this behavior:

> library(AntWeb)
> anochetus_df <- aw_data(genus="anochetus")
> aw_map(anochetus_df)
Error in aw_map(anochetus_df) : 
  Map cannot accommodate more than 30 species. Please map a smaller subset
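
The check behind that message is not shown in this thread. A minimal sketch of how such a guard could look, assuming a data frame with a scientific_name column. The function name check_species_limit is hypothetical; only the limit of 30 and the message wording are taken from the error above.

```r
# Hypothetical guard (not the package's actual code): stop early when
# the data contain more species than a map legend can reasonably show.
check_species_limit <- function(df, max_species = 30) {
  n_species <- length(unique(df$scientific_name))
  if (n_species > max_species) {
    stop("Map cannot accommodate more than ", max_species,
         " species. Please map a smaller subset", call. = FALSE)
  }
  invisible(n_species)
}

# 123 made-up species names, mirroring the size of the list above
too_many <- data.frame(
  scientific_name = paste("anochetus", sprintf("sp%03d", 1:123)),
  stringsAsFactors = FALSE
)

# Capture the error message instead of aborting the script
msg <- tryCatch(check_species_limit(too_many),
                error = function(e) conditionMessage(e))
msg
```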

Thanks! It works great now: the Camponotus cruentatus map is done. Happy afternoon here. I will mention this tool on some Formicidae sites. The errors are also easier to understand now (species limits, etc.) 👍

Hi @zagloj,

I wanted to give you a heads up that all the above issues are now resolved in the latest development version of the AntWeb package (0.5.12) along with more search features. If you know how to install directly from GitHub, please do so (see instructions on the README).

Otherwise, download this source package then go into R, Packages & Data, Package Installer (then choose local source from the dropdown) and point it to this file wherever you have downloaded it. Should work then.

Even big genera like Crematogaster can easily be queried (and spread out over a few requests). There is now a limit and offset argument. You can also now query by elevation range, habitat, type, date range etc. Any comments or feedback is most welcome before I push these changes over to the new release on CRAN.

Thanks for your support.

Hi, and a big thanks. I installed it through GitHub (devtools) and it looks great. I just looked at a Crematogaster map (limited to 1000 points) and it seems to work fine. I have followed the development closely these last few days, but I have been too busy to comment or contribute.

Anyway, I left a mention of this package on lamarabunta.org (a site about Formicidae).

Thanks again for your time. I hope to lend a hand when I have some more time.

(Sent as an email reply to Karthik Ram's comment of Tue, 11 Mar 2014 11:33:29 -0700.)

Awesome @zagloj,
No worries about being busy -- totally understand. I would love input from you at a later time and would be happy to have you as a collaborator.

A quick comment re Crematogaster (or other large genera). A query retrieves only 1000 results the first time, but it's easy to fetch the rest and combine them. I just added this feature in 0.5.13.
In the example below, you get 1000 results the first time and 1000 more the second time, then combine the two into one result. You can easily map this large request.

x1 <- aw_data(genus = "crematogaster", georeferenced = TRUE)
x2 <- aw_data(genus = "crematogaster", georeferenced = TRUE, offset = 1000)
x12 <- aw_cbind(list(x1, x2))
aw_map(x12)

Hi, it would be great to do that automatically, since manually creating 12 variables could be tedious. I made an example with some variables hardcoded:

library(AntWeb)
threshold <- 0:2
genero <- "crematogaster"
datos <- list()
for (i in threshold) {
  if (is.null(datos) == FALSE) {
    starting <- i * 1000
    datos <- c(datos, aw_data(genus = genero,
                              georeferenced = TRUE, offset = starting))
  }
}

This gives a list, which could then be combined with a loop like this:

for (i in length(datos)){datalist <- do.call("cbind", datos)}

But I can't get it to work properly. Is there an aw_cbind function?

And congrats again for your work!

I know the code is ugly and maybe not very useful, only some ideas.
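
A few things probably keep the loop above from working. Calling c() on a list and a list-like result splices the elements together instead of appending one page. The `is.null(datos) == FALSE` test is always true for a list. `for (i in length(datos))` runs only once. Finally, pages of records need to be stacked by row rather than by column. A sketch of the accumulate-then-combine pattern in base R follows. Here fetch_page is a made-up stand-in that generates placeholder rows locally so the example runs without network access. With AntWeb, the call to aw_data(..., offset = ...) would go in its place, and a package-specific combiner may be needed for its return objects.

```r
# Hypothetical stand-in for aw_data(): returns one "page" of up to
# `limit` placeholder records, starting after `offset`.
fetch_page <- function(offset, limit = 1000, total = 2500) {
  if (offset >= total) {
    return(data.frame(id = integer(0), genus = character(0),
                      stringsAsFactors = FALSE))
  }
  ids <- seq(offset + 1, min(offset + limit, total))
  data.frame(id = ids, genus = "crematogaster", stringsAsFactors = FALSE)
}

offsets <- (0:2) * 1000                 # 0, 1000, 2000
pages <- lapply(offsets, fetch_page)    # one list element per request
all_rows <- do.call(rbind, pages)       # stack the pages by row
nrow(all_rows)                          # 2500 with the defaults above
```

Using lapply keeps each page as its own list element, so no counter variables or manual appending are needed.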
(Sent as an email reply to Karthik Ram's comment of Tue, 11 Mar 2014 12:14:53 -0700.)

I just submitted the new version to CRAN, and it should be possible to get all the data in one go with just a little more code. It's not automated for various reasons, but a little bit of general R code can make this simple (and even parallelize everything, so that each core sends a separate request and everything is assembled at the end). Here is a quick attempt at solving the issue above:

crem_data <- aw_data(genus = "Crematogaster")
# Query contains 15341 results. First 1000 retrieved. 
# Use the offset argument to retrieve more 

# 15341 results available for query. Downloading 1000

# By default searches only download the first 1k records
# Let's see how many there are on the server

crem_data$count
# [1] 15341

# Great, now let's split this up into bins of 1000

offsets <- seq(0, crem_data$count, by = 1000)
library(plyr)
results <- llply(offsets, function(x) {
    aw_data(genus = "Crematogaster", offset = x, quiet = TRUE)
}, .progress = "text")

final_results <- aw_cbind(results)
final_results
[Total results on the server]: 15341 
[Args]: 
genus = Crematogaster 
limit = 1000 
offset = 1000 
[Dataset]: [15341 x 16] 
[Data preview] :
                                                          url catalogNumber     family  subfamily
1 http://antweb.org/api/v2/?occurrenceId=antweb:casent0057371 casent0057371 formicidae myrmicinae
2 http://antweb.org/api/v2/?occurrenceId=antweb:casent0057378 casent0057378 formicidae myrmicinae
          genus specificEpithet      scientific_name typeStatus stateProvince country dateIdentified
1 Crematogaster           maina  crematogaster maina                  Toliara             2013-01-02
2 Crematogaster          ramamy crematogaster ramamy                  Toliara             2012-12-20
              habitat minimumElevationInMeters geojson.type decimal_latitude decimal_longitude
1      gallery forest                       20        point        -23.03952          43.61015
2 gallery forest, TS1                      222        point        -22.80707          43.76375

Notice that we now have all 15,341 records downloaded.
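
The parallel variant mentioned above is not shown in this thread. A rough sketch with the base parallel package, assuming a Unix-like system where mclapply can fork (on Windows it only runs with mc.cores = 1). Here fetch_page is a hypothetical local stand-in for aw_data(genus = ..., offset = ...) that generates placeholder rows, and the total of 15341 is the count reported above.

```r
library(parallel)

# Hypothetical stand-in for one paged request; returns placeholder rows.
fetch_page <- function(offset, limit = 1000, total = 15341) {
  ids <- seq(offset + 1, min(offset + limit, total))
  data.frame(id = ids, offset = offset)
}

total <- 15341
limit <- 1000
# Stop at total - 1 so no empty trailing page is requested
offsets <- seq(0, total - 1, by = limit)   # 0, 1000, ..., 15000

# Forking is unavailable on Windows, so fall back to a single core there
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
pages <- mclapply(offsets, fetch_page, limit = limit, total = total,
                  mc.cores = n_cores)

# Assemble everything at the end
combined <- do.call(rbind, pages)
nrow(combined)
```

With a real API it is worth keeping the number of simultaneous requests small, since each worker hits the server at the same time.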

Also, yes, there is an aw_cbind function.

You simply pass a list of objects of class antweb to this function.

e.g.

final <- aw_cbind(list(x1, x2, x3))

But I will work on solutions to automate this. A small feature I'll add to the next update.

Oops, I just updated and saw the new cbind function. I was sure I had updated a couple of days ago or even less, hehe.

I also hadn't noticed the count variable on the data frame. I was looking for that but could not find it. You could also check out dplyr instead of plyr; it seems to be faster sometimes.

I'll keep an eye on GitHub over the next few days.

(Sent as an email reply to Karthik Ram's comment of Wed, 12 Mar 2014 14:59:12 -0700.)

@zagloj Based on your request, I am adding a new function to download all data without any additional code.

I love dplyr too, but it has no equivalent of llply, which is really just lapply.

Stay tuned!

The new version of AntWeb does all that you requested:

library(AntWeb)
crem <- aw_data_all(genus = "crematogaster", georeferenced = TRUE)

That's it. It will download all 12k+ records in one go without you having to write additional code. This will be available through CRAN by tomorrow, but you can always install from here.