Helper to pick sufficiently abundant species

Question

Helper to pick sufficiently abundant species

maurolepore opened this issue 6 years ago · 3 comments

Sabrina Russo said:

... I guess there’s a wrapper function in which the user can pick what minimum tree abundance they wish to use and that then subsets the data so that only the species meeting the minimum abundance criterion are run for the TT? If not, then there should be one.

Answer 1 · 2019-04-26T16:04:26.000Z

@srusso2,

RE: "I guess there’s a wrapper function in which the user can pick what minimum tree abundance they wish..."

filter(add_count(...), n > ...) does the trick, and I show it in the examples of tt_test(). I'm inclined not to wrap it, because this is a useful, common pattern and I would like to encourage people to learn it so they can use it in other contexts.

suppressPackageStartupMessages({
  library(dplyr)
  library(fgeo.analyze)
})

census <- fgeo.data::luquillo_tree6_1ha %>% 
  filter(status == "A", dbh >= 10)
census
#> # A tibble: 2,319 x 19
#>    treeID stemID tag   StemTag sp    quadrat    gx    gy MeasureID CensusID
#>     <int>  <int> <chr> <chr>   <chr> <chr>   <dbl> <dbl>     <int>    <int>
#>  1     50 165123 1000~ 178258  PSYB~ 921      165.  418.    618386        6
#>  2     67     92 1000~ 100043  CORB~ 921      163.  420.    617072        6
#>  3     82    112 1000~ 100061  CASS~ 921      161.  416.    617074        6
#>  4     85    115 1000~ 100064  MANB~ 921      161.  418.    617075        6
#>  5    102    141 1000~ 100088  SCHM~ 921      163.  411.    617058        6
#>  6    111    150 1000~ 100098  CECS~ 921      162.  410.    617059        6
#>  7    115    154 1001~ 100100  CECS~ 921      163.  410.    617060        6
#>  8    119    158 1001~ 100104  MYRS~ 1021     183.  410.    578696        6
#>  9    120    159 1001~ 100105  OCOL~ 1021     182.  410.    578697        6
#> 10    130    169 1001~ 100114  OCOL~ 1021     181.  409.    578682        6
#> # ... with 2,309 more rows, and 9 more variables: dbh <dbl>, pom <chr>,
#> #   hom <dbl>, ExactDate <date>, DFstatus <chr>, codes <chr>,
#> #   nostems <dbl>, status <chr>, date <dbl>

census %>% 
  count(sp)
#> # A tibble: 70 x 2
#>    sp         n
#>    <chr>  <int>
#>  1 ALCFLO    11
#>  2 ALCLAT    15
#>  3 ANDINE     1
#>  4 ANTOBT     1
#>  5 ARDGLA     1
#>  6 BUCTET    11
#>  7 BYRSPI    25
#>  8 CALCAL     2
#>  9 CASARB   489
#> 10 CASSYL    58
#> # ... with 60 more rows

# Pick species with over 50 individuals
sufficiently_abundant <- census %>% 
  add_count(sp) %>% 
  filter(n > 50)

sufficiently_abundant %>% 
  count(sp)
#> # A tibble: 11 x 2
#>    sp         n
#>    <chr>  <int>
#>  1 CASARB   489
#>  2 CASSYL    58
#>  3 CECSCH    76
#>  4 INGLAU    89
#>  5 MANBID   113
#>  6 OCOLEU    85
#>  7 PREMON   507
#>  8 PSYBER   125
#>  9 PSYBRA    66
#> 10 SCHMOR   151
#> 11 SLOBER    61

Created on 2019-04-26 by the reprex package (v0.2.1)
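
For readers who do want the kind of wrapper Sabrina describes, here is a minimal sketch of how the same add_count() + filter() pattern could be packaged. The name pick_abundant(), its min_n argument, and the temporary n_sp column are hypothetical; none of them are part of fgeo.analyze. The toy tibble only stands in for a census table with one row per tree and a species column named sp.

```r
library(dplyr)

# Hypothetical helper, NOT part of fgeo.analyze: keep only the rows of
# species that have more than `min_n` rows in `census`.
pick_abundant <- function(census, min_n = 50) {
  census %>%
    add_count(sp, name = "n_sp") %>%  # n_sp: number of rows per species
    filter(n_sp > min_n) %>%          # strict ">" as in the reprex above
    select(-n_sp)                     # drop the temporary count column
}

# Toy data standing in for a census table (one row per tree)
toy <- tibble(sp = c(rep("CASARB", 5), rep("CALCAL", 2), "ANDINE"))

abundant <- pick_abundant(toy, min_n = 2)
count(abundant, sp)
```

The sketch uses a strict inequality to mirror the "over 50 individuals" example above; if the threshold is meant as an inclusive minimum, n_sp >= min_n would be the comparison to use instead.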

Answer 2 · 2019-04-26T16:25:55.000Z

Hi Mauro, I am totally triaging my time right now, so I wanted to warn you that it may take me a few weeks to reply on this. You might ask Daniel, too – since this function is not at the top of my mind all the time, it takes me a while to remember what’s going on with it!

Answer 3 · 2019-04-26T18:08:27.000Z

Don't worry. This needs no action. I just wanted to keep you posted. Mauro
