Erin-Rooney/Y1_fairbanks

Data repetitions

Closed this issue · 2 comments

@kaizadp

Those data repetitions are still happening. This is the first place I see them, but they could be originating earlier, in data processing:

horizonation_relabund =
  fticr_data_water %>%
  left_join(select(fticr_meta_water, formula, Class), by = "formula") %>%
  ## create a column for group counts
  mutate(Class = factor(Class, levels = c("aliphatic", "unsaturated/lignin", "aromatic", "condensed aromatic"))) %>%
  # group_by(slopepos, cover_type, plot, ID, Class) %>%
  group_by(ID, Class) %>%
  dplyr::summarize(counts = n()) %>%
  ## create a column for total counts
  group_by(ID) %>%
  dplyr::mutate(totalcounts = sum(counts)) %>%
  ungroup() %>%
  mutate(relabund = (counts/totalcounts)*100,
         relabund = round(relabund, 2))

Sample IDs
3 & 4,
52 & 53,
70 & 71 & 72,
77 & 78
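
One way to list repeated samples like these without checking by eye is sketched below. It is only an illustration: `toy_relabund` and its values are invented stand-ins for `horizonation_relabund`, assuming the same `ID`, `Class`, and `relabund` columns.

```r
library(dplyr)

# Hypothetical toy stand-in for horizonation_relabund (values are made up)
toy_relabund <- data.frame(
  ID = c(3, 3, 4, 4, 5, 5),
  Class = rep(c("aliphatic", "aromatic"), times = 3),
  relabund = c(60, 40, 60, 40, 55, 45)
)

# Collapse each sample's Class/relabund profile into one string,
# then keep only the profiles shared by more than one sample ID
repeated_ids <- toy_relabund %>%
  arrange(ID, Class) %>%
  group_by(ID) %>%
  summarize(profile = paste(Class, relabund, collapse = "; "), .groups = "drop") %>%
  group_by(profile) %>%
  filter(n() > 1) %>%
  summarize(ids = paste(ID, collapse = " & "), .groups = "drop")

print(repeated_ids$ids)
```

Running the same check on the table before and after each processing step should show at which step the samples become identical.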

@kaizadp

Is the issue here? We never talked about this part of the code for the Y1 samples. Some plots will have one rep and others will have 3. It doesn't seem like this should be the issue, but since I did it on my own it's worth taking a look at.

rename(max_reps = reps) %>%
group_by(slopepos, cover_type, plot, formula) %>%
dplyr::mutate(formulareps = n()) %>%
# set up replication filter for 2/3 of max_rep
ungroup() %>%
mutate(include = formulareps >= (2/3)*max_reps) %>%
## mutate(include = formulareps > 1,
## occurrence = case_when(formulareps == max_reps ~ "3/3",
## formulareps < max_reps & formulareps >= (2/3)*max_reps ~ "2/3+",
## formulareps >= (1/3)*max_reps ~ "1/3+",
## formulareps < (1/3)*max_reps ~ "exclude")) %>%
filter(include)
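
A possible explanation, offered as a sketch rather than a confirmed diagnosis: when a plot has two reps, the cutoff `(2/3)*max_reps` is about 1.33, so only formulas present in both reps pass the filter. Each rep then keeps exactly the same formula list, and any count-based relative abundance would come out identical for those reps. The toy data below is hypothetical (made-up formulas, one plot, IDs 3 and 4) and groups by `plot` and `formula` only, to keep it short.

```r
library(dplyr)

# Hypothetical toy data: one plot with two reps (IDs 3 and 4)
toy <- data.frame(
  plot = "A",
  ID = c(3, 3, 3, 4, 4),
  formula = c("C10H12O5", "C12H14O6", "C9H8O4", "C10H12O5", "C12H14O6"),
  max_reps = 2
)

filtered <- toy %>%
  group_by(plot, formula) %>%
  mutate(formulareps = n()) %>%
  ungroup() %>%
  # with max_reps = 2 the cutoff is about 1.33, so a formula must be in both reps
  mutate(include = formulareps >= (2/3) * max_reps) %>%
  filter(include)

# Formula list kept for each rep after filtering
split(filtered$formula, filtered$ID)
```

Here the formula seen only in ID 3 is dropped, and both reps end up with the same two formulas. A plot with three reps is not forced to match this way (a formula found in two of three reps stays in only those two), so this would not by itself account for a repeated triple.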

Oooookay... I "fixed" it, though I may have created more problems. The data repetitions were happening because of the formula reps line. I changed it to the following, and there are no more repetitions. I'm going to start redoing figures... hopefully I haven't messed it up even more.

group_by(slopepos, cover_type, plot, formula) %>%
dplyr::mutate(formulareps = n()) %>%
# set up replication filter for 2/3 of max_rep
ungroup() %>%
mutate(include = formulareps >= 1) %>%
# mutate(include = formulareps >= (2/3)*max_reps) %>%
## mutate(include = formulareps > 1,
## occurrence = case_when(formulareps == max_reps ~ "3/3",
## formulareps < max_reps & formulareps >= (2/3)*max_reps ~ "2/3+",
## formulareps >= (1/3)*max_reps ~ "1/3+",
## formulareps < (1/3)*max_reps ~ "exclude")) %>%
filter(include)
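
One thing worth checking with this change: `formulareps` comes from `n()` inside a group, so it is at least 1 for every row that exists. That means `formulareps >= 1` is always `TRUE` and `filter(include)` keeps every row, so the replication filter is effectively switched off rather than adjusted. The sketch below shows this on hypothetical toy data (made-up formulas and IDs, grouped by `plot` and `formula` only).

```r
library(dplyr)

# Hypothetical toy data: one plot, two reps, one formula seen in only one rep
toy <- data.frame(
  plot = "A",
  ID = c(3, 3, 3, 4, 4),
  formula = c("C10H12O5", "C12H14O6", "C9H8O4", "C10H12O5", "C12H14O6")
)

checked <- toy %>%
  group_by(plot, formula) %>%
  mutate(formulareps = n()) %>%
  ungroup() %>%
  mutate(include = formulareps >= 1)

# Every row passes, so filtering on include removes nothing
all(checked$include)
nrow(filter(checked, include)) == nrow(toy)
```

If the goal is still to drop formulas seen in too few reps, the threshold would need to depend on `max_reps` in some form; which rule fits plots with one rep versus three is a judgment call for the analysis.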