make a function to get utterances surrounding a token ("context")
This feature was originally requested by MH, and Dan proposed a manual solution. I'm just adding it here in case we want to build this functionality into the API.
His note: "Suppose I have a list of tokens or utterances, and I want to read the preceding context (like the last N utterances). Is there a way to do this in childesr?"
Dan's solution:
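library(childesr)
library(dplyr)
library(stringr)
library(purrr)
Shem_utts <- get_utterances(child = "Shem", age = 30)
# order values of the utterances whose gloss contains "dog"
dog_utts_inds <- filter(Shem_utts, str_detect(gloss, "dog")) %>%
  pull(order)
# for each match, keep utterances with order in (index - 5, index), i.e. the 4 right before it
pre_dog_utts <- map(dog_utts_inds,
                    function(index) filter(Shem_utts, order > (index - 5) & order < index)) %>%
  bind_rows()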
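For reference, here is a rough sketch of how the manual solution could be wrapped into a reusable function. `get_context`, `n_before`, `n_after`, and `target_order` are hypothetical names, not part of childesr, and the sketch assumes a data frame with `order` and `gloss` columns (as in the snippet above) where `order` is unique, e.g. a single transcript. Unlike the snippet above, it keeps the matching utterance itself in each window. It uses base R and a toy data frame so it runs without a database connection.

```r
# Hypothetical helper (not part of childesr): return the utterances
# surrounding each utterance whose gloss matches `pattern`.
# Assumes `utts` has columns `order` (unique) and `gloss`.
get_context <- function(utts, pattern, n_before = 5, n_after = 0) {
  hits <- utts$order[grepl(pattern, utts$gloss)]
  if (length(hits) == 0) {
    empty <- utts[0, ]
    empty$target_order <- integer(0)
    return(empty)
  }
  contexts <- lapply(hits, function(index) {
    # window runs from n_before utterances before the match
    # to n_after utterances after it, match included
    window <- utts[utts$order >= index - n_before &
                     utts$order <= index + n_after, ]
    # remember which match this window belongs to
    window$target_order <- rep(index, nrow(window))
    window
  })
  out <- do.call(rbind, contexts)
  rownames(out) <- NULL
  out
}

# Toy data standing in for the output of get_utterances()
toy <- data.frame(
  order = 1:8,
  gloss = c("look", "what is that", "a dog", "yes",
            "a big dog", "woof", "bye", "ok"),
  stringsAsFactors = FALSE
)

ctx <- get_context(toy, "dog", n_before = 2)
```

Each window is tagged with the `order` of the match it belongs to, so overlapping windows can be told apart after they are stacked.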
An idea for a bonus feature: an option to merge contexts when the token appears again within a context. I suspect this will be of general interest because parents often repeat back to their kids what they said. For instance, searching for "big" returns a lot of back-and-forth usages ("is that a big one?" "it's a big one."). The naive manual solution will extract a bunch of overlapping, repeated contexts, which may not be what you want.
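A minimal sketch of what the merge option could look like, under the same assumptions as the manual solution (a data frame with a unique `order` column and a `gloss` column). `get_merged_context`, `n_before`, and `context_id` are hypothetical names, not part of childesr. Two windows are merged whenever an earlier match falls inside the window of a later one, so each utterance is returned at most once.

```r
# Hypothetical helper (not part of childesr): preceding context for each
# match, with overlapping windows merged into one context.
# Assumes `utts` has columns `order` (unique) and `gloss`.
get_merged_context <- function(utts, pattern, n_before = 5) {
  hits <- sort(utts$order[grepl(pattern, utts$gloss)])
  if (length(hits) == 0) {
    empty <- utts[0, ]
    empty$context_id <- integer(0)
    return(empty)
  }
  starts <- hits - n_before
  # a new context begins only when this window starts after the previous match
  new_group <- c(TRUE, starts[-1] > hits[-length(hits)])
  group <- cumsum(new_group)
  lo <- tapply(starts, group, min)  # first order value of each merged context
  hi <- tapply(hits, group, max)    # last match of each merged context
  pieces <- lapply(seq_along(lo), function(g) {
    piece <- utts[utts$order >= lo[[g]] & utts$order <= hi[[g]], ]
    piece$context_id <- rep(g, nrow(piece))
    piece
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# Toy data with the back-and-forth "big" pattern described above
toy_big <- data.frame(
  order = 1:10,
  gloss = c("look", "is that a big one", "it's a big one", "yes", "ok",
            "more", "juice", "all gone", "a big truck", "vroom"),
  stringsAsFactors = FALSE
)

merged <- get_merged_context(toy_big, "big", n_before = 3)
```

In the toy data the matches at `order` 2 and 3 collapse into one context, while the match at `order` 9 stays separate.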
consensus:
- this is useful, and Dan's solution above is good.
- could use the IDs on the backend to be efficient
@amsan7 could you please add an index on the utterance_order column of the utterance table so that this is efficient?
done!