Rate versus count for trending rate
sjacks26 opened this issue · 0 comments
Right now, FlockWatch identifies trending terms by comparing raw frequency counts from two time windows (t1 and t2). If t2 contains many more messages than t1, FlockWatch will flag a lot of terms as trending, simply because more messages mean more opportunities for a term to appear.
Maybe FlockWatch should use frequency rates (normalized by the number of messages in a time window) rather than raw frequency counts?
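
To make the difference concrete, here is a rough Python sketch of both approaches. It is not FlockWatch's actual code: the function names, the whitespace tokenization, the `min_ratio` threshold, and the handling of terms unseen in t1 are all assumptions made for illustration.

```python
from collections import Counter


def term_counts(messages):
    """Count term occurrences across all messages in a window (naive whitespace tokens)."""
    return Counter(term for msg in messages for term in msg.lower().split())


def trending_by_count(t1_messages, t2_messages, min_ratio=2.0):
    """Hypothetical count-based check: flag terms whose raw count grew by min_ratio."""
    c1, c2 = term_counts(t1_messages), term_counts(t2_messages)
    # Assumption: a term unseen in t1 is treated as if it appeared once.
    return {term for term, n in c2.items() if n / c1.get(term, 1) >= min_ratio}


def trending_by_rate(t1_messages, t2_messages, min_ratio=2.0):
    """Hypothetical rate-based check: compare occurrences per message instead."""
    if not t1_messages or not t2_messages:
        return set()
    c1, c2 = term_counts(t1_messages), term_counts(t2_messages)
    n1, n2 = len(t1_messages), len(t2_messages)
    # Assumption: a term unseen in t1 gets the rate of a single occurrence in t1.
    floor = 1 / n1
    return {
        term
        for term, n in c2.items()
        if (n / n2) / (c1[term] / n1 if term in c1 else floor) >= min_ratio
    }


# t2 is just t1 repeated five times: more messages, same mix of terms.
t1 = ["storm warning", "nice day", "good morning", "lunch time"]
t2_same_mix = t1 * 5
# Here "storm" really does take up a larger share of the window.
t2_storm = ["storm warning", "storm surge", "storm damage", "nice day"]
```

With these toy windows, `trending_by_count(t1, t2_same_mix)` flags every term even though nothing about the mix changed, while `trending_by_rate(t1, t2_same_mix)` flags none. `trending_by_rate(t1, t2_storm)` flags only "storm", since its share of messages tripled.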