ericksonlopes/Words-In-Songs

[SUGGESTION] Use HyperLogLog to analyze high volumes of data

Opened this issue · 2 comments

Use HyperLogLog to analyze a high volume of words and identify unique passages.

Reference: https://medium.com/botify-labs/hyperloglog-or-how-we-estimate-large-numbers-of-unique-urls-cd4d1769261f

@ericksonlopes What is your opinion?

Reference: https://github.com/svpcom/hyperloglog

Example:

    import hyperloglog

    hll = hyperloglog.HyperLogLog(0.01)  # accept 1% counting error
    hll.add("hello")
    print(len(hll))  # 1
    hll.add("hello")
    print(len(hll))  # 1 as items aren't added more than once
    hll.add("hello again")
    print(len(hll))  # 2
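
To give an idea of what such a library does under the hood, here is a minimal pure-Python sketch of the HyperLogLog idea. It is not the code of svpcom/hyperloglog; the names `TinyHLL` and `estimate_unique_words`, the use of SHA-1 as the hash, and the default of 12 index bits are all assumptions made for illustration. Each item is hashed, the top bits select a register, and the register keeps the largest "rank" (position of the first 1-bit) seen in the remaining bits.

```python
import hashlib
import math


class TinyHLL:
    """Hypothetical minimal HyperLogLog, for illustration only."""

    def __init__(self, p=12):
        self.p = p                       # number of index bits (assumed default)
        self.m = 1 << p                  # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash from the first 8 bytes of SHA-1 (an arbitrary choice).
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in `rest`.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)           # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return round(m * math.log(m / zeros))
        return round(raw)


def estimate_unique_words(lyrics):
    """Estimate the number of distinct whitespace-separated words."""
    hll = TinyHLL()
    for word in lyrics.lower().split():
        hll.add(word)
    return hll.count()


print(estimate_unique_words("hello hello hello again"))
```

Note that a sketch like this only estimates how many distinct words there are; it cannot list which words are unique. Memory stays fixed at the number of registers no matter how many lyrics are fed in, which is the appeal for a high volume of data.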