LIAAD/yake

Porting it to another language: Swift for iOS

Opened this issue · 5 comments

I am looking into whether I can port YAKE to iOS/macOS. I am a novice in Python and data science 😿. I am assuming that the main logic is written in pke->Yake.py, but there is one more file, Yake->Yake.py. What is the difference between the two? I cannot find the latter referencing the former.
Can anyone point me to some more resources that I can read?

https://asset-pdf.scinapse.io/prod/2790109590/2790109590.pdf
https://medium.com/gumgum-tech/exploring-different-keyword-extractors-statistical-approaches-38580770e282

Nice that you are trying to write YAKE in Swift.
I would recommend taking a look at the short paper to get a better understanding of what it does and why.
For a more detailed explanation, you can also check the journal paper.

If you need more examples besides this repository, you can check an alternative Python implementation by Florian or the Scala implementation by John Snow Labs for the Spark NLP framework.

Cheers

Thanks for the links! I want to ask one thing regarding the preprocessing: can we discard the chunks that are unparsable or digits? I might be wrong here, but I observe that we are not using them later. I am using the NaturalLanguage package from Apple's library, and it is a little different from segtok's web_tokenizer.
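For anyone comparing tokenizers, here is a minimal Python sketch of how tokens might be tagged as digit ("d") or unparsable ("u") before candidates are built. The names `tag_token` and `is_valid_candidate` are hypothetical, and the rules are an approximation written from memory of the Python code, so please check them against the repository. The point it illustrates: if such tokens stay in the stream and only invalidate the candidates that contain them, then dropping them before building n-grams makes their neighbours adjacent and can produce candidates the original would never see.

```python
import string

PUNCTUATION = set(string.punctuation)


def tag_token(token, is_sentence_start):
    """Assign a coarse tag to a token (hypothetical helper, approximate rules)."""
    # "d": the token parses as a number once thousands separators are removed.
    try:
        float(token.replace(",", ""))
        return "d"
    except ValueError:
        pass

    digits = sum(c.isdigit() for c in token)
    letters = sum(c.isalpha() for c in token)
    puncts = sum(c in PUNCTUATION for c in token)

    # "u": mixed letters and digits, no letters or digits at all,
    # or more than one punctuation character.
    if (digits > 0 and letters > 0) or (digits == 0 and letters == 0) or puncts > 1:
        return "u"
    # "a": all-uppercase alphabetic token, treated as an acronym.
    if letters == len(token) and token.isupper():
        return "a"
    # "n": capitalised token that does not start a sentence.
    if len(token) > 1 and token[0].isupper() and not is_sentence_start:
        return "n"
    # "p": any other plain token.
    return "p"


def is_valid_candidate(tags):
    """Assumption: a candidate is rejected if any of its tokens is "u" or "d"."""
    return all(tag not in ("u", "d") for tag in tags)


tokens = ["Kaggle", "raised", "12,500", "from", "v2", "investors"]
tags = [tag_token(tok, i == 0) for i, tok in enumerate(tokens)]
# "raised 12,500" is rejected because it contains a "d" token. Had "12,500"
# been removed up front, "raised from" would become a new, spurious bigram.
```

If the NaturalLanguage tokenizer already drops punctuation-only chunks, one option is to keep a boundary marker in their place so that n-grams cannot span the gap.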

@arianpasquali I am pretty much done with my raw port, but I guess I am missing something, as my results do not match (at least after the top 2-3), especially for candidates with multiple terms.

  • How do we calculate KF, especially for candidate keywords that involve multiple terms, like "Anthony Goldbloom declined"? I mean, this would be one if the phrase occurs only once. Or do we also consider the frequency of the constituent terms? If yes, then how?
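For anyone following along, here is a minimal Python sketch of how I read the candidate score in the YAKE paper: S(kw) = prod(S(t)) / (KF(kw) * (1 + sum(S(t)))), where KF(kw) counts occurrences of the whole n-gram and the constituent terms contribute only through their own term scores S(t). The function names and the term scores below are made up for illustration, and the sketch ignores any special handling of stopwords inside a candidate, so treat it as an assumption to verify against the reference code rather than as the implementation.

```python
from collections import Counter
from math import prod


def keyword_frequencies(sentences, max_n=3):
    """Count every lowercased n-gram (1..max_n) without crossing sentences."""
    kf = Counter()
    for tokens in sentences:
        lowered = [t.lower() for t in tokens]
        for n in range(1, max_n + 1):
            for i in range(len(lowered) - n + 1):
                kf[tuple(lowered[i:i + n])] += 1
    return kf


def candidate_score(candidate, term_scores, kf):
    """Score a candidate n-gram; lower scores mean better keywords."""
    scores = [term_scores[t] for t in candidate]
    return prod(scores) / (kf[candidate] * (1 + sum(scores)))


sentences = [
    ["Anthony", "Goldbloom", "declined", "to", "comment"],
    ["Anthony", "Goldbloom", "founded", "Kaggle"],
]
kf = keyword_frequencies(sentences)

# Made-up single-term scores, for illustration only.
term_scores = {"anthony": 0.2, "goldbloom": 0.2, "declined": 0.5}

trigram = ("anthony", "goldbloom", "declined")  # occurs once, so KF = 1
bigram = ("anthony", "goldbloom")               # occurs twice, so KF = 2
trigram_score = candidate_score(trigram, term_scores, kf)
bigram_score = candidate_score(bigram, term_scores, kf)
```

Under this reading, "Anthony Goldbloom declined" has KF = 1 when the full phrase appears once, even though "Anthony Goldbloom" appears twice. If a port's scores diverge only for multi-term candidates, the n-gram counting and the sentence boundaries are reasonable places to look first.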

Following along here. I'm also hoping to use YAKE in Swift. @mrigankgupta I'm happy to help dig into what you've got going.

@mrigankgupta Please let me know if you'd like help or a tester! I'm likely going to have to port it over to Swift too and would love it if we could share code.