/roadmap-for-NLP

A Natural Language Processing roadmap for beginners

MIT License

📚 Summary

  1. What is NLP?
  2. Why NLP? What are the benefits?
  3. Main uses of NLP
  4. Recommendation of papers on the use of NLP
  5. Best programming Language for NLP
  6. Roadmap
  7. Open Datasets for practical Hands-on projects
  8. Books you should have
  9. Important researchers in this field that you need to follow

💻 1. What is NLP?

According to IBM, Natural language processing (NLP) refers to the branch of computer science—and more specifically, the branch of artificial intelligence or AI—concerned with giving computers the ability to understand text and spoken words in much the same way human beings can.

NLP combines computational linguistics—rule-based modeling of human language—with statistical, machine learning, and deep learning models. Together, these technologies enable computers to process human language in the form of text or voice data and to ‘understand’ its full meaning, complete with the speaker or writer’s intent and sentiment.
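To make these two ingredients a bit more concrete, here is a minimal sketch that uses only Python's standard library. It pairs a rule-based step (a regular expression that decides what counts as a word) with a very simple statistical step (counting word frequencies). The names `tokenize` and `word_frequencies` and the sample sentence are made up for illustration; a real project would normally rely on a dedicated NLP library.

```python
import re
from collections import Counter

def tokenize(text):
    """Rule-based step: lowercase the text and extract runs of letters/apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(text):
    """Statistical step: count how often each token appears."""
    return Counter(tokenize(text))

sample = "The cat sat on the mat. The mat was warm."
freqs = word_frequencies(sample)
print(freqs.most_common(2))  # [('the', 3), ('mat', 2)]
```

Even this tiny pipeline shows the usual pattern: rules turn raw text into units, and statistics are computed over those units.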

👩‍💻 2. Why NLP? What are the benefits?

You have probably wondered what the benefits of using NLP in your systems are. Let's take a quick look 🤩

  1. Perform large-scale analysis: NLP technology allows for text analysis at scale on all manner of documents, internal systems, emails, social media data, online reviews, and more.
  2. Get a more objective and accurate analysis
  3. Improve customer satisfaction
  4. Better understand your market
  5. Get actionable insights

💻 3. Main uses of NLP

  • Email filters
  • Smart assistants
  • Search results
  • Autocomplete and autocorrect text
  • Language translation
  • Chatbots
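As a taste of how one of these uses can work, below is a toy autocomplete sketch in plain Python. It assumes a very simple approach: count word frequencies in a small corpus and suggest the most frequent words that start with the typed prefix. The helpers `build_vocabulary` and `autocomplete` and the sample corpus are hypothetical; production systems use far richer language models.

```python
from collections import Counter

def build_vocabulary(corpus):
    """Count how often each lowercase word occurs in the corpus."""
    return Counter(corpus.lower().split())

def autocomplete(prefix, vocabulary, limit=3):
    """Return up to `limit` known words starting with `prefix`, most frequent first."""
    prefix = prefix.lower()
    matches = [(word, count) for word, count in vocabulary.items()
               if word.startswith(prefix)]
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in matches[:limit]]

corpus = "natural language processing makes language models and language tools natural"
vocab = build_vocabulary(corpus)
print(autocomplete("la", vocab))  # ['language']
print(autocomplete("m", vocab))   # ['makes', 'models']
```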

📃 4. Recommendation of papers on the use of NLP

In addition to the examples above, here are some super cool research papers from recent years that used NLP for different purposes. It's really worth looking at these papers and getting inspired 😉

  1. Balakrishnan, V., Khan, S., Fernandez, T., & Arabnia, H. R. (2019). Cyberbullying detection on twitter using Big Five and Dark Triad features. Personality and individual differences, 141, 252-257.
  2. Yang, X., McEwen, R., Ong, L. R., & Zihayat, M. (2020). A big data analytics framework for detecting user-level depression from social networks. International Journal of Information Management, 54, 102141.
  3. Shi, A., Qu, Z., Jia, Q., & Lyu, C. (2020, November). Rumor detection of COVID-19 pandemic on online social networks. In 2020 IEEE/ACM Symposium on Edge Computing (SEC) (pp. 376-381). IEEE.
  4. Vo, T., Sharma, R., Kumar, R., Son, L. H., Pham, B. T., Tien Bui, D., ... & Le, T. (2020). Crime rate detection using social media of different crime locations and Twitter part-of-speech tagger with Brown clustering. Journal of Intelligent & Fuzzy Systems, 38(4), 4287-4299.
  5. Shu, K., Zhou, X., Wang, S., Zafarani, R., & Liu, H. (2019, August). The role of user profiles for fake news detection. In Proceedings of the 2019 IEEE/ACM international conference on advances in social networks analysis and mining (pp. 436-439).

⌨️ 5. Best programming Language for NLP

  1. Python: Python is very popular in this field because of its versatility. It also offers developers many libraries that handle NLP-related tasks such as topic modeling, document classification, and sentiment analysis.
  2. Java: Java is another commonly used programming language in the field of natural language processing. With it, you can explore how to organize text using full-text search, information extraction, clustering, and tagging. You can use these libraries: OpenNLP, LingPipe, and Stanford CoreNLP.
  3. R: While R is best known for statistical learning, it is also widely used for natural language processing.

I recommend starting with Python, since its versatility and large ecosystem of NLP libraries make it a great choice for a natural language processing project.

Popular Python libraries to learn:

  • NLTK
  • spaCy
  • CoreNLP
  • TextBlob
  • PyNLPl
  • Gensim
  • Pattern

✅ 6. Roadmap

Prerequisite: It's very important that you have some knowledge of machine learning, especially supervised learning.
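If supervised learning is new to you, the sketch below shows the basic idea applied to text: learn from labeled examples, then predict labels for new inputs. It is a minimal multinomial Naive Bayes classifier with add-one smoothing, written with the standard library only. The functions `train` and `predict` and the four training sentences are invented for illustration, and this is just one of many possible supervised methods.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Fit a multinomial Naive Bayes model from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary

def predict(text, model):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior: how common the label is in the training data.
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words don't zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy dataset, far too small for real use.
training_data = [
    ("i love this movie", "positive"),
    ("what a great film", "positive"),
    ("i hate this movie", "negative"),
    ("what a terrible film", "negative"),
]
model = train(training_data)
print(predict("great movie", model))     # positive
print(predict("terrible movie", model))  # negative
```

The labeled pairs are what makes this "supervised": the model only learns which words go with which label because the training data says so.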

🔍 7. Open Datasets for practical Hands-on projects

General

Sentiment Analysis

Text

😎 8. Books you should have

  • Natural Language Processing with Python
  • Practical Natural Language Processing
  • Natural Language Processing in Action
  • Text Mining with R
  • Applied Text Analysis with Python
  • Natural Language Processing with PyTorch
  • Deep Learning with Text

👩‍🔬 9. Important researchers in this field that you need to follow

All of these researchers have made important contributions to academia and are highly cited on Google Scholar.

References:

https://www.analyticsvidhya.com/blog/2022/01/roadmap-to-master-nlp-in-2022/

https://www.linkedin.com/pulse/nlp-roadmap-machine-learning-2022-arya-soni/?trk=articles_directory

https://www.ibm.com/cloud/learn/natural-language-processing

https://monkeylearn.com/blog/nlp-benefits/

https://odsc.medium.com/20-open-datasets-for-natural-language-processing-538fbfaf8e38