- What is NLP?
- Why NLP? What are the benefits?
- Main uses of NLP
- Recommended papers on the use of NLP
- Best programming language
- Roadmap
- Open datasets for practical hands-on projects
- Books you should have
- Important researchers in this field that you need to follow
According to IBM, Natural language processing (NLP) refers to the branch of computer science—and more specifically, the branch of artificial intelligence or AI—concerned with giving computers the ability to understand text and spoken words in much the same way human beings can.
NLP combines computational linguistics—rule-based modeling of human language—with statistical, machine learning, and deep learning models. Together, these technologies enable computers to process human language in the form of text or voice data and to ‘understand’ its full meaning, complete with the speaker or writer’s intent and sentiment.
You have probably wondered what the benefits of using NLP in your systems are. Let's take a quick look 🤩
- Perform large-scale analysis: NLP technology allows for text analysis at scale on all manner of documents, internal systems, emails, social media data, online reviews, and more.
- Get a more objective and accurate analysis
- Improve customer satisfaction
- Better understand your market
- Get actionable insights
- Email filters
- Smart assistants
- Search results
- Autocomplete and autocorrect text
- Language translation
- Chatbots
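To make one of these uses concrete, here is a minimal sketch of how autocomplete could work: count which word most often follows another in some text, then suggest the most frequent followers. The toy corpus and the names `tokenize`, `build_bigram_model` and `suggest` are made up for illustration; real autocomplete systems use far larger corpora and much stronger language models.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_bigram_model(sentences):
    """For every word, count the words that directly follow it."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        tokens = tokenize(sentence)
        for current, nxt in zip(tokens, tokens[1:]):
            followers[current][nxt] += 1
    return followers


def suggest(model, word, k=3):
    """Return up to k of the most frequent next words seen after `word`."""
    counts = model.get(word.lower(), Counter())
    return [candidate for candidate, _ in counts.most_common(k)]


# Toy corpus, invented for this example only.
corpus = [
    "thank you for your email",
    "thank you for your help",
    "thank you very much",
    "see you for lunch",
]

model = build_bigram_model(corpus)
print(suggest(model, "you"))   # most frequent followers of "you"
print(suggest(model, "your"))  # followers of "your"
```

On this toy corpus, "for" is suggested first after "you" because it follows "you" in three of the four sentences. A word the model has never seen simply gets an empty suggestion list.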
In addition to the examples above, I'm bringing some super cool research from recent years that used NLP for different purposes. It's really worth looking at these papers and getting inspired 😉
- Balakrishnan, V., Khan, S., Fernandez, T., & Arabnia, H. R. (2019). Cyberbullying detection on Twitter using Big Five and Dark Triad features. Personality and Individual Differences, 141, 252-257.
- Yang, X., McEwen, R., Ong, L. R., & Zihayat, M. (2020). A big data analytics framework for detecting user-level depression from social networks. International Journal of Information Management, 54, 102141.
- Shi, A., Qu, Z., Jia, Q., & Lyu, C. (2020, November). Rumor detection of COVID-19 pandemic on online social networks. In 2020 IEEE/ACM Symposium on Edge Computing (SEC) (pp. 376-381). IEEE.
- Vo, T., Sharma, R., Kumar, R., Son, L. H., Pham, B. T., Tien Bui, D., ... & Le, T. (2020). Crime rate detection using social media of different crime locations and Twitter part-of-speech tagger with Brown clustering. Journal of Intelligent & Fuzzy Systems, 38(4), 4287-4299.
- Shu, K., Zhou, X., Wang, S., Zafarani, R., & Liu, H. (2019, August). The role of user profiles for fake news detection. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (pp. 436-439).
- Python: Python is very popular in this field because of its versatility, and it offers developers many libraries that handle NLP-related tasks such as topic modeling, document classification, and sentiment analysis.
- Java: Java is another commonly used programming language in the field of natural language processing. With it, you can explore how to organize text using full-text search, information extraction, clustering, and tagging. You can use these libraries: OpenNLP, LingPipe, and Stanford CoreNLP.
- R: While R is best known for statistical learning, it's also widely used for natural language processing.
I recommend you start with Python: its versatility and large set of NLP libraries make it a great first choice for a natural language processing project.
Popular Python libraries for you to learn:
- NLTK
- spaCy
- CoreNLP
- TextBlob
- PyNLPl
- Gensim
- Pattern
Prerequisite: It's very important that you have some knowledge of machine learning, especially supervised learning.
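If supervised learning is new to you, the idea is: learn from labeled examples, then predict labels for new text. Here is a minimal sketch of a Naive Bayes sentiment classifier written in plain Python. The training sentences, the labels, and the function names `train` and `predict` are invented for illustration. In a real project you would more likely use a library implementation and one of the open datasets.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for token in tokenize(text):
            counts[token] += 1
            vocab.add(token)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        # Prior: how common this label is in the training data.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # skip words never seen during training
            # Likelihood with add-one (Laplace) smoothing.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Tiny labeled dataset, made up for this example only.
training_data = [
    ("great food and friendly service", "positive"),
    ("loved the pasta, great place", "positive"),
    ("terrible service and cold food", "negative"),
    ("awful experience, never again", "negative"),
]

model = train(training_data)
print(predict(model, "great service"))
print(predict(model, "awful cold food"))
```

With this toy data, "great service" is labeled positive and "awful cold food" negative. Four sentences are far too few for a useful model, but the train-then-predict pattern is the same one you will see in most supervised NLP projects.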
- Yelp Reviews
- Dictionaries for Movies and Finance
- OpinRank Dataset
- Amazon Reviews
- Portuguese Tweets
- Financial News
- Women's E-Commerce Clothing Reviews
- Rick&Morty Scripts
- The WikiQA Corpus
- European Parliament Proceedings Parallel Corpus
- Jeopardy
- Legal Case Reports Dataset
- SciFi Stories Text Corpus
- Ecommerce Text Classification
- Tweets about Lord of the Rings: The Rings of Power
- Natural Language Processing with Python
- Practical Natural Language Processing
- Natural Language Processing in Action
- Text Mining with R
- Applied Text Analysis with Python
- Natural Language Processing with PyTorch
- Deep Learning with Text
All of these researchers have made important contributions to academia and are highly cited on Google Scholar.
References:
https://www.analyticsvidhya.com/blog/2022/01/roadmap-to-master-nlp-in-2022/
https://www.linkedin.com/pulse/nlp-roadmap-machine-learning-2022-arya-soni/?trk=articles_directory
https://www.ibm.com/cloud/learn/natural-language-processing
https://monkeylearn.com/blog/nlp-benefits/
https://odsc.medium.com/20-open-datasets-for-natural-language-processing-538fbfaf8e38