Text analytics can be overwhelming and frustrating at times, given the unstructured and noisy nature of textual data and the vast amount of information available. "Text Analytics with Python" is a 385-page book of techniques, algorithms, experiences and lessons learnt over time in analyzing text data. This repository contains the datasets and code used in the book. I will also be adding notebooks and bonus content here from time to time. Keep watching this space!
Help needed with porting the code to Python 3.x. Please check this link if you are interested in contributing. This work is to be resumed at the end of August.
- Add code used in the book
- Add datasets used in the book
- Add book description
- Update chapter descriptions
- Add necessary code comments & documentation
- Add code used in the book ported to Python 3.x (for people using Python 3)
- Add bonus content
Derive useful insights from your data using Python. Learn the techniques related to natural language processing and text analytics, and gain the skills to know which technique is best suited to solve a particular problem.
Text Analytics with Python teaches you both basic and advanced concepts, including text and language syntax, structure, and semantics. You will focus on algorithms and techniques such as text classification, clustering, topic modeling, and text summarization.
A structured and comprehensive approach is followed in this book so that readers with little or no experience do not find themselves overwhelmed. You will start with the basics of natural language and Python and move on to advanced analytical and machine learning concepts. You will look at each technique and algorithm with both a bird's eye view to understand how it can be used as well as with a microscopic view to understand the mathematical concepts and to implement them to solve your own problems.
- Book Title: Text Analytics with Python
- Publisher: Apress (a part of Springer)
- Copyright: Dipanjan Sarkar
- Print ISBN: 978-1-4842-2387-1
- Online ISBN: 978-1-4842-2388-8
- DOI: 10.1007/978-1-4842-2388-8
This book:
- Provides complete coverage of the major concepts and techniques of natural language processing (NLP) and text analytics
- Includes practical real-world examples of techniques for implementation, such as building a text classification system to categorize news articles, analyzing app or game reviews using topic modeling and text summarization, and clustering popular movie synopses and analyzing the sentiment of movie reviews
- Shows implementations based on Python and several popular open source libraries in NLP and text analytics, such as the natural language toolkit (nltk), gensim, scikit-learn, spaCy and pattern
- Chapter 1: Natural Language Basics
- Chapter 2: Python Refresher
- Chapter 3: Processing and Understanding Text
- Chapter 4: Text Classification
- Chapter 5: Text Summarization
- Chapter 6: Text Similarity and Clustering
- Chapter 7: Semantic and Sentiment Analysis
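
To give a flavor of the kind of technique covered under text similarity (Chapter 6), here is a minimal sketch that compares two documents by the cosine similarity of their term-frequency vectors. It is not code from the book: it uses only the Python 3 standard library, the function names `tokenize` and `cosine_similarity` are hypothetical, and the tokenizer is a deliberately naive assumption (lowercase alphanumeric runs, no stopword removal or stemming).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens.

    Assumption: a naive regex tokenizer, chosen only to keep the sketch
    dependency-free. A real pipeline would likely use nltk or spaCy.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between the term-frequency vectors of two documents."""
    tf_a = Counter(tokenize(doc_a))
    tf_b = Counter(tokenize(doc_b))

    # Only terms present in both documents contribute to the dot product.
    dot = sum(tf_a[term] * tf_b[term] for term in tf_a.keys() & tf_b.keys())

    norm_a = math.sqrt(sum(count * count for count in tf_a.values()))
    norm_b = math.sqrt(sum(count * count for count in tf_b.values()))

    # An empty document has no direction, so define its similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Three of four terms are shared: 3 / (2 * 2) = 0.75
    print(cosine_similarity("the movie was great", "the movie was terrible"))
    # No shared terms: 0.0
    print(cosine_similarity("the movie was great", "stock prices fell sharply"))
```

Raw term counts treat every word as equally informative, which is why the two movie sentences score high despite opposite sentiment. Weighting schemes such as TF-IDF are a common refinement.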