Posts-Analyser

Post analysis (and possibly prediction) for questions/answers from Stack Overflow.

Plan for Analysis of Posts

Get data into a pandas DataFrame
Use the following as features:

Word count
Readability index (see Readability Indices below)
Contains Source Code
Contains LaTeX math
Sentiment Analysis (possibly predictive, but intended mainly for descriptive use)
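
The simpler features in this list (word count, source-code flag, LaTeX flag) could be computed roughly as in the sketch below. It rests on assumptions this README does not state: post bodies are HTML strings, code sits inside <code> tags, and LaTeX math is delimited by $...$, $$...$$ or \begin{...}. The function name basic_features is hypothetical.

```python
import re

# Assumption: code appears inside <code>...</code> tags in an HTML post body.
CODE_RE = re.compile(r"<code\b[^>]*>.*?</code>", re.DOTALL | re.IGNORECASE)
# Assumption: LaTeX math uses $...$, $$...$$ or \begin{...}.
# This is a heuristic; text such as "$5 and $10" would also match.
LATEX_RE = re.compile(r"\$\$.+?\$\$|\$[^$\n]+\$|\\begin\{[a-zA-Z*]+\}", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def basic_features(body):
    """Return word count and code/LaTeX presence flags for one post body."""
    contains_code = CODE_RE.search(body) is not None
    # Drop code spans first so shell variables such as $HOME and $PATH
    # inside code are not mistaken for math.
    without_code = CODE_RE.sub(" ", body)
    contains_latex = LATEX_RE.search(without_code) is not None
    # Count whitespace-separated tokens in the prose only (code excluded).
    text = TAG_RE.sub(" ", without_code)
    word_count = len(text.split())
    return {
        "word_count": word_count,
        "contains_code": contains_code,
        "contains_latex": contains_latex,
    }
```

Once the posts are in a pandas DataFrame, the same function could be applied to each row, for example with df["Body"].apply(basic_features), where the column name Body is an assumption about the dataset.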

Run a Predictive Analysis
Also consider descriptive analysis of different Stack Exchange communities.

Readability Indices

Flesch-Kincaid Grade

Gunning Fog Index

Coleman-Liau Index (Using)

SMOG Index

Automated Readability Index

Flesch Reading Ease (Using)

Spache Score

New Dale-Chall Score (Using)
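
As an illustration of how one of the indices marked as in use is calculated, here is a minimal sketch of the Coleman-Liau Index, CLI = 0.0588 * L - 0.296 * S - 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words. The word and sentence splitting below is a simplifying assumption (regular expressions rather than a proper tokenizer), and the function name is hypothetical; a readability library may tokenize differently and give slightly different scores.

```python
import re


def coleman_liau_index(text):
    """Approximate Coleman-Liau Index for a piece of plain text.

    Returns None when the text contains no words or no sentences.
    """
    # Assumption: a word is a run of ASCII letters, optionally with an apostrophe part.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    # Assumption: sentences end at '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return None
    letters = sum(ch.isalpha() for word in words for ch in word)
    letters_per_100_words = letters / len(words) * 100
    sentences_per_100_words = len(sentences) / len(words) * 100
    return 0.0588 * letters_per_100_words - 0.296 * sentences_per_100_words - 15.8
```

Very short, simple text can produce a negative score, so results for short posts should be read with care.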

Other Measures

Code Count (whether code is present)

LaTeX Count (whether LaTeX code is present)

Punctuation Count (how much punctuation is present)

Cleaned Text (Usable for sentiment analysis)
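
The punctuation count and cleaned text measures could look something like the sketch below. It assumes HTML post bodies with code inside <pre> or <code> tags, and it treats the characters in Python's string.punctuation (ASCII punctuation only) as punctuation. Both function names are hypothetical.

```python
import html
import re
import string

# Assumption: code appears inside <pre>...</pre> or <code>...</code> tags.
CODE_RE = re.compile(
    r"<pre\b[^>]*>.*?</pre>|<code\b[^>]*>.*?</code>",
    re.DOTALL | re.IGNORECASE,
)
TAG_RE = re.compile(r"<[^>]+>")


def clean_text(body):
    """Strip code blocks and HTML tags, decode entities, collapse whitespace."""
    no_code = CODE_RE.sub(" ", body)
    no_tags = TAG_RE.sub(" ", no_code)
    unescaped = html.unescape(no_tags)
    return " ".join(unescaped.split())


def punctuation_count(text):
    """Count ASCII punctuation characters in the given text."""
    return sum(ch in string.punctuation for ch in text)
```

Counting punctuation on the cleaned text rather than the raw body keeps HTML markup and code symbols out of the total; whether that is the intended behaviour here is an assumption.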

Resources

Datasets

Kaggle