- Collected ~700,000 news articles from various data sources by building a web scraper in Python.
- Tokenized the article text and visualized text patterns in PySpark, giving insight into the content of the news corpus.
- Built a sequence-to-sequence LSTM to generate headlines for given articles, optimizing it with stochastic gradient descent.
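
The tokenization step might look roughly like the sketch below. It is a plain-Python illustration and not the PySpark pipeline itself; the regex, the function names `tokenize` and `top_terms`, and the sample articles are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumed token pattern: runs of lowercase letters, digits, or apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def top_terms(articles, n=3):
    """Count tokens across all articles and return the n most frequent."""
    counts = Counter()
    for article in articles:
        counts.update(tokenize(article))
    return counts.most_common(n)


# Hypothetical sample articles, for illustration only.
sample_articles = [
    "Markets rally as tech stocks surge.",
    "Tech stocks slip after the markets close.",
]

if __name__ == "__main__":
    print(top_terms(sample_articles))
```

On a real corpus the same counting would typically be distributed (for example with a PySpark map and reduce over the article text), and common stop words would likely be filtered out before visualizing the counts.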