The code was written in two days as part of my recruiting process at SAP in November 2018.
-
Analyze the book “War and Peace” by Leo Tolstoy by means of Data Science and Machine Learning algorithms.
-
Please use the document “War and Peace” by Leo Tolstoy at https://en.wikisource.org/wiki/War_and_Peace/Book_One and download the text version using the “Choose format” link on the left side.
-
Your analysis has to be repeatable. That means the script or program used for the analysis can be re-applied to the document in question and produces the same results each time. If you use additional sources in your analysis, please document those sources and how they were retrieved.
-
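
One way to make repeatability checkable is to record a checksum of the downloaded text and fix any random seeds before the analysis runs. The following is only a minimal sketch under assumptions: the helper name `sha256_of_file`, the constant `SEED`, and the idea of comparing against a stored checksum are illustrative and not part of the original task.

```python
import hashlib
import random

SEED = 42  # hypothetical fixed seed so any randomized step gives the same result


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def prepare_run(path, expected_sha256=None):
    """Seed the RNG and verify the input file if a checksum is known."""
    random.seed(SEED)
    actual = sha256_of_file(path)
    if expected_sha256 is not None and actual != expected_sha256:
        raise ValueError("Input text differs from the version analyzed before")
    return actual
```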
You can pick a development environment and programming language of your choice.
-
The only requirement is that you can explain your analysis in 25 minutes (including questions and answers).
-
Data Science often doesn’t start with clear questions. The following is a list of ideas for what could be analyzed:
- Word and phrase distribution.
- Identify all the places and characters in the book.
- Identify sentiments of the different sentences and chapters.
- Summarize paragraphs.
- Extract intents from sentences.
- Track key concepts throughout the book.
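
As an illustration of the first idea in the list, word and phrase distributions can be sketched with the Python standard library alone. This is a minimal sketch, not the analysis actually submitted: the function names `tokenize` and `ngram_counts` are hypothetical, the tokenizer is a deliberately simple regular expression, and the sample sentence is a made-up placeholder rather than text from the book.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract words, keeping inner apostrophes."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def ngram_counts(tokens, n=1):
    """Count contiguous n-grams; n=1 yields the plain word distribution."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


# Placeholder input; in the real analysis this would be the downloaded book text.
sample = "The prince spoke, and the prince smiled."
tokens = tokenize(sample)

word_counts = ngram_counts(tokens, n=1)
bigram_counts = ngram_counts(tokens, n=2)

print(word_counts.most_common(3))
print(bigram_counts.most_common(3))
```

Applied to the full text, the same two calls give a frequency table for single words and for two-word phrases; removing stop words first would be a natural refinement.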