Pinned Repositories
Call-Center-Attrition-Prediction-
Chunk-particular-Parts-of-Speech-using-Python
Chunk particular parts of speech using Python
courses
Course materials for the Data Science Specialization: https://www.coursera.org/specialization/jhudatascience/1
Display-Google-News-in-Python
feature-engineering-in-R
A repository of feature-engineering (FE) techniques in R
Integrate-Gmail-with-R
How to read email (Gmail) messages from your inbox into R
Main-Topic-Identifier-in-an-essay-or-article-
Main Topic Identifier in an essay or article
R-sample-code
Scrap-job-listing-and-get-job-details
Scrape job listings and get job details
Simple-Summarizing-tool-using-Python
The intersection function: receives two sentences and returns a score for the intersection between them. We split each sentence into words/tokens, count how many tokens they have in common, and normalize the result by the average length of the two sentences.

Computer science view: f(s1, s2) = |{w | w in s1 and w in s2}| / ((|s1| + |s2|) / 2)

The sentences dictionary: this part is the heart of the algorithm. It receives our text as input and calculates a score for each sentence. The calculation consists of two steps. In the first step we split the text into sentences and store the intersection value between each pair of sentences in a matrix (a two-dimensional array), so values[0][2] holds the intersection score between sentence #1 and sentence #3.

Computer science view: we have just converted our text into a fully connected weighted graph. Each sentence is a node in the graph, and the two-dimensional array holds the weight of each edge.

In the second step we calculate an individual score for each sentence and store it in a key-value dictionary, where the sentence itself is the key and the total score is the value. We do that by summing up all its intersections with the other sentences in the text (not including itself).

Computer science view: we calculate the score for each node in the graph by summing the weights of all edges connected to it.

Building the summary: the final step of the algorithm is generating the summary. We do that by splitting the text into paragraphs and then choosing the best sentence from each paragraph according to the sentences dictionary.

Computer science view: the idea is that every paragraph in the text represents some logical subset of the graph, so we just pick the most valuable node from each subset.
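The steps above can be sketched in Python. This is a minimal illustration, not the repository's actual code: the function names are invented, sentence/paragraph splitting is naive (splitting on "." and blank lines), and case-insensitive token matching is an implementation choice.

```python
def sentence_intersection(s1, s2):
    # Tokenize by whitespace; lowercase for a case-insensitive match (an assumption).
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    if not w1 or not w2:
        return 0.0
    # Common-token count normalized by the average sentence length:
    # f(s1, s2) = |s1 ∩ s2| / ((|s1| + |s2|) / 2)
    return len(w1 & w2) / ((len(w1) + len(w2)) / 2.0)

def score_sentences(sentences):
    # Build the intersection matrix: values[i][j] is the edge weight
    # between sentence i and sentence j in the fully connected graph.
    n = len(sentences)
    values = [[sentence_intersection(sentences[i], sentences[j])
               for j in range(n)] for i in range(n)]
    # Each sentence's score is the sum of its edges to all other
    # sentences (subtract the self-intersection on the diagonal).
    return {sentences[i]: sum(values[i]) - values[i][i] for i in range(n)}

def summarize(text):
    # Naive splits: paragraphs on blank lines, sentences on periods.
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    all_sentences = [s.strip() for p in paragraphs
                     for s in p.split(".") if s.strip()]
    scores = score_sentences(all_sentences)
    # Pick the highest-scoring sentence from each paragraph.
    summary = [max((s.strip() for s in p.split(".") if s.strip()),
                   key=lambda s: scores.get(s, 0.0))
               for p in paragraphs]
    return ". ".join(summary) + "."
```

Note that the scores are computed over all sentences in the text, while the selection is done per paragraph, matching the "most valuable node from each subset" idea above.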
yrgowtham's Repositories
yrgowtham/Simple-Summarizing-tool-using-Python
yrgowtham/Call-Center-Attrition-Prediction-
yrgowtham/Chunk-particular-Parts-of-Speech-using-Python
Chunk particular parts of speech using Python
yrgowtham/courses
Course materials for the Data Science Specialization: https://www.coursera.org/specialization/jhudatascience/1
yrgowtham/Display-Google-News-in-Python
yrgowtham/feature-engineering-in-R
A repository of feature-engineering (FE) techniques in R
yrgowtham/Integrate-Gmail-with-R
How to read email (Gmail) messages from your inbox into R
yrgowtham/Main-Topic-Identifier-in-an-essay-or-article-
Main Topic Identifier in an essay or article
yrgowtham/R-sample-code
yrgowtham/Scrap-job-listing-and-get-job-details
Scrape job listings and get job details
yrgowtham/Summarizing-customer-review-using-NLTK-and-Sentiment-API
Summarizing customer review using NLTK and Sentiment API
yrgowtham/Titanic-Machine-Learning-from-Disaster
The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships. One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper class.
yrgowtham/Twitter-Data-Analysis-using-R
Twitter Data Analysis using R
yrgowtham/Word_Analytics