TODO1:
The Features.csv data set is in the F2/Data Set repository. Use the Pandas library to handle the following cases:
- Replace the empty values in the actions column with zero
- Print the sum of the followers across all tweets, using a reduce operation
- Print the IDs of the tweets whose location is ‘UK’
- Print how many spam tweets are in this data set
- Print the IDs of the tweets whose following count is more than 5000
- Print the IDs of all tweets that include ‘#’ in the tweet text
- Print the IDs of all tweets that include a URL in the tweet text, using a regex - Optional Task 5
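The cases above could be sketched roughly as follows. This is only one possible approach, not a reference solution. The column names (`Id`, `Tweet`, `following`, `followers`, `actions`, `location`, `Type`), the spam label `"Spam"`, the URL pattern, and the helper name `summarize` are all assumptions made for illustration; check them against the actual header of Features.csv.

```python
from functools import reduce

import pandas as pd

# Assumed pattern: anything starting with http:// or https:// counts as a URL.
URL_PATTERN = r"https?://\S+"


def summarize(df: pd.DataFrame) -> dict:
    """Compute each TODO1 answer from a DataFrame (hypothetical helper)."""
    df = df.copy()  # leave the caller's frame untouched

    # Replace empty (NaN) values in the actions column with zero.
    df["actions"] = df["actions"].fillna(0)

    # Sum the followers with functools.reduce instead of Series.sum().
    total_followers = reduce(lambda acc, n: acc + n, df["followers"].fillna(0), 0)

    # Boolean masks for the text-based filters; na=False skips missing tweets.
    has_hashtag = df["Tweet"].str.contains("#", regex=False, na=False)
    has_url = df["Tweet"].str.contains(URL_PATTERN, regex=True, na=False)

    return {
        "actions": df["actions"].tolist(),
        "total_followers": int(total_followers),
        "uk_ids": df.loc[df["location"] == "UK", "Id"].tolist(),
        "spam_count": int((df["Type"] == "Spam").sum()),  # assumed label
        "following_ids": df.loc[df["following"] > 5000, "Id"].tolist(),
        "hashtag_ids": df.loc[has_hashtag, "Id"].tolist(),
        "url_ids": df.loc[has_url, "Id"].tolist(),
    }
```

To run it on the real file, load the CSV first and print each entry, for example `for name, value in summarize(pd.read_csv("Features.csv")).items(): print(name, value)`, adjusting the path to wherever the file lives in the repository.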
TODO2:
- Create a simple String calculator with a method int Add(string numbers). The method can take 0, 1 or 2 numbers and returns their sum (for an empty string it returns 0), for example “”, “1” or “1,2”. Start with the simplest test case of an empty string, then move on to one and two numbers.
- Allow the Add method to handle an unknown amount of numbers
- Allow the Add method to handle new lines between numbers (instead of commas). The following input is OK: “1\n2,3” (will equal 6). The following input is NOT OK: “1,\n” (no need to prove it, just clarifying).
- Support different delimiters. To change the delimiter, the beginning of the string will contain a separate line that looks like this: “//[delimiter]\n[numbers…]”. For example, “//;\n1;2” should return 3, where the delimiter is ‘;’. The first line is optional, and all existing scenarios should still be supported.
- Calling Add with a negative number will throw an exception “negatives not allowed” that includes the negative that was passed. If there are multiple negatives, show all of them in the exception message.
- Numbers bigger than 1000 should be ignored, so adding 2 + 1001 = 2.
- Delimiters can be of any length with the following format: “//[delimiter]\n” for example: “//[***]\n1***2***3” should return 6.
- Allow multiple delimiters like this: “//[delim1][delim2]\n” for example “//[*][%]\n1*2%3” should return 6.
- Make sure you can also handle multiple delimiters that are longer than one character, for example “//[**][%%]\n1**2%%3” should return 6.
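One way the finished calculator might look, written in Python to stay consistent with the Pandas task, is sketched below. The function name `add` (instead of `Add`), the use of `ValueError` for the exception, and the choice to let malformed input such as “1,\n” fail with the built-in `int()` error are assumptions of this sketch, not requirements from the task.

```python
import re


def add(numbers: str) -> int:
    """Sum the numbers in a delimited string (sketch of the kata's Add)."""
    if not numbers:
        return 0

    delimiters = [",", "\n"]  # default delimiters always stay supported
    body = numbers

    # Optional header line: //;  or  //[***]  or  //[*][%]
    if numbers.startswith("//"):
        header, _, body = numbers.partition("\n")
        spec = header[2:]
        bracketed = re.findall(r"\[(.+?)\]", spec)
        if bracketed:
            delimiters += bracketed
        elif spec:
            delimiters.append(spec)

    if not body:
        return 0

    # Try longer delimiters first so a short one cannot split a long one.
    ordered = sorted(delimiters, key=len, reverse=True)
    pattern = "|".join(re.escape(d) for d in ordered)

    # int("") raises ValueError, so input like "1,\n" is rejected here.
    values = [int(token) for token in re.split(pattern, body)]

    negatives = [v for v in values if v < 0]
    if negatives:
        listed = ", ".join(str(v) for v in negatives)
        raise ValueError(f"negatives not allowed: {listed}")

    # Numbers bigger than 1000 are ignored.
    return sum(v for v in values if v <= 1000)
```

For instance, `add("//[*][%]\n1*2%3")` evaluates to 6 and `add("2,1001")` evaluates to 2, while `add("1,-2,-3")` raises a `ValueError` whose message lists both -2 and -3.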