This is a technical test for data engineering candidates for the Pulsar Team at Dow Jones.
Please write a solution to the data challenge using Scio. Your source files can be found in the datasets directory.
Please complete this challenge using the instructions in the following README.
Candidates should do this test on their own. It is designed to be done in 2 or 3 hours, but there is no hard limit.
Once you have completed the challenge, be prepared to walk the team through a code review of your solution. We look forward to seeing what you come up with!