- Import-/ File-Parsing Challenge
- Notes
- Output
You are working on an application to build survey software. There is a legacy system that you need to import data from. Survey responses are arranged in a flat file (TSV - tab separated), with a row per user and a column per question.
| | q14 | q17 | q18 | q18 | q23 |
|---|---|---|---|---|---|
| u1 | 1 | 3 | 1 | 2 | 4 |
| u3 | 3 | 6 | 2 | 4 | 3 |
| u4 | 2 | 2 | 6 | 46 | 2 |
There are 50,000 users and 400 questions in the spreadsheet (question ids do not repeat). Your task is to import the spreadsheet into this schema (assume that user and question are prepopulated):
Please write code to do this. You can stub out the database access class.
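Since the database access class may be stubbed out, an in-memory stand-in is enough to exercise the import. The sketch below is hypothetical, not the class from this repository: `StubResponseStore` and `insert_all` are made-up names, and the hash keys (`question_id`, `user_id`, `response`) are taken from the sample output at the end of this README.

```ruby
# Hypothetical in-memory stand-in for the database access class.
# It only records what it is given; a real class would issue INSERTs here.
class StubResponseStore
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Accepts an array of response hashes and returns how many were stored.
  def insert_all(responses)
    @rows.concat(responses)
    responses.size
  end
end

store = StubResponseStore.new
store.insert_all([
  { question_id: 14, user_id: 1, response: 1 },
  { question_id: 17, user_id: 1, response: 4 }
])
```

A stub like this keeps the specs free of any real database while still letting them check which rows the importer tried to write.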
I tried to follow this pattern:
- Make it work.
- Clean up the code.
- Make it fast.
I focused on TDD'ing everything, even the rake task.
`Parser.rb` is completely file-agnostic, as it is only ever passed a `row`.
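To illustrate what "only being passed a row" can look like, here is a minimal sketch. It is not the actual parser from this repository: `RowParser` and `parse` are hypothetical names, and the sketch assumes header cells such as `q14` have already been turned into integers and that user cells look like `u1`.

```ruby
# Hypothetical sketch of a file-agnostic row parser.
# It never touches a file: it only sees the question ids and one split row.
module RowParser
  # question_ids: integers taken from the header, e.g. [14, 17, 34]
  # row:          the cells of one line, e.g. ["u1", "1", "4", "5"]
  def self.parse(question_ids, row)
    user_cell, *answers = row
    user_id = user_cell.delete_prefix("u").to_i

    # Pair each question id with the answer in the same column.
    question_ids.zip(answers.map(&:to_i)).map do |question_id, response|
      { question_id: question_id, user_id: user_id, response: response }
    end
  end
end

RowParser.parse([14, 17, 34], %w[u1 1 4 5]).each { |response| p response }
```

Because the parser receives plain arrays, it can be tested without any fixture file, and the same code works whether the rows come from a TSV, a CSV, or a test double.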
Performance could definitely be improved. As of right now, I am calling a fairly expensive double loop:
I am looping over every row and every question. Also, in `parser.rb` I could have chosen a different approach that avoids the extra map (`.map(&:to_i)`).
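For illustration, a row-by-question double loop might look roughly like the sketch below. It is an assumption about the shape of the code, not a copy of it: `each_response` is a hypothetical name, and the sketch assumes a header whose first cell is empty followed by cells like `q14`, and rows that start with a user cell like `u1`. With 50,000 users and 400 questions, the inner block runs 20,000,000 times.

```ruby
require "stringio"

# Hypothetical sketch of the double loop: every row, then every question.
def each_response(io)
  header = io.gets.chomp.split("\t")
  # The first header cell belongs to the user column, so skip it.
  question_ids = header.drop(1).map { |cell| cell.delete_prefix("q").to_i }

  io.each_line do |line| # outer loop: one iteration per user row
    user_cell, *answers = line.chomp.split("\t")
    user_id = user_cell.delete_prefix("u").to_i

    question_ids.each_with_index do |question_id, index| # inner loop: one per question
      # Note: a missing cell is nil, and nil.to_i is 0 in this sketch.
      yield({ question_id: question_id, user_id: user_id, response: answers[index].to_i })
    end
  end
end

tsv = StringIO.new("\tq14\tq17\nu1\t1\t4\nu2\t5\t3\n")
responses = []
each_response(tsv) { |response| responses << response }
```

Reading line by line keeps memory flat, since only one row is held at a time. Converting each answer inside the inner loop, as done here, is one way to avoid building an intermediate array with `.map(&:to_i)`.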
However, this will run perfectly fine as a background task where performance does not really matter. 🤘
{:question_id=>14, :user_id=>1, :response=>1}
{:question_id=>17, :user_id=>1, :response=>4}
{:question_id=>34, :user_id=>1, :response=>5}
{:question_id=>14, :user_id=>2, :response=>5}
{:question_id=>17, :user_id=>2, :response=>3}
{:question_id=>34, :user_id=>2, :response=>2}
{:question_id=>14, :user_id=>3, :response=>3}
{:question_id=>17, :user_id=>3, :response=>1}
{:question_id=>34, :user_id=>3, :response=>1}
=> nil