At Landing.jobs we're trying to match great jobs with the best candidates. However, we're still missing a lot of information on the behavior of our users. For example, we currently have no information on why certain people apply and others don't.
You have data from the past year and a selection of our users. Can you create a model to predict which ones would apply on '2019-03-13'? You will be evaluated using the F1 Score.
The submitted file should be a .csv that has one column:
- person_id -> The IDs of the people that applied on '2019-03-13'
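A minimal sketch of producing a file in that shape, using only the Python standard library. The function name `write_submission` is hypothetical, and the `person_id` header row is an assumption based on the column name above; check it against "solution.csv" before relying on it.

```python
import csv
import io


def write_submission(person_ids, out):
    """Write predicted applicant IDs as a one-column CSV.

    Assumption: the file starts with a `person_id` header row.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["person_id"])
    # Deduplicate and sort so the file is stable between runs.
    for pid in sorted(set(person_ids)):
        writer.writerow([pid])


# Usage: write to an in-memory buffer (swap in open("submission.csv", "w", newline="")).
buffer = io.StringIO()
write_submission([42, 7, 42], buffer)
print(buffer.getvalue())
```

Deduplicating first matters because a repeated ID would otherwise count as an extra prediction when the file is compared against the real applicants.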
The score should be based on the F1 score when comparing the submitted IDs with what actually happened.
In the data folder, there is a file called "solution.csv" containing the actual solution we are looking for with this challenge - the real list of people that applied on '2019-03-13'. You may use it (carefully) to check how you are doing and to present your final F1 score obtained.
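As a rough sketch, the F1 score over two ID lists can be computed with plain set operations. The function name `f1_from_ids` is hypothetical, and the sketch assumes the score covers only the "applied" class, with both inputs already loaded as collections of person IDs.

```python
def f1_from_ids(predicted_ids, actual_ids):
    """F1 score between predicted applicant IDs and the real applicant IDs."""
    predicted, actual = set(predicted_ids), set(actual_ids)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        # Covers empty inputs and predictions with no overlap.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Usage: two of four predictions are correct, one real applicant is missed.
score = f1_from_ids([1, 2, 3, 4], [2, 3, 5])
print(round(score, 4))
```

Because the score depends on both precision and recall, submitting every user or almost no one will both score poorly.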
You will have 6 tables available. Their contents are the following:
- person_id: ID of the candidate
- id: ID of the application
- job_ad_id: ID of the associated Job
- submitted_at: When the application was submitted
- created_at: When the application was created (They can also be drafts)

- id: ID of the job
- company_id: ID of the company
- experience_level: Experience bucket of the job
- last_published_at: Last time the job was published
- closed_at: Time the job was closed at

Experience can be inside the following buckets:
- 1 - Junior - Less than 2 years of experience
- 2 - Intermediate - 2 to 6 years of experience
- 3 - Senior - More than 6 years of experience
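To compare a candidate with a job, it may help to place a number of years into these buckets. The sketch below is only an illustration: the function name `experience_bucket` is hypothetical, and treating a candidate's experience_level as years of experience is an assumption that the data description does not confirm.

```python
def experience_bucket(years):
    """Map years of experience to the job experience buckets listed above.

    Assumption: the input is a number of years.
    """
    if years < 2:
        return 1  # Junior: less than 2 years
    if years <= 6:
        return 2  # Intermediate: 2 to 6 years
    return 3      # Senior: more than 6 years


# Usage: bucket a few sample values.
for years in (0, 2, 6, 10):
    print(years, experience_bucket(years))
```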
- user_id: ID of the user
- id: ID of the candidate
- country_code: Country code
- experience_level: Experience level from 0 to 10+
- person_created_at: When the user was created
- availability: Category of availability
- remote: Category of remote

Availability categories
- "I'm not really looking, just curious" => 0,
- "I'm actively looking for a job" => 1,
- "I'm currently employed, but open to a new challenge" => 2
Remote Categories
- 'Yes' => 1,
- 'Remote positions only' => 2,
- 'No' => 0
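When exploring the data, it can be convenient to turn these codes back into their labels. The following is a small sketch: the names `AVAILABILITY_LABELS`, `REMOTE_LABELS` and `label_for` are hypothetical, and returning `None` for missing or unexpected codes is an assumption about how you may want to handle gaps.

```python
# Code -> label tables, copied from the category lists above.
AVAILABILITY_LABELS = {
    0: "I'm not really looking, just curious",
    1: "I'm actively looking for a job",
    2: "I'm currently employed, but open to a new challenge",
}
REMOTE_LABELS = {
    0: "No",
    1: "Yes",
    2: "Remote positions only",
}


def label_for(mapping, raw_code):
    """Return the label for a raw code, or None if it is missing or unknown."""
    try:
        # float() first so values such as "1.0" from a CSV export still parse.
        return mapping[int(float(raw_code))]
    except (KeyError, TypeError, ValueError, OverflowError):
        return None


# Usage
print(label_for(AVAILABILITY_LABELS, "1"))
print(label_for(REMOTE_LABELS, 2))
```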
- job_id: ID of the job
- canonical_tag_id: ID of the skill
- tag_name: Name of the skill

- person_id: ID of the person
- canonical_tag_id: ID of the skill
- tag_name: Name of the skill

- id: ID of the view
- time: When the visit happened
- page: Page that was visited
- user_id: ID of the user that has visited
- visit_id: ID of the visit

Some information:
- View -> Single page click
- Visit -> Collection of views
- Users -> Both people (candidates) and employees
The page field may not make a lot of sense, but that may be in part because it's anonymized. Here are some common page strings:
- at/:company_id/:job_id -> Visiting a job page
- `` -> Homepage
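One possible way to turn these page strings into features is a small classifier built on a regular expression. This is a sketch under assumptions: the name `classify_page` is hypothetical, the pattern assumes a job page is exactly `at/` followed by two path segments with no leading slash, and the extracted segments may be anonymized values rather than real IDs.

```python
import re

# Assumed shape of a job page: at/<company_id>/<job_id>
JOB_PAGE = re.compile(r"^at/(?P<company_id>[^/]+)/(?P<job_id>[^/]+)$")


def classify_page(page):
    """Return (kind, company_id, job_id) for a page string from the views table."""
    if not isinstance(page, str):
        return ("other", None, None)
    if page == "":
        # An empty page string is the homepage.
        return ("homepage", None, None)
    match = JOB_PAGE.match(page)
    if match:
        return ("job", match.group("company_id"), match.group("job_id"))
    return ("other", None, None)


# Usage
print(classify_page("at/12/345"))
print(classify_page(""))
```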
- Are all users from the views people?
- How much data do you actually need?