A machine learning competition in Automated Deep Learning (AutoDL), co-organized by ChaLearn, Google and 4Paradigm. Accepted at NeurIPS 2019.

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

AutoDL

A data challenge in Automated Deep Learning (AutoDL), organized by ChaLearn, Google and 4Paradigm.

How to contribute

See instructions

To Prepare a Competition Bundle and Create a Copy of the AutoDL Competition

Please run:

git clone https://github.com/zhengying-liu/autodl.git
cd autodl/codalab_competition_bundle/utilities/
./make_competition_bundle.sh

You'll then see a zip file created in the directory utilities/. Upload it to a CodaLab server (such as this one) in the 'Create Competition' tab and bang!

Tips

This competition bundle is to be run on a specialized CodaLab instance such as this one and NOT on the usual server.

The major differences of the specialized CodaLab instance (compared to the usual one) are:

  1. Parallel tracks are implemented using one parent phase and several child phases. When a submission is made to the parent phase, one copy of this submission is made to each child phase. For example, suppose we have a parent phase called 'All datasets', which has 5 child phases 'Dataset 1', 'Dataset 2', ..., 'Dataset 5'. When a submission is made to 'All datasets', 5 copies of this submission are made to the 5 child phases, which creates 5+1=6 jobs (a.k.a. runs). Each job can have an ingestion step and/or a scoring step. Child jobs are handled by several workers in a parallel and asynchronous way. When the 5 child jobs terminate, the parent job receives their results (output) and is then launched.
  2. Instead of having several tasks (defined by several datasets) in a single job, each job now contains only one task (and thus one dataset). To prepare the zip files of the datasets, you therefore only need to zip the content (not the directory itself) of a directory such as

./
├── munster.data/
├── public.info (optional)

for Input data, and the content of a directory such as

./
├── munster.solution

for Reference data. With 5 child phases, 10 zip files (5 data zip files + 5 solution zip files) should thus be created and uploaded to CodaLab. Note that this differs from the usual server, where the data of several tasks are zipped in a single zip file (with another zip file for all solutions).
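
Zipping "the content, not the directory" means that munster.data/ and public.info must sit at the root of the archive, without an enclosing folder. Below is a minimal sketch of one way to do this in Python. The function name zip_directory_content is hypothetical and not part of this repository; only the standard library is used.

```python
import zipfile
from pathlib import Path


def zip_directory_content(src_dir, zip_path):
    """Zip the files under src_dir so that they sit at the archive root.

    Hypothetical helper, not part of the AutoDL repository.
    Empty sub-directories are not added to the archive.
    """
    src_dir = Path(src_dir)
    # List the files before creating the archive, so a zip written inside
    # src_dir during this call is not picked up by the directory walk.
    files = sorted(p for p in src_dir.rglob("*") if p.is_file())
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            # arcname is relative to src_dir, so src_dir itself does not
            # appear as a top-level folder in the archive.
            zf.write(path, arcname=path.relative_to(src_dir).as_posix())
    return zip_path
```

Calling this once on the Input data directory and once on the Reference data directory of each child phase would give the two zip files for that phase. Writing the zip file outside of the directory being zipped avoids including an old archive by accident.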

  3. The ingestion program and the scoring program can now be executed in parallel, as long as the feature 'Ingestion only during scoring' is checked in the Editor page of the competition (or in the yaml file of the competition). When this feature is on, scoring is launched at the same time as ingestion. When ingestion exceeds the time budget, scoring kills ingestion using its PID.

  4. Real-time feedback is implemented using the detailed_results.html page generated by the scoring program. This page can be viewed by clicking on the 'Learning Curve' button in the 'My Submissions' tab or the 'Detailed Results' button in the details of each submission. This feature lets participants view the performance of their algorithm even if the job has not terminated yet.
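
To make the parent/child phase mechanism from the first point concrete, here is a toy simulation in Python. It is only a sketch and not how CodaLab is implemented: the function names, the dummy score, and the mean aggregation in the parent job are all assumptions made for illustration, and a thread pool stands in for the CodaLab workers.

```python
from concurrent.futures import ThreadPoolExecutor


def run_child_job(phase_name, submission):
    """Stand-in for one child job (ingestion and/or scoring on one dataset)."""
    # Dummy score for illustration only; a real job would run the
    # submission on the dataset attached to this child phase.
    return {"phase": phase_name, "submission": submission,
            "score": len(phase_name) / 100}


def run_parent_job(child_results):
    """Stand-in for the parent job, launched once all child results are in."""
    scores = [r["score"] for r in child_results]
    # Averaging is an assumption; the README does not say how the parent
    # job uses the child results.
    return {"n_children": len(child_results),
            "mean_score": sum(scores) / len(scores)}


def submit_to_parent_phase(submission, child_phases, n_workers=3):
    """Copy one submission to every child phase, then run the parent job."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # One copy of the submission per child phase, handled in parallel.
        child_results = list(
            pool.map(lambda phase: run_child_job(phase, submission), child_phases)
        )
    # The parent job only starts after all child jobs have terminated.
    parent_result = run_parent_job(child_results)
    n_jobs = len(child_results) + 1  # child jobs + the parent job
    return parent_result, n_jobs
```

With the 5 child phases 'Dataset 1' to 'Dataset 5' from the example, submit_to_parent_phase returns a job count of 5+1=6.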

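The time budget enforcement described for the 'Ingestion only during scoring' feature can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual scoring program: here the watchdog starts the stand-in ingestion process itself and therefore knows its PID directly, whereas the README does not describe how the real scoring program obtains it. All function names are hypothetical.

```python
import os
import signal
import subprocess
import sys
import time


def enforce_time_budget(process, time_budget, poll_interval=0.1):
    """Watch a running process; kill it by PID if it exceeds time_budget.

    Returns True if the process had to be killed, False if it finished
    on its own within the budget (in seconds).
    """
    start = time.monotonic()
    while process.poll() is None:  # None means the process is still running
        if time.monotonic() - start > time_budget:
            os.kill(process.pid, signal.SIGTERM)  # kill using the PID
            process.wait()  # reap the terminated process
            return True
        time.sleep(poll_interval)
    return False


def run_ingestion_with_budget(ingestion_seconds, time_budget):
    """Start a stand-in 'ingestion' that just sleeps, then watch it."""
    ingestion = subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({ingestion_seconds})"]
    )
    return enforce_time_budget(ingestion, time_budget)
```

For instance, a stand-in ingestion that sleeps for 30 seconds under a 0.5 second budget is killed, while one that exits immediately is left alone.
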
Useful links: