BabyLM Challenge
Pre-training language models with limited data

Dataset exploration
Tokenizer analysis
Baseline training
Tuning with task rewards