Bayesian(,) pl(ea)s(e)!

Bayesian Pseudo-Label Selection (Bayesian-PLS)

Introduction, TOC

This repository contains code for selecting pseudo-labels the Bayesian way, as introduced in the paper "Bayes Optimal Pseudo-Label Selection for Semi-Supervised Learning".

  • R contains the implementation of BPLS with PPP and of alternative PLS methods to benchmark against
  • benchmarking provides the files for the experiments (section 4 of the paper); to reproduce the results, see Setup below
  • data contains the real-world data used in the experiments
  • experimental results and their visualizations are saved in results and plots

Tested with

  • R 4.2.0
  • R 4.1.6
  • R 4.0.3

on

  • Linux Ubuntu 20.04
  • Linux Debian 10
  • Windows 11 Pro Build 22H2

Setup

First and foremost, please install all dependencies by sourcing this file.

Then download the implementations of BPLS with PPP and of the competing PLS methods and save them in a folder named "R":

To reproduce the paper's key results (and the visualizations thereof), also download these scripts and save them in the respective folders:

Finally, download benchmarks/experiments_simulated_data.R and run it from benchmarks/ (estimated runtime: 30 CPU hours).

Important: Create empty folders named results and plots; the experimental results are stored there automatically. Once the experiments have finished, you can also access the results as R objects.
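
The folders can also be created from within R. The following is a minimal sketch, not part of the repository: the helper name ensure_output_dirs and its arguments are hypothetical, and it assumes that results and plots sit at the top level of the repository next to R and data, which the folder overview suggests but does not state outright.

```r
# Hypothetical helper (not part of the repository): create the output
# folders if they do not exist yet and report which ones are present.
ensure_output_dirs <- function(base_dir = ".", dirs = c("results", "plots")) {
  paths <- file.path(base_dir, dirs)
  for (p in paths) {
    # recursive = TRUE also creates missing parent folders;
    # showWarnings = FALSE keeps the call quiet if the folder already exists
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  # Named logical vector: TRUE for every folder that now exists
  stats::setNames(dir.exists(paths), paths)
}

# Example usage (assumption: the working directory is the repository root):
# ensure_output_dirs(".")
# setwd("benchmarks")
# source("experiments_simulated_data.R")  # estimated runtime: 30 CPU hours
```

If the scripts expect the folders elsewhere, for example inside benchmarks/, pass that path as base_dir instead.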

Further experiments

Additional experimental setups can be created by modifying benchmarks/experiments_simulated_data.R.

Data

The data sets and the scripts for reading them in are in the folder data.