Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

CrowdLayer

A neural network layer that enables training of deep neural networks directly from crowdsourced labels (e.g. from Amazon Mechanical Turk) or, more generally, labels from multiple annotators with different biases and levels of expertise, as proposed in the paper:

Rodrigues, F. and Pereira, F. Deep Learning from Crowds. In Proc. of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18).

This implementation is based on Keras and TensorFlow.

Usage

Classification

Using the crowd layer in your own Keras deep neural networks for classification problems is straightforward. Given a sequential model in Keras, add a "CrowdsClassification" layer as the last layer of the model (on top of what would normally be your output layer, e.g. "Dense" with softmax activation) and use the specialized cross-entropy loss, which handles missing answers (encoded with "-1"):

# build your own base model for classification
model = Sequential()
model.add(...) 
...

# add crowd layer on top of the base model
model.add(CrowdsClassification(N_CLASSES, N_ANNOT, conn_type="MW"))

# instantiate specialized masked loss to handle missing answers
loss = MaskedMultiCrossEntropy().loss

# compile model with masked loss
model.compile(optimizer='adam', loss=loss)

# train the model
model.fit(...)
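To illustrate what "handling missing answers" means, here is a minimal NumPy sketch of a masked multi-annotator cross-entropy. It is not the library's implementation. It assumes the annotators' answers are one-hot encoded into an array of shape (n_samples, N_CLASSES, N_ANNOT), with "-1" in every class slot when an annotator did not label a sample. The function names `one_hot_with_missing` and `masked_multi_cross_entropy` are hypothetical.

```python
import numpy as np

def one_hot_with_missing(answers, n_classes):
    """Encode integer answers of shape (n_samples, n_annot), where -1 means
    'no answer', as an array of shape (n_samples, n_classes, n_annot).
    Missing answers stay -1 in every class slot (assumed encoding)."""
    n_samples, n_annot = answers.shape
    encoded = -np.ones((n_samples, n_classes, n_annot))
    for i in range(n_samples):
        for r in range(n_annot):
            if answers[i, r] != -1:
                encoded[i, :, r] = 0.0
                encoded[i, answers[i, r], r] = 1.0
    return encoded

def masked_multi_cross_entropy(y_true, logits):
    """Cross-entropy per (sample, annotator) pair, set to zero where the
    annotator gave no answer. Both inputs: (n_samples, n_classes, n_annot)."""
    # numerically stable log-softmax over the class axis
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_answer = -(y_true * log_probs).sum(axis=1)
    # an answer is missing when its class slots hold -1
    missing = y_true[:, 0, :] == -1
    return np.where(missing, 0.0, per_answer)

# two samples, two classes, two annotators; annotator 1 skipped sample 0
answers = np.array([[0, -1], [1, 1]])
y_true = one_hot_with_missing(answers, n_classes=2)
loss = masked_multi_cross_entropy(y_true, np.zeros((2, 2, 2)))
```

Because masked entries contribute exactly zero loss, they also contribute zero gradient, so each annotator's parameters are only updated on the samples that annotator actually labelled.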

Once the network is trained, you can remove the crowd layer from the model, exposing the bottleneck layer, and use it to make predictions:

# remove crowd layer before making predictions
model.pop() 
model.compile(optimizer='adam', loss='categorical_crossentropy')

# make predictions
predictions = model.predict(...)
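The following NumPy sketch suggests why removing the crowd layer leaves a usable classifier. It assumes, following the paper's description, that conn_type="MW" gives each annotator its own N_CLASSES x N_CLASSES weight matrix applied to the bottleneck output, and that these matrices start as identities. The names `crowd_layer_mw` and `identity_init` are hypothetical and not part of the library.

```python
import numpy as np

def identity_init(n_classes, n_annot):
    """One identity matrix per annotator, shape (n_classes, n_classes, n_annot)."""
    return np.stack([np.eye(n_classes)] * n_annot, axis=-1)

def crowd_layer_mw(probs, weights):
    """Map bottleneck outputs (n_samples, n_classes) to per-annotator scores
    (n_samples, n_classes, n_annot) using one weight matrix per annotator."""
    return np.einsum("nc,cka->nka", probs, weights)

probs = np.array([[0.7, 0.3]])            # bottleneck (softmax) output
weights = identity_init(n_classes=2, n_annot=2)
weights[:, :, 1] = [[0.0, 1.0],           # annotator 1 systematically
                    [1.0, 0.0]]           # swaps the two classes
scores = crowd_layer_mw(probs, weights)

# after training, predictions come from the bottleneck alone
predicted_class = probs.argmax(axis=1)
```

In this sketch the annotator-specific distortions live entirely in `weights`, so dropping the layer discards them and leaves the bottleneck output as the model's own estimate of the label.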

For details, see the paper.

Regression

Using the crowd layer in your own Keras deep neural networks for regression problems is very similar to the classification case:

# build your own base model for regression
model = Sequential()
model.add(...) 
...

# add crowd layer on top of the base model
model.add(CrowdsRegression(N_ANNOT, conn_type="B"))

# instantiate specialized masked loss to handle missing answers
loss = MaskedMultiMSE().loss

# compile model with masked loss
model.compile(optimizer='adam', loss=loss)

# train the model
model.fit(...)

Once the network is trained, you can remove the crowd layer from the model, exposing the bottleneck layer, and use it to make predictions.
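As a rough illustration of the regression case, the NumPy sketch below assumes that conn_type="B" adds one learned bias per annotator to the bottleneck prediction, and that missing answers are marked with a sentinel value. The sentinel used by the library for regression is not stated here, so `missing_value` is a hypothetical parameter, as are the function names.

```python
import numpy as np

def crowd_layer_bias(mu, bias):
    """Add a per-annotator bias (n_annot,) to bottleneck predictions
    (n_samples,), giving per-annotator predictions (n_samples, n_annot)."""
    return mu.reshape(-1, 1) + bias.reshape(1, -1)

def masked_multi_mse(y_true, y_pred, missing_value):
    """Squared error per (sample, annotator) pair, zero where the answer
    equals the (assumed) missing-answer sentinel."""
    observed = y_true != missing_value
    return np.where(observed, (y_true - y_pred) ** 2, 0.0)

MISSING = -999.0                          # hypothetical sentinel
mu = np.array([1.0, 2.0])                 # bottleneck predictions
bias = np.array([0.0, 0.5])               # annotator 1 rates 0.5 higher
y_pred = crowd_layer_bias(mu, bias)
y_true = np.array([[1.0, MISSING],
                   [3.0, 2.5]])
loss = masked_multi_mse(y_true, y_pred, MISSING)
```

As in classification, discarding the per-annotator biases after training leaves `mu` as the bias-corrected prediction.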

Sequence labelling

Using the crowd layer in your own Keras deep neural networks for sequence labelling problems is very similar to the classification case, but since the outputs are now sequences, you need to use the following loss function instead:

# instantiate specialized masked loss to handle missing answers
loss = MaskedMultiSequenceCrossEntropy(N_CLASSES).loss
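A possible reading of why this loss takes N_CLASSES is that the targets are integer label sequences that get one-hot encoded inside the loss. The NumPy sketch below works under that assumption, with targets of shape (n_samples, seq_len, N_ANNOT) and "-1" marking a missing answer. It is an illustration, not the library's code: the function name is hypothetical, and any padding handling the library may perform is not modelled.

```python
import numpy as np

def masked_multi_sequence_cross_entropy(y_true, logits, n_classes):
    """y_true: int labels (n_samples, seq_len, n_annot), -1 = missing.
    logits: (n_samples, seq_len, n_classes, n_annot).
    Returns per-token, per-annotator loss (n_samples, seq_len, n_annot)."""
    # numerically stable log-softmax over the class axis
    shifted = logits - logits.max(axis=2, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=2, keepdims=True))
    missing = y_true == -1
    # replace -1 by a valid index so one-hot lookup works; masked out below
    safe_labels = np.where(missing, 0, y_true)
    one_hot = np.eye(n_classes)[safe_labels]        # (..., n_annot, n_classes)
    one_hot = np.transpose(one_hot, (0, 1, 3, 2))   # (..., n_classes, n_annot)
    per_token = -(one_hot * log_probs).sum(axis=2)
    return np.where(missing, 0.0, per_token)

# one sequence of two tokens, three classes, two annotators
y_true = np.array([[[0, -1], [2, 1]]])
logits = np.zeros((1, 2, 3, 2))
loss = masked_multi_sequence_cross_entropy(y_true, logits, n_classes=3)
```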

Demos

For demonstration purposes, we provide four practical applications of the crowd layer. See the corresponding Jupyter notebooks available in the repository.

Datasets

The datasets used in all the experiments from the demos are available here.