Matching a regular expression using a one-dimensional CNN
Procedure
- Generating strings of length 15 over the alphabet a, b, c, d
- Labeling each string according to whether it matches a 5-element regular expression
- Balancing the dataset of 10,000 strings so that approximately half of them contain a substring matching the regex
- Preparing data for training using one-hot encoding
- Dividing the dataset into training and test sets
- Implementing and training a model consisting of one convolutional layer with a single filter followed by one fully connected layer to classify the strings, then examining the learned filter values
- Implementing and training more complex models (more filters, more layers) and analyzing their performance on the prepared dataset
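
The data-preparation steps above (generation, labeling, balancing, one-hot encoding, splitting) could be sketched roughly as follows, using only the Python standard library. This is an illustrative sketch, not the project's actual code: the pattern `ab[cd]a[bc]` is a hypothetical stand-in for the unspecified 5-element regex, and the helper names, the 80/20 split, and the fixed seed are all assumptions.

```python
import random
import re

ALPHABET = "abcd"
LENGTH = 15
# Hypothetical 5-element pattern; the procedure does not state the real one.
PATTERN = re.compile(r"ab[cd]a[bc]")


def random_string(rng):
    """Draw one string of LENGTH symbols uniformly from ALPHABET."""
    return "".join(rng.choice(ALPHABET) for _ in range(LENGTH))


def label(s):
    """Return 1 if any substring of s matches PATTERN, else 0."""
    return 1 if PATTERN.search(s) else 0


def balanced_dataset(size, rng):
    """Rejection-sample strings until both classes hold half of `size`."""
    targets = {1: size // 2, 0: size - size // 2}
    buckets = {1: [], 0: []}
    while any(len(buckets[y]) < targets[y] for y in (0, 1)):
        s = random_string(rng)
        y = label(s)
        if len(buckets[y]) < targets[y]:
            buckets[y].append(s)
    data = [(s, y) for y in (0, 1) for s in buckets[y]]
    rng.shuffle(data)
    return data


def one_hot(s):
    """Encode s as a LENGTH x 4 matrix with a single 1.0 per row."""
    return [[1.0 if ch == a else 0.0 for a in ALPHABET] for ch in s]


def split(data, test_fraction):
    """Split already-shuffled data into (train, test) lists."""
    n_test = round(len(data) * test_fraction)
    return data[n_test:], data[:n_test]


rng = random.Random(0)  # fixed seed, assumed here for reproducibility
dataset = balanced_dataset(10000, rng)
train, test = split(dataset, 0.2)  # 80/20 split is an assumption

X_train = [one_hot(s) for s, _ in train]
y_train = [y for _, y in train]
```

Each encoded string has shape 15 x 4 (positions by symbols), which is the layout a one-dimensional convolution with 4 input channels and kernel width 5 would slide over. Most deep learning frameworks expect the channel axis in a specific position, so a transpose may be needed before training.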