Error when using column names containing '+', e.g. 'sepal+length'
Describe the bug
run_cross_validation raises an error when a column name contains '+', e.g. 'sepal+length'.
To Reproduce
The following code reproduces the error:
import pandas as pd
import numpy as np
from seaborn import load_dataset
from julearn import run_cross_validation
from julearn.utils import configure_logging
df_iris = load_dataset('iris')
df_iris = df_iris.rename(columns={'sepal_length': 'sepal+length'})
# replaced underscore with '+'
X = ['sepal+length', 'sepal_width', 'petal_length']
y = 'species'
scores, model_iris = run_cross_validation(
    X=X, y=y, data=df_iris, model='svm', preprocess_X='zscore',
    problem_type='multiclass_classification', scoring=['accuracy'],
    return_estimator='final')
Expected behavior
The code should run without error.
System (please complete the following information):
- OS: macOS and Linux
- julearn version: 0.2.5.dev
Using 'sepal+length' as a column name will be considered as a regular expression, where + is a quantifier rather than a literal character. You need to escape the + in order to match it literally.
Your X should be like this:
# raw string, so the backslash reaches the regex engine unchanged
X = [r'sepal\+length', 'sepal_width', 'petal_length']
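If you would rather not escape names by hand, the standard library's re.escape can build the patterns for you. This is only a sketch of the regex behaviour using Python's re module; it does not call julearn, and it assumes the pattern is matched against the whole column name, which this thread does not confirm. The variable names are illustrative.

```python
import re

columns = ['sepal+length', 'sepal_width', 'petal_length']

# Unescaped, '+' means "one or more of the preceding 'l'", so the pattern
# matches 'sepallength' but not the literal column name 'sepal+length'.
unescaped_matches_literal = re.fullmatch('sepal+length', 'sepal+length') is not None
unescaped_matches_other = re.fullmatch('sepal+length', 'sepallength') is not None

# re.escape backslash-escapes regex metacharacters, so each resulting
# pattern matches its own column name literally.
X = [re.escape(name) for name in columns]
escaped_match_all = all(re.fullmatch(p, n) for p, n in zip(X, columns))
```

The escaped list X can then be passed in place of the hand-written one, while the DataFrame keeps its original column names.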