[BUG] using PCA twice in pipeline fails
Issue closed · 1 comment
kaurao commented
Describe the bug
Adding two pca steps to a pipeline fails.
To Reproduce
from julearn.pipeline import PipelineCreator
creator = PipelineCreator()
creator.add('pca', apply_to='pca1', n_components=1)
creator.add('pca', apply_to='pca2', n_components=1)
creator.add('ridge', apply_to=['continuous', 'categorical'], problem_type='regression')
Expected behavior
The features from PCA should be used as new features.
System (please complete the following information):
- OS: Windows WSL
- Linux bnbnbkpatil 5.15.68.1-microsoft-standard-WSL2 #1 SMP Mon Sep 19 19:14:52 UTC 2022 x86_64 GNU/Linux
samihamdan commented
I tried to replicate this on the newest version of #183, using X_types of pca1 and pca2, and it works.
So I assume we resolved this in the meantime.
This is the code I used:
from julearn.pipeline import PipelineCreator
from julearn import run_cross_validation
from seaborn import load_dataset
df = load_dataset("iris")
X = list(df.iloc[:, :-1].columns)
y = "species"
creator = PipelineCreator(problem_type="classification")
creator.add("pca", apply_to="pca1")
creator.add("pca", apply_to="pca2")
creator.add("svm")
run_cross_validation(
    X=X, y=y, data=df, model=creator,
    X_types={"pca1": X[:2], "pca2": X[2:]}
)
If your example still does not work, please provide a toy dataset so that we can run your code.
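For reference, here is a minimal sketch of the expected behavior from the report (each PCA reduces its own group of columns, and the components are passed on as new features), written in plain scikit-learn rather than julearn. The toy data, the column names (a1, a2, b1, b2, c1) and the use of ColumnTransformer are assumptions made for illustration only. This is not how julearn implements apply_to internally.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline

# Hypothetical toy data: column names and groupings are made up for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(size=(50, 5)),
    columns=["a1", "a2", "b1", "b2", "c1"],
)
df["target"] = df["a1"] + df["b1"] + rng.normal(scale=0.1, size=50)

# Same idea as X_types: map a type name to the columns it covers.
X_types = {"pca1": ["a1", "a2"], "pca2": ["b1", "b2"], "continuous": ["c1"]}

# One PCA per column group; the remaining column passes through unchanged.
reducer = ColumnTransformer(
    [
        ("pca1", PCA(n_components=1), X_types["pca1"]),
        ("pca2", PCA(n_components=1), X_types["pca2"]),
    ],
    remainder="passthrough",
)
model = Pipeline([("reduce", reducer), ("ridge", Ridge())])

X = df.drop(columns="target")
model.fit(X, df["target"])

# Count the features that reach the ridge step:
# one component per PCA plus the untouched column.
n_features_out = model.named_steps["reduce"].transform(X).shape[1]
print(n_features_out)
```

A small frame like df, together with the X_types mapping, is also the kind of toy dataset that makes a failing julearn example easy to reproduce.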