[BUG] using PCA twice in pipeline fails

Question

Closed this issue 2 years ago · 1 comment

Describe the bug
Adding two PCA steps to a pipeline fails.

To Reproduce

from julearn.pipeline import PipelineCreator

creator = PipelineCreator()
# Two separate PCA steps, each applied to its own feature type
creator.add('pca', apply_to='pca1', n_components=1)
creator.add('pca', apply_to='pca2', n_components=1)
creator.add('ridge', apply_to=['continuous', 'categorical'], problem_type='regression')

Expected behavior
The features from PCA should be used as new features.
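
To make the expected behavior concrete, here is a minimal sketch in plain scikit-learn rather than julearn. It is not how julearn implements this; it only illustrates the intended result: each PCA reduces its own group of columns, and the resulting components are passed on to the model as new features. The synthetic data and the column groups (`pca1_cols`, `pca2_cols`, `other_cols`) are assumptions made up for illustration.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline

# Synthetic data: 50 samples, 6 hypothetical features
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
y = rng.normal(size=50)

# Hypothetical column groups standing in for the 'pca1' and 'pca2' types;
# the last two columns stand in for the remaining features.
pca1_cols, pca2_cols, other_cols = [0, 1], [2, 3], [4, 5]

# One PCA per column group; untouched columns are passed through unchanged
features = ColumnTransformer([
    ("pca1", PCA(n_components=1), pca1_cols),
    ("pca2", PCA(n_components=1), pca2_cols),
    ("rest", "passthrough", other_cols),
])

model = Pipeline([("features", features), ("ridge", Ridge())])
model.fit(X, y)

# The model sees one component per PCA plus the two passthrough columns
X_new = model.named_steps["features"].transform(X)
print(X_new.shape)
```

In this sketch the ridge model is fitted on four columns: one component from each PCA and the two columns that were passed through.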

System:

OS: Windows WSL
Linux bnbnbkpatil 5.15.68.1-microsoft-standard-WSL2 #1 SMP Mon Sep 19 19:14:52 UTC 2022 x86_64 GNU/Linux

Answer 1 · 2023-04-06T08:07:56.000Z

I tried to replicate this on the newest version of #183, using the X_types pca1 and pca2, and it works.
So I assume we resolved this in the meantime.

This is the code I used:

from julearn.pipeline import PipelineCreator
from julearn import run_cross_validation
from seaborn import load_dataset

df = load_dataset("iris")
X = list(df.iloc[:, :-1].columns)
y = "species"

creator = PipelineCreator(problem_type="classification")
creator.add("pca", apply_to="pca1")
creator.add("pca", apply_to="pca2")
creator.add("svm")

run_cross_validation(
    X=X, y=y, data=df, model=creator,
    X_types={"pca1": X[:2], "pca2": X[2:]}
)

If your example still does not work, please provide a toy dataset that we can run your code on.
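
A toy dataset for such a report could look roughly like the sketch below. All column names (`a1`, `a2`, `b1`, `b2`, `target`) and the way the target is generated are hypothetical; the point is only to have a small, reproducible DataFrame plus the `X`, `y`, and `X_types` values that would be passed to `run_cross_validation`.

```python
import numpy as np
import pandas as pd

# Small reproducible dataset with hypothetical column names
rng = np.random.default_rng(42)
n = 30
df = pd.DataFrame({
    "a1": rng.normal(size=n),
    "a2": rng.normal(size=n),
    "b1": rng.normal(size=n),
    "b2": rng.normal(size=n),
})
# Continuous target loosely related to two of the features (regression case)
df["target"] = df["a1"] - df["b1"] + rng.normal(scale=0.1, size=n)

X = ["a1", "a2", "b1", "b2"]
y = "target"
# Each feature type groups the columns that one PCA step should see
X_types = {"pca1": ["a1", "a2"], "pca2": ["b1", "b2"]}
```

Posting a snippet like this together with the failing pipeline lets others run the exact same code.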