biobakery/Maaslin2

"Please provide the reference for the variable" error when running Maaslin2

polinanvkv opened this issue · 1 comment

Hello!

I am trying to run Maaslin2 with the following code:

input_data = read.table(file = "4Masslin2_input.data_kos.taxonomy.archaea.mt.2group.tsv",
                        header = TRUE, sep = "\t")
rownames(input_data) <- input_data$Geneid_ord
input_data$Geneid_ord = NULL

metadata = read.table(file = "4Masslin2_metadata_kos.taxonomy.archaea.mt.2group.tsv",
                      header = TRUE, sep = "\t")
rownames(metadata) <- metadata$Geneid_ord
metadata$Geneid_ord = NULL

# Create the 'Ctrl' column
metadata$Ctrl <- ifelse(metadata$Diagnosis == "Ctrl", "Yes", "No")

# Create the 'PD' column
metadata$PD <- ifelse(metadata$Diagnosis == "PD", "Yes", "No")

# Create the 'iRBD' column
metadata$iRBD <- ifelse(metadata$Diagnosis == "iRBD", "Yes", "No")

reference <- unique(metadata$S)
reference <- c("Methanobrevibacter_A smithii","Methanobrevibacter_A smithii_A","Methanosphaera stadtmanae","Methanomethylophilus alvus","DTU008 sp001421185","Methanomassiliicoccus luminyensis","MX-02 sp006954405","Coprobacillus cateniformis","Methanobrevibacter_C arboriphilus_A","Methanosphaera cuniculi")

Maaslin2(input_data = input_data,
         input_metadata = metadata,
         fixed_effects = c("Ctrl", "PD", "iRBD", "S"),
         reference = reference,
         min_prevalence = 0,
         output = "test",
         transform = "LOG",
         plot_heatmap = TRUE,
         plot_scatter = TRUE,
         heatmap_first_n = 50,
         max_significance = 1)

Examples of my metadata and input data are below:

metadata:

         Diagnosis       D                 P               C                       O                       F                    G
K00053_1      Ctrl Archaea Methanobacteriota Methanobacteria      Methanobacteriales     Methanobacteriaceae Methanobrevibacter_A
K00053_2      Ctrl Archaea Methanobacteriota Methanobacteria      Methanobacteriales     Methanobacteriaceae Methanobrevibacter_A
K00053_3      Ctrl Archaea Methanobacteriota Methanobacteria      Methanobacteriales     Methanobacteriaceae       Methanosphaera
K00053_4      Ctrl Archaea  Thermoplasmatota  Thermoplasmata Methanomassiliicoccales Methanomethylophilaceae Methanomethylophilus
K00053_5        PD Archaea Methanobacteriota Methanobacteria      Methanobacteriales     Methanobacteriaceae Methanobrevibacter_A
K00053_6        PD Archaea Methanobacteriota Methanobacteria      Methanobacteriales     Methanobacteriaceae Methanobrevibacter_A
                                      S Ctrl  PD iRBD
K00053_1   Methanobrevibacter_A smithii  Yes  No   No
K00053_2 Methanobrevibacter_A smithii_A  Yes  No   No
K00053_3      Methanosphaera stadtmanae  Yes  No   No
K00053_4     Methanomethylophilus alvus  Yes  No   No
K00053_5   Methanobrevibacter_A smithii   No Yes   No
K00053_6 Methanobrevibacter_A smithii_A   No Yes   No

input_data:

                tpm
K00053_1 166.502489
K00053_2 188.409788
K00053_3  69.970092
K00053_4   2.219452
K00053_5 642.522944
K00053_6 136.308126

As a result, I receive this error:

2023-05-11 17:25:04 INFO::Writing function arguments to log file
2023-05-11 17:25:04 INFO::Verifying options selected are valid
2023-05-11 17:25:04 INFO::Determining format of input files
2023-05-11 17:25:04 INFO::Input format is data samples as rows and metadata samples as rows
2023-05-11 17:25:04 INFO::Formula for fixed effects: expr ~  Ctrl + PD + iRBD + S
Error in Maaslin2(input_data = input_data, input_metadata = metadata,  : 
  Please provide the reference for the variable 'S' which includes more than 2 levels: Methanobrevibacter_A smithii, Methanobrevibacter_A smithii_A, Methanosphaera stadtmanae, Methanomethylophilus alvus, Methanomassiliicoccus_A intestinalis, UBA71 sp905187815, DTU008 sp001421185, Methanomassiliicoccus luminyensis, MX-02 sp006954405, Coprobacillus cateniformis, Methanobrevibacter_C arboriphilus_A, Methanosphaera cuniculi, Methanobrevibacter ruminantium_A.

Could you please suggest a solution to this error and explain its likely source?

Thank you for creating this issue.
We currently field issues through our bioBakery Discourse Support Forum.
If you post the issue on the Discourse forum, we would be happy to sync up with you to get it resolved.
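
For anyone who lands here with the same error, a likely cause is the shape of the `reference` argument. As far as the Maaslin2 documentation describes it, `reference` is a single string of `variable,reference` pairs, separated by semicolons when several variables are involved. It is not a vector of level names. In the call above, `reference` holds species names only and never names the variable `S`, which would explain why Maaslin2 still asks for a reference for `S`. The sketch below is a minimal illustration under that assumption. `make_reference` and `run_sketch` are hypothetical helpers written for this example and are not part of Maaslin2. The toy metadata copies the six example rows from the question, and choosing `Methanobrevibacter_A smithii` as the baseline is arbitrary.

```r
# Toy metadata copied from the six example rows shown in the question.
metadata <- data.frame(
  Diagnosis = c("Ctrl", "Ctrl", "Ctrl", "Ctrl", "PD", "PD"),
  S = c("Methanobrevibacter_A smithii", "Methanobrevibacter_A smithii_A",
        "Methanosphaera stadtmanae", "Methanomethylophilus alvus",
        "Methanobrevibacter_A smithii", "Methanobrevibacter_A smithii_A"),
  row.names = paste0("K00053_", 1:6),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a Maaslin2 function): turn a named character vector
# such as c(S = "some level") into the "variable,level;variable,level" string.
make_reference <- function(metadata, baselines) {
  for (v in names(baselines)) {
    # Fail early if the variable or the chosen baseline level is missing.
    stopifnot(v %in% colnames(metadata), baselines[[v]] %in% metadata[[v]])
  }
  paste(paste(names(baselines), baselines, sep = ","), collapse = ";")
}

# Arbitrary choice of baseline level for S, for illustration only.
reference <- make_reference(metadata, c(S = "Methanobrevibacter_A smithii"))
print(reference)

# Hypothetical wrapper, defined but not called here: it needs the Maaslin2
# package and the full input tables. All arguments are the ones used in the
# question; only the value passed as `reference` differs.
run_sketch <- function(input_data, metadata, reference) {
  Maaslin2::Maaslin2(input_data = input_data,
                     input_metadata = metadata,
                     fixed_effects = c("Ctrl", "PD", "iRBD", "S"),
                     reference = reference,
                     min_prevalence = 0,
                     output = "test",
                     transform = "LOG",
                     max_significance = 1)
}
```

With the full tables loaded as in the question, `run_sketch(input_data, metadata, reference)` would pass the string form to Maaslin2. Whether the model then fits sensibly with these fixed effects is a separate question, and the Discourse forum is the place to follow up on it.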