First rows of dataset omitted, rows shifted, and left-hand annotations assigned to the wrong rows, while right-hand annotations are correct

Question

Closed this issue a year ago · 4 comments

import forestplot as fp
import pandas as pd

df = pd.read_csv("review_example.csv", sep=";")  # companion example data

fp.forestplot(df,  # the dataframe with results data
              estimate="PCSA_Men_mean",  # column containing estimated effect size
              ll="PCSA_Men_Lower", hl="PCSA_Men_Upper",  # columns containing conf. int. lower and upper limits
              varlabel="Abbreviation",  # column containing variable label
              capitalize="capitalize",  # capitalize labels
              annote=["Source", "Image modality", "Sample_size", "Method", "Position"],  # columns to report on left of plot
              annoteheaders=["Ref", "Modality", "N", "PCSA", "Pose"],  # ^corresponding headers
              rightannote=["Age", "Height", "Weight", "Fiber_length", "Pennation", "Info"],  # columns to report on right of plot
              right_annoteheaders=["Age[y]", "Height[cm]", "Weight[kg]", "Fiber_length[cm]", "Pennation[Deg]", "Note"],  # ^corresponding headers
              groupvar="Agegroup",  # column containing group labels
              group_order=["Reference", "Young Adults", "Adults"],  # order in which groups are reported
              xlabel="PCSA Ratio",  # x-label title
              xticks=[0, 30, 60],  # x-ticks to be printed
              table=True,  # format as a table
              color_alt_rows=True,  # gray alternate rows
              # Additional kwargs for customizations
              **{"marker": "D",  # set marker symbol as diamond
                 "markersize": 35,  # adjust marker size
                 "xtick_size": 12,  # adjust x-tick font size
                })
# plt.savefig("plot.jpg", bbox_inches="tight")  # would need: import matplotlib.pyplot as plt

Answer 1 · 2023-09-13T13:18:00.000Z

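The example code with the sleep dataset worked perfectly. However, when I used my own dataset, several problems appeared: the first rows are missing, rows are shifted, and the left-hand annotations are attached to the wrong rows, while the right-hand annotations are correct. Does anyone have a solution for this?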

Answer 2 · 2023-12-19T03:48:37.000Z

Hi @rmaarle, thanks for raising this. I wasn't aware that duplicated variable labels (the values in the varlabel column) would cause problems, but they are likely the source of what you are seeing. If you use a column with unique labels instead, things should work as expected.
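To confirm that duplicated labels are involved before plotting, a quick pandas check along these lines may help. This is a minimal sketch on a small made-up dataframe: only the column names `Abbreviation` and `PCSA_Men_mean` come from the question, while the values are hypothetical stand-ins for `review_example.csv`.

```python
import pandas as pd

# Hypothetical stand-in for review_example.csv; only the column names
# come from the question, the values are made up for illustration.
df = pd.DataFrame({
    "Abbreviation": ["SOL", "SOL", "GM", "GL", "GM"],
    "PCSA_Men_mean": [30.0, 32.5, 12.1, 6.3, 11.8],
})

varlabel = "Abbreviation"

# keep=False marks every occurrence of a repeated label, not only the later ones
dup_mask = df[varlabel].duplicated(keep=False)
duplicated_labels = sorted(df.loc[dup_mask, varlabel].unique())

print(f"{varlabel} is unique: {df[varlabel].is_unique}")
print(f"duplicated labels: {duplicated_labels}")
```

If the column turns out not to be unique, switching `varlabel` to a unique column is the suggested workaround.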

Minimal example:

import forestplot as fp
import pandas as pd

df = pd.read_csv("review_example.csv", sep=";")  # companion example data
# Move the row index into a string column named "index"; it is unique per row
df = df.reset_index().astype({"index": str})

fp.forestplot(df,  # the dataframe with results data
              estimate="PCSA_Men_mean",  # column containing estimated effect size
              ll="PCSA_Men_Lower", hl="PCSA_Men_Upper",  # columns containing conf. int. lower and upper limits
              varlabel="index",  # unique label for each row
)

Your case (the main change is varlabel="index"):

import forestplot as fp
import pandas as pd

df = pd.read_csv("review_example.csv", sep=";")  # companion example data
# Move the row index into a string column named "index"; it is unique per row
df = df.reset_index().astype({"index": str})

fp.forestplot(df,  # the dataframe with results data
              estimate="PCSA_Men_mean",  # column containing estimated effect size
              ll="PCSA_Men_Lower", hl="PCSA_Men_Upper",  # columns containing conf. int. lower and upper limits
              varlabel="index",  # column containing variable label (now unique)
              capitalize="capitalize",  # capitalize labels
              annote=["Source", "Image modality", "Sample_size", "Method", "Position"],  # columns to report on left of plot
              annoteheaders=["Ref", "Modality", "N", "PCSA", "Pose"],  # ^corresponding headers
              rightannote=["Age", "Height", "Weight", "Fiber_length", "Pennation"],  # columns to report on right of plot
              right_annoteheaders=["Age[y]", "Height[cm]", "Weight[kg]", "Fiber_length[cm]", "Pennation[Deg]"],  # ^corresponding headers
              groupvar="Agegroup",  # column containing group labels
              group_order=["Reference", "Young Adults", "Adults"],  # order in which groups are reported
              xlabel="PCSA Ratio",  # x-label title
              xticks=[0, 30, 60],  # x-ticks to be printed
              table=True,  # format as a table
              color_alt_rows=True,  # gray alternate rows
              # Additional kwargs for customizations
              **{"marker": "D",  # set marker symbol as diamond
                 "markersize": 35,  # adjust marker size
                 "xtick_size": 12,  # adjust x-tick font size
                }
)
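
If a bare row number is not informative enough as a label, one possible alternative is to derive a unique label from the original column. The sketch below appends a running number to each repeated abbreviation. The dataframe values and the column name `label_unique` are hypothetical, and whether the result plots as intended with your data is an assumption to verify; the new column would be passed as `varlabel` in place of `"index"`.

```python
import pandas as pd

# Hypothetical stand-in data; only the column name "Abbreviation"
# comes from the question.
df = pd.DataFrame({
    "Abbreviation": ["SOL", "SOL", "GM", "GL", "GM"],
})

# Number each occurrence of a label within its own group: 1, 2, ...
occurrence = df.groupby("Abbreviation").cumcount() + 1
is_repeated = df["Abbreviation"].duplicated(keep=False)

# Keep labels that occur once unchanged; append the occurrence number
# only where the label is repeated.
df["label_unique"] = df["Abbreviation"].where(
    ~is_repeated,
    df["Abbreviation"] + " (" + occurrence.astype(str) + ")",
)

print(df["label_unique"].tolist())
```

This keeps the abbreviation visible in the plot while still giving every row a distinct label.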

Answer 3 · 2023-12-19T03:50:21.000Z

You may find the upcoming release (WIP) with grouped labels useful for your use case. The duplicated variable labels you were using were really groups. See #59 for an example.

Answer 4 · 2023-12-19T03:51:11.000Z

In the next release, the readme will also warn about duplicated labels.