Horizontal percentages for factors and p-values
Closed this issue · 5 comments
Hello Benjamin,
I’d like to ask you, how can I get horizontal percentages for factors with your package table1 and also a column of p-values. In some observational studies these can be quite usefull. I was using Hmis but it doesn't let me to have mean and medians in two lines for numeric variables as table1 does.
Thank you very much,
Peter
Hi Peter,
The package wasn't designed with this use case in mind, but it is possible to use a trick (you might call it a "hack") to get the percentages along rows rather than columns.
First, here is some sample data:
library(table1)
set.seed(123)
n <- 100
dat <- data.frame(x=sample(LETTERS[1:3], n, replace=T), y=sample(LETTERS[4:5], n, replace=T))
dat$x <- factor(dat$x)
dat$y <- factor(dat$y)
dat$z <- runif(n, 10, 20)
table(dat$x, dat$y)
##
## D E
## A 16 17
## B 11 21
## C 17 18
Now, we create a custom render function, as follows:
rndr <- function(x, name, ...) {
if (is.numeric(x)) {
render.default(x, name, ...)
} else {
if (length(x) == nrow(dat)) {
ret <- c("", sprintf("%s (100%%)", table(x)))
} else {
cnt <- table(dat[[name]], dat$y)
pct <- 100*prop.table(cnt, 1)
i <- min(which(apply(sweep(cnt, 1, table(x)) == 0, 2, all)))
ret <- c("", sprintf("%s (%s%%)", cnt[,i], round_pad(pct[,i], 1)))
}
names(ret) <- c("", names(table(x)))
ret
}
}
The result:
table1(~ x + z | y, data=dat, render=rndr)
D(N=44) | E(N=56) | Overall(N=100) | |
---|---|---|---|
x | |||
A | 16 (48.5%) | 17 (51.5%) | 33 (100%) |
B | 11 (34.4%) | 21 (65.6%) | 32 (100%) |
C | 17 (48.6%) | 18 (51.4%) | 35 (100%) |
z | |||
Mean (SD) | 14.9 (3.05) | 14.7 (2.90) | 14.8 (2.95) |
Median [Min, Max] | 14.9 [10.1, 19.8] | 14.7 [10.1, 19.7] | 14.7 [10.1, 19.8] |
For the p-value column I don't have a general solution at the moment, unfortunately. This is discussed in the vignette.
EDIT: fixed code to handle edge case.
The problem with that is that the overall should show the vertical percentage
That's actually not more difficult. Here is the code:
rndr <- function(x, name, ...) {
if (is.numeric(x) || length(x) == nrow(dat)) {
render.default(x, name, ...)
} else {
cnt <- table(dat[[name]], dat$y)
pct <- 100*prop.table(cnt, 1)
i <- min(which(apply(sweep(cnt, 1, table(x)) == 0, 2, all)))
ret <- c("", sprintf("%s (%s%%)", cnt[,i], round_pad(pct[,i], 1)))
names(ret) <- c("", names(table(x)))
ret
}
}
table1(~ x + z | y, data=dat, render=rndr)
D(N=44) | E(N=56) | Overall(N=100) | |
---|---|---|---|
x | |||
A | 16 (48.5%) | 17 (51.5%) | 33 (33.0%) |
B | 11 (34.4%) | 21 (65.6%) | 32 (32.0%) |
C | 17 (48.6%) | 18 (51.4%) | 35 (35.0%) |
z | |||
Mean (SD) | 14.9 (3.05) | 14.7 (2.90) | 14.8 (2.95) |
Median [Min, Max] | 14.9 [10.1, 19.8] | 14.7 [10.1, 19.7] | 14.7 [10.1, 19.8] |
EDIT: fixed code to handle edge case.
Thank you very much Benjamin! you were so kind.
Hi Benjamin,
Thank you for the codes above. Do you mind editing the code one more time to show row wise percentage of of missing. Currently showing valid percentages only.
Thanks in advance