Base R plot support
kylebutts opened this issue · 10 comments
@grantmcdermott I'm wondering if I could pick your brain about base R plot support (or more likely, using plot2
). I've added support for fixest_multi
usage (multiple y variables and/or split
/fsplit
samples). Now you can replace feols
with fwlplot
and it will just work :-)
I have the following dataset ready to plot from previous calls (see bottom for example of what the dataset would look like). If there is no split
/fsplit
, then sample is a character column of empty strings ""
.
For plotting, if there is both I do facet_grid(rows = var, cols = sample)
if the estimation is split
and multiple y variables. If there is only 1 of split or multiple y, then I use facet_wrap
. Otherwise, what we have.
I saw your most recent comment here: grantmcdermott/tinyplot#76, and was wondering if this is something that might be easily possible in plot2
?
sample idx var y_resid x_resid fit lwr upr
<char> <int> <char> <num> <num> <num> <num> <num>
7: cyl = 4 9 disp 2.291529e+01 4.627560e-01 2.286022e+01 1.229056e+01 3.342988e+01
8: cyl = 4 8 disp 5.531900e+01 1.039647e+00 5.135873e+01 3.308987e+01 6.962759e+01
9: cyl = 6 1 disp 8.526513e-12 -1.275000e-01 8.540724e-12 8.465837e-12 8.615610e-12
10: cyl = 6 10 disp 8.554935e-12 2.042810e-14 8.540724e-12 8.497488e-12 8.583959e-12
11: cyl = 6 11 disp 8.554935e-12 2.042810e-14 8.540724e-12 8.497488e-12 8.583959e-12
12: cyl = 6 2 disp 8.526513e-12 1.275000e-01 8.540724e-12 8.465837e-12 8.615610e-12
13: Full sample 1 disp 1.202142e+01 -3.645965e-01 -1.538589e+01 -2.786976e+01 -2.902013e+00
23: Full sample 9 disp 1.108624e+01 4.346205e-01 1.834088e+01 4.748985e+00 3.193278e+01
24: Full sample 8 disp 5.716882e+01 1.066898e+00 4.502284e+01 1.904455e+01 7.100112e+01
25: cyl = 4 19 mpg -8.817797e-01 -3.726589e-01 1.349912e+00 -1.675924e+00 4.375747e+00
31: cyl = 4 9 mpg -6.758828e-01 4.627560e-01 -1.676277e+00 -5.002769e+00 1.650214e+00
32: cyl = 4 8 mpg -5.066455e+00 1.039647e+00 -3.765994e+00 -9.515585e+00 1.983596e+00
33: cyl = 6 10 mpg 7.000000e-01 2.042810e-14 1.477903e-12 -8.894343e+00 8.894343e+00
34: cyl = 6 11 mpg -7.000000e-01 2.042810e-14 1.477903e-12 -8.894343e+00 8.894343e+00
35: cyl = 6 2 mpg 8.668621e-13 1.275000e-01 8.668021e-13 -1.257850e+01 1.257850e+01
36: Full sample 3 mpg -1.091027e+00 -3.594840e-01 1.343803e+00 -8.147531e-01 3.502358e+00
37: Full sample 20 mpg 4.979079e+00 -3.419456e-01 1.283776e+00 -8.296313e-01 3.397184e+00
sample idx var y_resid x_resid fit lwr upr
Hi Kyle.
The tl;dr read is that, yes, facets are on the roadmap for plot2
. I haven't written down the formal implementation, but I believe it should work following the basic format that I outlined in the comment you linked too. TBH I hadn't considered facet_grid support yet and was mostly thinking i.t.o. facet_wrap... But I guess both should be possible.
One UI thing I'd like feedback on is potential formula method support and implementation. My current thinking is that we could denote facets after a second vertical bar, e.g. y ~ x | colour_var | facet_var
. You could expand to grids via a two-part formula, y ~ x | colour_var | grid_var1 ~ grid_var2
.
I can't make any promises RE timeline. But my goal is to finish this type of feature support in time for a CRAN submission before 2023 is out. I may go ahead and merge the current PR since that's the current bottleneck for further development rn.
OTOH if you're in a rush, you may only need plot2 if you are worried about a nice legend (potentially irrelevant here?). Quick mockup below using your dataset. I'm still using plot2 here, but only for the ribbon part since I'm lazy I don't feel like going the manual polygon route.
dat = data.table::fread('~/Desktop/kyle.csv')
sdat = split(dat, list(dat$var, dat$sample))
par(mfcol = c(length(unique(dat$var)), length(unique(dat$sample))))
invisible(lapply(
seq_along(sdat),
\(i) {
# plot points
with(
sdat[[i]],
plot(
x = x_resid, y = y_resid,
pch = 16,
xlim = range(dat$x_resid), ylim = range(c(dat$upr, dat$lwr)),
frame.plot = FALSE, axes = FALSE, ann = FALSE
)
)
# add grid
grid()
# facet details for some label calcs
facets = par("mfcol")
n_facets = Reduce(`*`, par("mfcol"))
# x axis
if (i %% facets[1] == 0) {
axis(1, lty = 0)
title(xlab = "Residualized x")
}
# y axis
if (i %in% 1:facets[1]) {
axis(2, lty = 0, las = 1)
title(ylab = "Residualized y")
}
# 1st facet labels
if (i %% facets[1] == 1) {
mtext(sub(".*\\.", "", names(sdat))[i], side = 3)
}
# 2 facet labels
if (i %in% c(n_facets, n_facets-1)) {
mtext(sub("\\..*", "", names(sdat))[i], side = 4, crt = 180)
}
# add best fit ribbon (here, using plot2)
with(
sdat[[i]],
plot2::plot2(
x = x_resid, y = fit, ymin = lwr, ymax = upr,
type = "ribbon", col = "darkgreen",
add = TRUE
)
)
}
))
Created on 2023-10-12 with reprex v2.0.2
This is really great Grant! I didn't know about par(mfcol)
. Really should dive into base R plotting at some point...
Just a HU that plot2
v0.0.4 (just released on R-universe) includes full-featured facet support. One nice thing compared to your current bespoke solution is that it correctly spaces and scales the axes titles relative to the individual plot windows. So you don't end up with the left-most plots being "thinner" as a result of axes printing.
library(plot2)
# resids dataset obtained by running debugonce(fwlplot:::plot_resids_base_r)
# and writing to disk
resids = read.csv('~/Desktop/resids.csv')
resids |>
with(plot2(
x = x_resid, y = y_resid,
facet = sample ~ var,
facet.args = list(bg = "grey90"),
pch = 21, fill = adjustcolor(1, alpha = 0.1),
frame = FALSE, grid = TRUE
))
resids |>
with(plot2(
x = x_resid, y = fit, ymin = lwr, ymax = upr,
type = "ribbon", col = "dodgerblue",
facet = sample ~ var,
add = TRUE
))
Created on 2024-01-26 with reprex v2.0.2
(Obvs looks a bit silly here b/c of reprex
's squashed plot aspect ratio. But you can see that things at least look proportional.)
We could probably port that logic here, or I can flag once plot2 hits CRAN.
Thanks for the heads up. This is great. Since there's going to be basically 0 more features added on this, I'll port over to plot2's feature and then push to CRAN when you do.
FYI, with reprex you can use # %%
/#'
since it uses knitr::spin
under the hood. e.g.
library(fixest)
library(plot2)
aq = airquality
aq$Month = factor(month.abb[aq$Month], levels = month.abb[5:9])
mod = lm(Temp ~ 0 + Month / Day, data = aq)
aq = cbind(aq, predict(mod, interval = "confidence"))
# %% fig.width = 8, fig.height = 6, dpi = 300
# Plot the original points
with(
aq,
plot2(
x = Day, y = Temp,
by = Month,
facet = "by", facet.args = list(bg = "grey90"),
palette = "dark2",
grid = TRUE, frame = FALSE, ylim = c(50, 100),
main = "Actual and predicted air temperatures"
)
)
yields
library(fixest)
library(plot2)
#> Warning: package 'plot2' was built under R version 4.3.2
aq = airquality
aq$Month = factor(month.abb[aq$Month], levels = month.abb[5:9])
mod = lm(Temp ~ 0 + Month / Day, data = aq)
aq = cbind(aq, predict(mod, interval = "confidence"))
# Plot the original points
with(
aq,
plot2(
x = Day, y = Temp,
by = Month,
facet = "by", facet.args = list(bg = "grey90"),
palette = "dark2",
grid = TRUE, frame = FALSE, ylim = c(50, 100),
main = "Actual and predicted air temperatures"
)
)
# Add the model predictions to the same plot
with(
aq,
plot2(
x = Day, y = fit,
ymin = lwr, ymax = upr,
by = Month, facet = "by",
type = "ribbon",
palette = "dark2",
add = TRUE
)
)
Created on 2024-01-26 with reprex v2.1.0
Made the switch! I'll try to keep an eye out when tinyplot
gets put on CRAN @grantmcdermott.