Document coding style used by R Core
hturner opened this issue · 6 comments
Is your feature request related to a problem? Please describe.
Contributors should be aware of the style typically used by R Core, so that R Core can focus on the code and not style issues when reviewing patches. Questions about style have come up on the Slack and I have also given feedback about this when reviewing patches before submission to Bugzilla.
Describe the solution you'd like
A new chapter/section on style. There is a little about R coding standards in the R-internals manual, e.g. using an indentation of 4 spaces in R and C code. Other style points I'm not sure are documented anywhere but I think are generally followed:
- spaces after commas and either side of operators
-
<-
rather than=
for assignment -
"
rather than'
for quoting text -
TRUE
andFALSE
notT
andF
- lines no longer than 80 characters
We could work through https://style.tidyverse.org/ to check what is common to base R. @mmaechler has also given talks about style and may know if there is another reference we should use.
Describe alternatives you've considered
N/A
Additional context
N/A
Perhaps we can recommend use of lintr cc @MichaelChirico.
In all my talks on the issue, I had emphasized that humans can much better format code than algorithms.
(and sometimes I add .. at most half jokingly): If you don't want to get replaced by a robot, don't behave like one ..
more than happy to collaborate developing linters specific to base R.
it really is an unfortunate experience to have a patch delayed due to stylistic issues -- having linters in place to at least cover some "don'ts" while leaving the "dos" more up to human judgment is certainly doable.
@hturner Can we make a task list for this issue to keep track of things being committed in the new chapter and the things that still need to be added?
@itsdebartha I updated my comment with an initial checklist. Meanwhile I re-discovered this vignette: https://cran.r-project.org/web/packages/rockchalk/vignettes/Rstyle.pdf, which gives guidance on style based on studying the source code of R and CRAN packages, along with an SEA score ("Subjective and completely unscientific personal Estimate
of Agreement." 😄). The SEA score gives an indication of how consistently that style is used in the R codebase. Probably we should document/add to the linter styles with an SEA > 0.9 or thereabouts.
Since this issue was motivated by a question on the R Contributor Slack about indentation style for C, I have done a little more digging. The R coding standards recommend to set the c-default-style
in Emacs to "bsd", which implements the Allman style (https://www.gnu.org/software/emacs/manual/html_node/ccmode/Built_002din-Styles.html). However, this is not used consistently in the R code base, e.g.
(See Indentation Styles). So all we can say is that the C code in R follows the GNU coding standards for C code in that the open-brace that starts the body of a C function should be in column one.
Indentation in R code is similarly inconsistent. If we print()
/View()
an R function it is formatted in Stroustrup style, but I don't think that is commonly used by R developers - more in base R than elsewhere, but not enough to recommend it. I think K&R is more commonly used, often with the variation of having the opening brace for a function on the same line as function()
, so I would recommend that if anything.
Discussed a bit about this issue during the R Development Hackathon at RSECon24. I would be happy to contribute a chapter for this as discussed with @hturner and @SaranjeetKaur.