PanDAWMS/dkb

README: common style.

Closed this issue · 5 comments

What is a README file: any file with the uppercase name README, wherever it is placed within the repository.

Issue: the README files were written by different people with no thought given to a common style. But a common style is good: it makes things easier to read.

Suggestion: adopt the style suggested in #194:

  • no tabs, only spaces;
  • no trailing spaces (fair for any other file as well, by the way);
  • text wrapped at 80 characters (except for lines containing looong links or anything else that cannot be wrapped naturally).
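
To make the three rules concrete, here is a minimal sketch of what such a check could look like. The names `check_readme_style` and `can_be_wrapped` are hypothetical, and the "cannot be wrapped naturally" exception is approximated by an assumption of mine: a long line with no spaces in it (e.g. a bare link) has no natural break point and is let through.

```python
MAX_WIDTH = 80  # wrap limit suggested in this issue


def can_be_wrapped(line):
    """Assumed heuristic: a line with an inner space has a natural break point."""
    return " " in line.strip()


def check_readme_style(text, max_width=MAX_WIDTH):
    """Return a list of (line_number, message) pairs, one per violation."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if "\t" in line:
            problems.append((number, "tab character"))
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        # Over-long lines are reported only when they could have been wrapped.
        if len(line) > max_width and can_be_wrapped(line):
            problems.append((number, "longer than %d characters" % max_width))
    return problems
```

A long link placed on a line of its own passes, while a long sentence that merely ends with a link is still reported, since it can be broken before the link.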

How to: just as with the PEP-8 code style, it would be easier to check things automatically than to try to control them manually. If we accept the suggested style, it won't be worth any discussion during code review, so it is better to keep it somewhere in the "background".

As I understand it, it is possible to apply the check only to the files (and even lines) changed in a given push or PR. If so, it looks fine: it will make Travis checks faster and won't force the first person who risks making a commit after the checks are turned on to reformat all the READMEs in the repository.
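
One possible way to limit a check to changed lines is to parse a unified diff (for example, the output of `git diff` for the push or PR) and collect the line numbers that were added or modified. The sketch below is only an illustration under that assumption; the function name `added_lines` is hypothetical, and how the diff is obtained in Travis is left open (see #268).

```python
import re

# Hunk header, e.g. "@@ -10,2 +11,3 @@"; a missing count means 1.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def added_lines(diff_text):
    """Map each changed file to the new-file line numbers added by the diff."""
    added = {}
    path = None
    old_left = new_left = 0  # lines still expected in the current hunk
    lineno = 0               # current line number in the new file
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            if line.startswith("+"):
                if path is not None:
                    added.setdefault(path, []).append(lineno)
                lineno += 1
                new_left -= 1
            elif line.startswith("-"):
                old_left -= 1
            elif line.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:
                # Context line: present in both old and new file.
                lineno += 1
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("+++ "):
            target = line[4:]
            if target == "/dev/null":
                path = None  # the file was deleted, nothing to check
            elif target.startswith("b/"):
                path = target[2:]
            else:
                path = target
            continue
        match = HUNK_RE.match(line)
        if match:
            old_left = 1 if match.group(1) is None else int(match.group(1))
            lineno = int(match.group(2))
            new_left = 1 if match.group(3) is None else int(match.group(3))
    return added
```

The result can then be used to filter the style problems down to the lines touched by the commit, so untouched parts of old READMEs do not fail the build.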

Any comments/suggestions are welcome.

(NOTE: for comments related to the automatic checks, please use #268.)


Additional conventional rules (not to be checked automatically):

  • title header should look like:
    =============
    * Stage 016 *
    =============
    
  • section headers should look like:
    1. Description
    --------------
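
Although these rules are not meant to be checked automatically, the two header forms are regular enough to be generated. A small sketch, with the hypothetical helper names `title_header` and `section_header`, assuming the decoration lines always match the length of the text line:

```python
def title_header(title):
    """Build a title header: the title between asterisks, framed by '=' lines."""
    body = "* %s *" % title
    rule = "=" * len(body)
    return "\n".join([rule, body, rule])


def section_header(title):
    """Build a section header: the title underlined with '-' characters."""
    return "\n".join([title, "-" * len(title)])


print(title_header("Stage 016"))
print()
print(section_header("1. Description"))
```

Running it prints the same two headers as in the examples above.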
    

I agree with the outline of the problem and suggested actions. Several additional things, in my opinion, should be discussed - however, I'm not sure that it would be possible to enforce the rules for these issues via Travis, so we'll probably have to do it "by hand".

The style of headings

=============
* Stage 016 *
=============

1. Description
--------------

vs

PDF Analyzer

Introduction

vs

Step 055_documents2TTL has the following functionality:

I suggest adopting the first one. Among other things, it is easier to navigate and is used in #194.

Wording

"Obtain datasets' metadata from Rucio ..."

vs

"PDF Analyzer is intended for extracting certain data from PDF files ..."

vs

"Checks that the given data is present in ElasticSearch ..."

The first form, "Do X", is used for commits, but isn't necessarily the best choice here.

P.S. The given examples are not exhaustive, and other forms may be present in the repository.
P.P.S. There are other problems with READMEs, such as missing sections, sections that should be removed, or READMEs missing altogether, but these are about content and not style, so they are out of the scope of this issue.

@Evildoor,
added the part about headers to the issue description as "additional rules", please check if I interpreted your idea correctly.

Talking of the wording, I don't know. It is more like regular text, not even docstrings -- which also require the "Do X" form and from which, I believe, that README must have taken its style ;) And it does not look exactly perfect, indeed; so the only suggestion I can come up with here sounds like "speak normally". The closest variant, I think, is the second one:

"PDF Analyzer is intended for extracting certain data from PDF files ..."

-- but I don't know how to express the idea, as what is "normal" is different for everyone.

Concerning the content -- or, better to say, the form -- it would be good to formulate some recommendations here, like:

  • what sections should be present in a stage README (e.g. "Description", "Requirements", "External resources", ...);
  • what information this or that section should provide.

This will help whoever decides to bring some order here, and provide a guideline for new files.

@mgolosova

added the part about headers to the issue description as "additional rules", please check if I interpreted your idea correctly.

This looks fine if we are talking specifically about stages, but incorrect regarding READMEs in general - dataflow README in #194 has several "title" headers, including the actual title one. I think that "title header" and "section header" should be replaced with something like "major header" and "minor header".

Another issue with the dataflow README is that it uses the same (major header) style for title, sections and subsections. Do we need a third/fourth type of headers? One option is to define the third one as the same as major header, but with text in caps (see "REFERENCES" in dataflow README). Note - dataflow README will probably require changes if we do something here.

wording
the only suggestion I can come up with here sounds like "speak normally"
...
but I don't know how to express the idea, as what is "normal" is different for everyone.

How about something like:

===========
* Stage X *
===========

1. Description
--------------

Stage X/this stage/this module/... does Y and Z.

as a template? Some freedom is allowed, but "Do Y" or "(stage) Does Y" are not.

Discussion moved to Trello: https://trello.com/c/xVXzf4qQ