docs: improve section on `private` steps
One-liner
The guideline on `private` steps is not up to date. We should review and update it to match our current data flow.
Context & details
In #2947, we worked on improving the sections of our docs that cover private steps. However, some fragments are still in doubt, in particular those relating to the metadata fields `isPrivate` and `nonRedistributable`, and to whether datasets are published to GitHub. These doubts were raised by Pablo in his review (here, or here).
The current text in our docs was based on the following issue-closing comment in #2631 (comment):
Turns out this was a false alarm due to a misunderstanding of the meaning of something being "private".
In the ETL, private means that the general public cannot access those files, except when they are published as indicators in the `grapher://` step. At that stage, anything private should be marked as `nonRedistributable` in the metadata.
In Grapher, datasets marked as `!isPrivate && !nonRedistributable` are automatically re-published to Github. If something is `!nonRedistributable`, it means CSV download is available with Grapher.
This means `!isPrivate` should probably be renamed `publishToGithub`, and it should be `false` any time `nonDistributable` is `true`.
Originally posted by @larsyencken in #2631 (comment)
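To make the rules in the quoted comment easier to review, here is a minimal Python sketch of them. The `DatasetFlags` class and its property names are hypothetical and do not exist in the ETL or Grapher code; only the two flags and the boolean conditions come from the comment above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetFlags:
    """Hypothetical container for the two metadata flags discussed above."""

    is_private: bool
    non_redistributable: bool

    @property
    def published_to_github(self) -> bool:
        # Quoted rule: datasets with `!isPrivate && !nonRedistributable`
        # are automatically re-published to GitHub.
        return not self.is_private and not self.non_redistributable

    @property
    def csv_download_available(self) -> bool:
        # Quoted rule: `!nonRedistributable` means CSV download is
        # available in Grapher.
        return not self.non_redistributable


# A dataset that is not private but is non-redistributable:
flags = DatasetFlags(is_private=False, non_redistributable=True)
print(flags.published_to_github, flags.csv_download_available)
```

Under this reading, a non-redistributable dataset is never published to GitHub, regardless of `isPrivate`, which seems to be the point of the proposed rename.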
TODO
- Public relies on a private snapshot?
- Snapshot has the flag, but doesn't have the prefix in the DAG.
- `non_redistributable` vs. private (create a 2x2 matrix)
  - true true: all private, makes sense
  - true false: private in Grapher, public in ETL. Doesn't make sense?
  - false false: all public, makes sense
  - false true: we only allow for a slice of the data to be downloadable, makes sense
- Making data private in staging. Does it make sense?
- `isPrivate` flag in Grapher `dataset` table.
- Use Enum instead of boolean flags (`'fully public'`, `'partly private'`, etc.)
- Tooling for converting between `public` and `private`
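As a starting point for the 2x2 matrix and the Enum idea in the TODO list, here is a rough Python sketch. Everything in it is an assumption for discussion: the `Visibility` enum, the `classify` function, the `'fully private'` label (the list only names `'fully public'` and `'partly private'`), and the choice to raise an error for the combination flagged above as possibly not making sense.

```python
from enum import Enum


class Visibility(Enum):
    """Hypothetical replacement for the two boolean flags."""

    FULLY_PUBLIC = "fully public"
    PARTLY_PRIVATE = "partly private"
    FULLY_PRIVATE = "fully private"  # assumed label, not named in the TODO list


def classify(non_redistributable: bool, private: bool) -> Visibility:
    """Map the (non_redistributable, private) pair to a single value."""
    if non_redistributable and private:
        # true true: all private
        return Visibility.FULLY_PRIVATE
    if not non_redistributable and not private:
        # false false: all public
        return Visibility.FULLY_PUBLIC
    if not non_redistributable and private:
        # false true: only a slice of the data is downloadable
        return Visibility.PARTLY_PRIVATE
    # true false: private in Grapher, public in ETL. Rejecting it here is
    # one possible choice; the issue leaves this case open.
    raise ValueError("non_redistributable=True with private=False is not supported")


for pair in [(True, True), (False, False), (False, True)]:
    print(pair, classify(*pair).value)
```

A single enum would make the unsupported combination impossible to express in metadata, instead of relying on validation of two independent booleans.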