nodejs/node-report

Please provide a way to remove or filter printed env vars

hulkish opened this issue · 7 comments

For example, currently all env vars are reported:

================================================================================
==== System Information ========================================================

Environment variables
  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
  SOME_API_KEY=abc123
  ...etc

Per this example, SOME_API_KEY is actually considered sensitive and we don't want it printed to console. I think there should be a way to hide these.

Relevant bit of code that currently reports the env vars:

#ifdef _WIN32
out << "\nEnvironment variables\n";
LPTSTR lpszVariable;
LPTCH lpvEnv;
// Get pointer to the environment block
lpvEnv = GetEnvironmentStrings();
if (lpvEnv != nullptr) {
// Variable strings are separated by null bytes, and the block is terminated by a null byte.
lpszVariable = reinterpret_cast<LPTSTR>(lpvEnv);
while (*lpszVariable) {
out << " " << lpszVariable << "\n", lpszVariable;
lpszVariable += lstrlen(lpszVariable) + 1;
}
FreeEnvironmentStrings(lpvEnv);
}
#else
out << "\nEnvironment variables\n";
int index = 1;
char* env_var = *environ;
while (env_var != nullptr) {
out << " " << env_var << "\n";
env_var = *(environ + index++);
}

Some things to discuss:

  • Should this be implemented as a blacklist (i.e. user lists the env vars to exclude) or a whitelist (i.e. user lists the env vars to include)?
  • Should the excluded env vars be omitted entirely, or included with the value replaced, e.g. SOME_API_KEY=[redacted]?

@richardlau how about we introduce 2 new env vars: NODE_REPORT_ENV_WHITELIST. NODE_REPORT_ENV_BLACKLIST.

I think we just provide both options like this. Then, we just indicate that whitelist behavior will override blacklist behavior.

Then, we just auto santize each "obscured" env var by logging it as MY_ENV_VAR=********. This way we're still serving the original purpose and.not polluting logs with sensitive data.

The blacklist I understand, and printing it with some kind of replaced value makes sense to me, but what's the use case for the whitelist?

@sam-github Well, if you consider the motive for this request - it's about security. How you decide on whitelist or blacklist depends on how important this is to your situation.

A blacklist is a strategy that says "allow everything - except these".

Whitelist is more confined & restrictive:

  • For example, if you're building a node service involving transactions - you probably don't want to print "all" env vars anyway.
  • You may feel it's more "safe" to only ever log the things you know for sure are safe for logs.
  • Maybe you only ever care about 2-3 env vars and don't want the noise of others?

I wasn't clear on how the white and black list interacted. As I understand you know, the black list will be completely ignored if a white list is present.

Points 1 and 3 above I don't think should be goals of node-report. node-report is verbose by design (I believe), it doesn't have a way to add/remove sections, or increase and decrease verbosity.

Point 2 I find mildly compelling. node-reports aren't public, people concerned about the report output can post-process them to remove anything they want somewhere between disk and whatever log storage they use. Still, white lists look easy to implement when only one of a white or black list has to be considered.

Are you interested in implementing this feature, or hoping someone else will?

@sem-GitHub i must admit, i am a green horn with c++ - however, if u can point me in right direction...id be happy to take a stab at it

Richard pointed at the relevant code above. I guess you'd have to change it to get the env vars you propose, and check if the env var that is about to be printed is white or black listed.