Keyword plugin not detecting "password" keyword?
howard-adam opened this issue · 2 comments
I'm submitting a ...
- [x] bug report
- [ ] feature request
What is the current behavior?
"detect-secrets scan" misses files in the current directory or its subdirectories that contain the "password" keyword.
If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem
In a git repository, create a file ATEST.cs with just two lines:
Get Connection String
String oraConnStr = "Data Source=sis2:1521/dprod.sis2.domain.org;User ID=svcacct;Password=svcacctpass;";
What is the expected behavior?
"detect-secrets scan" should identify this file as containing secrets via the keyword plugin.
What is the motivation / use case for changing the behavior?
Please tell us about your environment:
- detect-secrets Version: 1.4.0
- Python Version: 3.11.3
- OS Version: Windows 10.0.09045.2965
- File type (if applicable): text file
Other information
I've tried running detect-secrets via pre-commit hook or straight from command line. When I tried on real-world files, the Base64HighEntropyString did have some false-positive results, but none of the files containing the keyword "Password" are being detected.
It won't detect a password unless it's enclosed in complete quotes. For example:
password=hjsadni221" won't be detected
password= "ansdna" will be detected
The tool fails to detect passwords unless they are enclosed in quotes. For example:
password=hjsadni221
is not detected, but
password="ansdna"
is detected.
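One way to picture the behavior reported above is a pattern that only accepts a quoted value after the keyword. The sketch below is an illustration only, not the regex used in keyword.py: `QUOTED_ONLY` and `is_flagged` are hypothetical names, and the real plugin may apply other rules.

```python
import re

# Hypothetical pattern, NOT copied from detect-secrets: the keyword, then
# '=' or ':', then a value wrapped in matching single or double quotes.
QUOTED_ONLY = re.compile(
    r"""password\s*[=:]\s*(?P<quote>["'])(?P<secret>[^"'\r\n]+)(?P=quote)""",
    re.IGNORECASE,
)


def is_flagged(line: str) -> bool:
    """Return True when the hypothetical quoted-only pattern matches."""
    return QUOTED_ONLY.search(line) is not None


samples = [
    'password="ansdna"',
    'password= "ansdna"',
    "password=hjsadni221",
    'password=hjsadni221"',
    'String oraConnStr = "Data Source=sis2:1521/dprod.sis2.domain.org;'
    'User ID=svcacct;Password=svcacctpass;";',
]
for sample in samples:
    # Only the first two samples carry a fully quoted value after the keyword.
    print(is_flagged(sample), sample)
```

Under this assumption, the connection string from the original report is missed for the same reason as the bare assignments: the value after `Password=` is not itself wrapped in quotes.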
Hi @geekNero, @howard-adam,
I recently encountered an issue similar to the one described above. When a password is not enclosed in quotes, it is not detected as a secret. Here is an example:
PASSWORD=aijewga@#!%^ahgfdndbks211
This password is not detected.
However, when I add quotes around the password:
PASSWORD="aijewga@#!%^ahgfdndbks211"
the tool successfully detects it as a secret.
I have reviewed the relevant keyword_test.py test file from the Yelp/detect-secrets repository. I noticed that some test cases do not cover scenarios where the password is not enclosed in quotes.
Additionally, after looking into the source code of keyword.py, I found that the implementation is based on heuristics. While the design makes sense in some contexts, I am wondering whether we could improve this behavior. Would accepting unquoted values improve password detection, or would it bring up more false positives?
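To make the trade-off concrete, here is a minimal sketch of a relaxed pattern that also accepts bare values. It is not taken from keyword.py: `RELAXED` and `extract_candidate` are hypothetical names, and the rule that a bare value ends at whitespace, `;`, or a quote is an assumption made for illustration.

```python
import re
from typing import Optional

# Hypothetical relaxed pattern, NOT from detect-secrets: the value may be
# double-quoted, single-quoted, or bare. A bare value is assumed to run
# until whitespace, ';', or a quote character.
RELAXED = re.compile(
    r"""password\s*[=:]\s*(?:"(?P<dq>[^"\r\n]+)"|'(?P<sq>[^'\r\n]+)'|(?P<bare>[^\s;"']+))""",
    re.IGNORECASE,
)


def extract_candidate(line: str) -> Optional[str]:
    """Return the captured value, or None when the pattern does not match."""
    match = RELAXED.search(line)
    if match is None:
        return None
    return match.group("dq") or match.group("sq") or match.group("bare")


lines = [
    "PASSWORD=aijewga@#!%^ahgfdndbks211",    # unquoted secret from this thread
    'PASSWORD="aijewga@#!%^ahgfdndbks211"',  # quoted secret from this thread
    "password = getPassword()",              # code reference, likely a false positive
    "password=None",                         # placeholder, likely a false positive
]
for line in lines:
    print(repr(extract_candidate(line)), "<-", line)
```

The relaxed pattern picks up the unquoted secret, but it also captures `getPassword()` and `None`, so a change along these lines would probably need extra filtering to keep false positives manageable.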
Could you share your thoughts on this? I am interested in resolving this problem. Thanks!