Based on baseline logs, LogJuicer highlights the useful text in target logs. The goal is to save time when looking for the root cause of a failure.
LogJuicer implements a custom diffing process to compare logs:
- A tokenizer removes random words.
- Lines are converted into feature vectors using the hashing trick.
- The logs are compared using cosine similarity.
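The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not LogJuicer's implementation: the tokenizer rule (drop any token containing a digit), the feature space size, the use of MD5 as the hash function, and the 0.5 threshold are all assumptions made for the example.

```python
import hashlib
import math
import re

N_FEATURES = 2 ** 16  # size of the hashed feature space (arbitrary choice)


def tokenize(line):
    """Toy tokenizer: keep alphabetic tokens, drop dates, ids and counters."""
    return [w.lower() for w in re.findall(r"[A-Za-z0-9]+", line) if w.isalpha()]


def vectorize(line):
    """Hashing trick: map each token to a bucket index without a vocabulary."""
    vec = {}
    for token in tokenize(line):
        digest = hashlib.md5(token.encode()).digest()
        idx = int.from_bytes(digest[:4], "little") % N_FEATURES
        vec[idx] = vec.get(idx, 0.0) + 1.0
    return vec


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(i, 0.0) for i, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def anomalies(baseline, target, threshold=0.5):
    """Return target lines whose best match in the baseline is below threshold."""
    known = [vectorize(line) for line in baseline]
    result = []
    for line in target:
        vec = vectorize(line)
        if not vec:
            continue  # nothing left after tokenization, skip the line
        best = max((cosine_similarity(vec, k) for k in known), default=0.0)
        if best < threshold:
            result.append(line)
    return result
```

Because the tokenizer discards timestamps and numeric identifiers, two lines that differ only in those fields produce the same vector and are treated as already known.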
LogJuicer features a discovery mechanism that automatically finds the source of the diff (called the baseline) for some targets:
- A service.log file will be compared with the last service.log-YYYYDDMM.
- CI builds baselines may be found through the external service API.
When the baseline discovery fails, the diff's source must be provided.
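The rotated-file case can be illustrated with a small Python sketch. The function name `find_rotated_baseline` is hypothetical, and picking the "last" rotated copy by modification time is an assumption; LogJuicer may well order the candidates differently.

```python
import glob
from pathlib import Path
from typing import Optional


def find_rotated_baseline(target: Path) -> Optional[Path]:
    """Return the most recently modified sibling named '<target name>-<suffix>'."""
    pattern = glob.escape(target.name) + "-*"
    candidates = [p for p in target.parent.glob(pattern) if p.is_file()]
    # Assumption: the modification time decides which rotated copy is the last one.
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

When no rotated copy exists the function returns `None`, which corresponds to the case where the baseline has to be provided explicitly.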
Install the logjuicer command line by running:
$ cargo install --locked --git https://github.com/logjuicer/logjuicer logjuicer-cli
If you don't have cargo, see the Rust installation documentation.
Or grab the latest release assets logjuicer-x86_64-linux.tar.bz2 from https://github.com/logjuicer/logjuicer/releases
Analyze a local file:
$ logjuicer path /var/log/zuul/scheduler.log
Analyze a remote URL:
$ logjuicer url https://zuul/build/uuid
Compare two inputs (when baseline discovery doesn't work):
$ logjuicer diff https://zuul/build/success-build https://zuul/build/failed-build
Save and re-use a trained model using the --model file-path argument.
LogJuicer can create a static report for archival purposes using the --report argument:
- .bin or .gz files are created along with a .html viewer to be displayed in a web browser. Add the --open argument to load the report with xdg-open.
- .json files are regular JSON exports.
For example, run the following command to visualize the differences between two directories:
$ logjuicer --open --report report-case-01.bin.gz diff sosreport-success/ sosreport-failled/
LogJuicer supports Ant-style fileset configuration to filter the processed files:
- includes: list of file regexes that must be included. Defaults to all files.
- excludes: list of file regexes that must be excluded. Defaults to the default excludes, or none if default_excludes is false.
- default_excludes: indicates whether the default excludes should be used or not.
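As an illustration of how these three options interact, here is a minimal Python sketch. The `DEFAULT_EXCLUDES` values and the `is_selected` helper are hypothetical, patterns are matched with `re.search`, and an explicit excludes list is assumed to replace the defaults, which is one reading of the description above.

```python
import re

# Hypothetical defaults for illustration; the tool's real default excludes
# are not listed in this document.
DEFAULT_EXCLUDES = [r"(^|/)\.git/", r"\.png$"]


def is_selected(path, includes=None, excludes=None, default_excludes=True):
    """Return True when the path matches an include and no active exclude."""
    # An empty or missing includes list means every file is a candidate.
    if includes and not any(re.search(p, path) for p in includes):
        return False
    if excludes:
        active = excludes
    elif default_excludes:
        active = DEFAULT_EXCLUDES
    else:
        active = []
    return not any(re.search(p, path) for p in active)
```

For example, `is_selected("logs/big_file", excludes=["big_file"])` is false, while the same path is selected when no excludes are given.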
LogJuicer supports custom ignore patterns to silence noisy lines:
- ignore_patterns: list of log line regex to be ignored.
Add custom extra baselines, for example to include files that are skipped in the artifacts of successful builds:
- extra_baselines: list of file paths or remote URLs.
The configuration can be defined per target, for example:
- job_matcher: "^my-job[0-9]+$"
  config:
    excludes: [big_file]
    ignore_patterns:
      - get logs
      - fetch debug
Use the logjuicer debug-config JOB FILE LINE command to validate the ignore_patterns config.
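To make the per-target lookup concrete, the following Python sketch mirrors the example configuration as plain data and checks whether a line would be silenced for a given job. It is not how debug-config is implemented; the helper names, the "first matching entry wins" rule, and the use of `re.search` are assumptions.

```python
import re

# The example configuration above, written as Python data.
CONFIG = [
    {
        "job_matcher": r"^my-job[0-9]+$",
        "config": {
            "excludes": ["big_file"],
            "ignore_patterns": ["get logs", "fetch debug"],
        },
    }
]


def config_for_job(job_name, entries):
    """Return the config of the first entry whose job_matcher matches, else {}."""
    for entry in entries:
        if re.search(entry["job_matcher"], job_name):
            return entry["config"]
    return {}


def is_ignored(job_name, line, entries=CONFIG):
    """Return True when a line matches one of the job's ignore_patterns."""
    patterns = config_for_job(job_name, entries).get("ignore_patterns", [])
    return any(re.search(p, line) for p in patterns)
```

With this data, a line containing "get logs" is ignored for my-job42 but not for a job named other-job, since the matcher only accepts my-job followed by digits.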
To read more about the project:
- Initial presentation blog post
- The command line specification: ./doc/adr/0001-architecture-cli.md
- How the tokenizer works: Improving LogJuicer Tokenizer
- How the nearest neighbor works: Implementing LogJuicer Nearest Neighbors
- How the log file iterator works: Introducing the BytesLines Iterator
- Completing the first release of LogJuicer
- How the web interface works: WASM based web interface
- The report file format: Leveraging Cap'n Proto For LogJuicer Reports
Clone the project and run tests:
git clone https://github.com/logjuicer/logjuicer && cd logjuicer
cargo test && cargo fmt && cargo clippy
Run the project:
cargo run -p logjuicer-cli -- --help
Activate tracing debug:
export LOGJUICER_LOG="logjuicer_model=debug,logjuicer_cli=debug"
# Create a chrome trace that can be viewed in web browser with `chrome://tracing`
export LOGJUICER_TRACE=./chrome.trace
Check out the web crate to develop the web interface.
Join the project Matrix room: #logjuicer:matrix.org.
- Detect jenkins url
- Reports minification
- Web service deployment