allinurl/goaccess

Caddy with custom logs not accepted

Closed this issue · 5 comments

I have a log parsing problem that I can't solve. I managed to reduce it to the attached log files.

error.log is not parsed by goaccess and produces this error:

==1== GoAccess - version 1.9.4 - Apr 18 2025 23:18:58
==1== Config file: /etc/goaccess/goaccess.conf
==1== https://goaccess.io - <hello@goaccess.io>
==1== Released under the MIT License.
==1==
==1== FILE: /caddylog/error.log
==1== Parsed 2 lines producing the following errors:
==1==
==1== Token '' doesn't match specifier '%s'
==1== Token '' doesn't match specifier '%s'
==1==
==1== Format Errors - Verify your log/date/time format
==1== Use --invalid-requests option to store such lines in a file.

1ok2ignore.log is accepted, but the output only shows the request from this file and not from the other, meaning that the two other files are still ignored.

WebSocket server ready to accept new client connections

So there must be something wrong with my custom log format, but I can't figure out what. Here it is:

# Log parsing
date-format %Y/%m/%d
time-format %H:%M:%S
log-format %d\ %t.%^%^%^{\"request\":\ {\"remote_ip\":\ \"%^\",\ \"client_ip\":\ \"%h\",\ \"proto\":\ \"%H\",\ \"method\":\ \"%m\",\ \"host\":\ \"%v\",\ "uri":\ \"%U\"},\ \"bytes_read\":\ \"%^\",\ \"user_id\":\ \"%^\",\ \"duration\":\ \"%T\",\ \"size\":\ \"%b\",\ \"status\":\ %s}\t

It makes no difference with or without the \t at the end.

My full goaccess.conf:

# GoAccess main configuration
tz Europe/Copenhagen
addr 0.0.0.0
port {{ app_port }}
ws-url ws://{{internal_dns_name}}/{{subpath}}/ws

# Report
html-report-title Full Caddy Web Log Report

# Log parsing
date-format %Y/%m/%d
time-format %H:%M:%S

log-format %d\ %t.%^%^%^{\"request\":\ {\"remote_ip\":\ \"%^\",\ \"client_ip\":\ \"%h\",\ \"proto\":\ \"%H\",\ \"method\":\ \"%m\",\ \"host\":\ \"%v\",\ "uri":\ \"%U\"},\ \"bytes_read\":\ \"%^\",\ \"user_id\":\ \"%^\",\ \"duration\":\ \"%T\",\ \"size\":\ \"%b\",\ \"status\":\ %s}\t

# Realtime HTML
real-time-html true

# GeoIP (new - but needs to be downloaded and mounted)
geoip-database /usr/share/GeoIP/GeoLite2-Country.mmdb

# GeoIP (legacy)
# geoip-database /usr/share/GeoIP/GeoIP.dat

# Output
output /srv/report/index.html

I'm running the Docker version of goaccess with these parameters:
["/caddylog/error.log", "--config-file=/etc/goaccess/goaccess.conf", "--real-time-html", "--no-strict-status"]
or
["/caddylog/1ok2ignore.log", "--config-file=/etc/goaccess/goaccess.conf", "--real-time-html", "--no-strict-status"]

and I'm mounting my goaccess.conf on top of /etc/goaccess/goaccess.conf

I hope someone may have some insights.

PS: I am not using my.example.com in my real log files, but I changed the names to anonymise a bit.

PS: I don't know why my Caddy access logs are a mix of plain-text lines and JSON. I never configured that, apart from creating logging blocks in my Caddyfile.

Update: Half of solving a problem is describing it and recreating it in a minimized way.

I discovered that if I change the status value in error.log to 0 instead of 200 the log line was accepted. With a few random samples (0, 2, 4, 5, 49, 50, 20, 200, 400, 404, 500) it appears that %s accepts 0 or 2 digits, but not 3.

So I'm wondering if the problem really is that the log parser considers 200} as a token?
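
The status experiment above can be repeated more systematically. The following is only a minimal sketch, assuming each log line ends with a numeric `"status": N}` field as the log format suggests. The helper names `with_status` and `variants` are hypothetical and not part of GoAccess or Caddy.

```python
import re

# Matches the trailing numeric status, e.g. `"status": 200}` at the end of a line.
# Assumption: the status is the last field of the JSON part, as in the log format above.
STATUS_TAIL = re.compile(r'("status":\s*)\d+(\s*\}\s*)$')


def with_status(line: str, status: int) -> str:
    """Return a copy of the line with its trailing status value replaced."""
    new_line, count = STATUS_TAIL.subn(
        lambda m: f"{m.group(1)}{status}{m.group(2)}", line
    )
    if count != 1:
        raise ValueError("line does not end with a numeric status field")
    return new_line


def variants(line: str, statuses):
    """Build one test line per status value, keyed by that value."""
    return {s: with_status(line, s) for s in statuses}
```

Writing each variant to its own file and running goaccess on it with the same configuration should show which status lengths are rejected.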

@grasdk are you able to change your Caddy config to output JSON instead of a mixed format? I think you can specify it as format json, e.g.:

format json {
    time_format iso8601
}

Hi.

Yes, I am able to do that, but then it will only show data from the time of the change, and I have a couple of years of logs that could be interesting to view.

I think the log format is very close, as all the other values are parsed correctly. But perhaps there is an issue with %s and JSON. I will try changing the order so that %s and the status value come earlier. Not that I can get Caddy to do that, but just to test whether the problem is with %s or with the placement.

With your question about changing the log format, are you saying that making it work as-is is a lost cause?
In that case I might work on converting the existing logs instead.

Best regards

I'd do:

sed -E 's/^([0-9]{4}\/[0-9]{2}\/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}).*(\{"request":.*),([ ]*)"([^"]*)"[ ]*:[ ]*([0-9]+|"[^"]*")([ ]*)\}/\2,\3"\4": \5\6, "date_time": "\1"}/' 1ok2ignore.log | goaccess - --log-format='{"request": { "remote_ip": "%h", "proto": "%H", "method": "%m", "host": "%v", "uri": "%U", "bytes_read": "%b"}, "user_id": "%e", "duration": "%T", "status": "%s", "date_time": "%d %t.%^" }' --date-format='%Y/%m/%d' --time-format='%H:%M:%S'
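
The sed one-liner above is dense, so here is roughly the same transformation as a Python sketch. It assumes each line starts with a `YYYY/MM/DD HH:MM:SS.mmm` timestamp and contains a JSON object beginning with `{"request":`, as the sed expression implies. Unlike sed, it re-serializes the JSON, so whitespace may differ from the input. `convert_line` is a hypothetical helper name; `date_time` is the key used in the command above.

```python
import json
import re
from typing import Optional

# Timestamp prefix written in front of the JSON part, e.g. "2025/04/18 23:18:58.123".
# The layout is taken from the sed expression above; adjust it if your lines differ.
PREFIX = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")


def convert_line(line: str) -> Optional[str]:
    """Return the line as one JSON object with a date_time field, or None if it does not match."""
    m = PREFIX.match(line)
    start = line.find('{"request":')
    if m is None or start == -1:
        return None
    try:
        record = json.loads(line[start:])
    except json.JSONDecodeError:
        # Malformed or truncated JSON: skip the line instead of guessing.
        return None
    record["date_time"] = m.group(1)
    return json.dumps(record)
```

To use it as a filter, iterate over `sys.stdin`, print every result that is not `None`, and pipe the output into `goaccess -` with the JSON log format shown in the command above.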


@allinurl Thanks. I did a similar change, but changed the log format into the format that I reconfigured my caddy instance to use.

About that: In my early days of using Caddy, I made a template that had this log block:

	log {
		output file /log/https.{{external_dns_name}}.log
		format filter {
			wrap console
			fields {
				request>remote_port delete
				request>headers delete
				request>tls delete
				resp_headers delete
				request>uri query {
					replace access_token REDACTED
				}
			}
		}
	}

Changing the line with wrap console to wrap json made all subsequent logs pure JSON (I also changed the log file names, so I have old logs and new logs).

After that, I applied a solution similar to yours, to convert the old logs to the pure json format.

So far so good, my immediate problem is solved.

I'm still a bit puzzled why it would parse "status":0} in my mixed format, but not "status":200}.