How to parse nginx logs into CSV without summarizing them?
If I run
goaccess --log-format='%h %u [%d:%t %z] "%r" %s %b "%R" "%U"' \
--date-format='%d/%b/%Y' \
--time-format='%H:%M:%S' \
--output csv <"$logfile" >"$outfile"
It only gives me a summary.
But I want the logs to be parsed properly so I can feed them into Python pandas and analyze them myself.
PS:
GoAccess is limited in that:
I cannot filter and zoom in on a specific time range;
I cannot view or filter by a specific IP to see which files or URLs it accessed.
I'm not sure I understand your question. Could you clarify what you mean by 'into CSV without summarizing'?
Current behavior:
It parses the nginx access log file and outputs a summary of those logs.
Expected behavior:
It parses the nginx access log file and outputs the logs in CSV format, without summarizing them.
(This is what any parser would normally do: you input the data, and it outputs the data in the correct format. Only parse, don't summarize. GoAccess has to parse the logs first and then process the data, right? I just need the parser, not the processing and summarizing.)
For example:
I have to use a custom regex to parse the file, which is unsafe:
^(?P<ip>\S+) (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] "(?P<method>\w+) (?P<uri>\S+) (?P<protocol>\S+)" (?P<status>\d+) (?P<size>\d+) "(?P<referer>[^"]*)" "(?P<user_agent>[^"]+)"
to convert from
188.95.188.7 - [21/Aug/2025:00:10:18 +0800] "GET / HTTP/1.1" 200 27556 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.26 Safari/537.36"
188.95.188.7 - [21/Aug/2025:00:10:18 +0800] "GET /web/assets/images/footer/logo.svg?v=1751873492 HTTP/1.1" 200 1871 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.26 Safari/537.36"
188.101.249.19 - [21/Aug/2025:00:10:18 +0800] "GET / HTTP/1.1" 200 27556 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.43 Safari/537.36"
to
"ip","user","timestamp","method","uri","protocol","status","size","referer","user_agent"
"188.95.188.7","-","21/Aug/2025:00:10:18 +0800","GET","/","HTTP/1.1","200","27556","-","Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.26 Safari/537.36"
"188.95.188.7","-","21/Aug/2025:00:10:18 +0800","GET","/web/assets/images/footer/logo.svg?v=1751873492","HTTP/1.1","200","1871","-","Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.26 Safari/537.36"
"188.101.249.19","-","21/Aug/2025:00:10:18 +0800","GET","/","HTTP/1.1","200","27556","-","Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.43 Safari/537.36"
It looks like you can already convert the log to CSV, so I'm not sure what added value GoAccess would bring if it's just about converting from plain log to CSV, no counting, or any other processing involved.
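For readers who land here with the same need, the regex-based conversion described in this thread could look roughly like the sketch below. It uses only the Python standard library and the regex quoted in the example; the function name log_to_csv and the FIELDS list are made up for illustration and are not part of GoAccess. Lines that do not match the pattern (for instance malformed requests) are counted and skipped rather than guessed at.

```python
import csv
import io
import re

# Regex copied from the example in this thread; group names are the poster's.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<uri>\S+) (?P<protocol>\S+)" '
    r'(?P<status>\d+) (?P<size>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]+)"'
)

# Column order for the CSV header (hypothetical helper constant).
FIELDS = [
    "ip", "user", "timestamp", "method", "uri",
    "protocol", "status", "size", "referer", "user_agent",
]


def log_to_csv(lines, out):
    """Write one fully quoted CSV row per matching log line.

    Returns a (written, skipped) tuple so unparsed lines are not lost silently.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS, quoting=csv.QUOTE_ALL)
    writer.writeheader()
    written = skipped = 0
    for line in lines:
        match = LOG_PATTERN.match(line.rstrip("\n"))
        if match is None:
            skipped += 1
            continue
        writer.writerow(match.groupdict())
        written += 1
    return written, skipped


if __name__ == "__main__":
    sample = [
        '188.95.188.7 - [21/Aug/2025:00:10:18 +0800] "GET / HTTP/1.1" '
        '200 27556 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"\n',
    ]
    buffer = io.StringIO()
    print(log_to_csv(sample, buffer))
    print(buffer.getvalue())
```

The resulting file can then be loaded with pandas.read_csv and filtered by time range or IP there, which covers the two limitations listed in the original post.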