allinurl/goaccess

Parse logs from Google Logs Explorer with GoAccess?

Opened this issue · 4 comments

semla commented

Is it possible to parse a log that looks like this?

[
  {
    "httpRequest": {
      "latency": "0.000186s",
      "protocol": "https",
      "remoteIp": "185.xxx.xx.xx",
      "requestMethod": "GET",
      "requestUrl": "https://the-site.com",
      "responseSize": "24162",
      "status": 200,
      "userAgent": "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Mobile Safari/537.36"
    },
    "insertId": "2025-10-26T18:42:4149625",
    "jsonPayload": {
      "@type": "type.googleapis.com/google.firebase.hosting.v1beta1.logging.v1.Payload",
      "billable": true,
      "contentType": "text/html; charset=utf-8",
      "hostname": "the-site.com",
      "remoteIpCity": "velizy-villacoublay",
      "remoteIpCountry": "FR"
    },
    "logName": "projects/the-project/logs/firebasehosting.googleapis.com%2Fwebrequests",
    "receiveTimestamp": "2025-10-26T18:51:52.509922791Z",
    "resource": {
      "labels": {
        "domain_name": "the-site.com",
        "project_id": "the-project",
        "site_name": "the-site"
      },
      "type": "firebase_domain"
    },
    "timestamp": "2025-10-26T18:42:41Z"
  }
]

It is exported from the Google Logs Explorer.

I think this should get you there:

goaccess access.log --log-format='{ "httpRequest": { "remoteIp": "%h", "requestMethod": "%m", "requestUrl": "%U", "responseSize": "%b", "status": "%s", "userAgent": "%u" }, "jsonPayload": { "contentType": "%M", "hostname": "%v" }, "logName": "%e", "receiveTimestamp": "%dT%t.%^" }' --date-format=%Y-%m-%d --time-format=%T --date-spec=min
[
    {
        "httpRequest": {
          "latency": "0.446137s",
          "remoteIp": "115.164.211.111",
          "requestMethod": "POST",
          "requestSize": "309",
          "requestUrl": "https://www.google.com/v1/users",
          "responseSize": "67",
          "serverIp": "10.73.0.3",
          "status": 200,
          "userAgent": "okhttp/5.0.0-alpha.2"
        },
        "insertId": "1234567890",
        "jsonPayload": {
          "@type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry",
          "backendTargetProjectNumber": "projects/123456789",
          "cacheDecision": [
            "RESPONSE_HAS_CACHE_CONTROL",
            "RESPONSE_HAS_CONTENT_TYPE",
            "REQUEST_HAS_AUTHORIZATION",
            "CACHE_MODE_USE_ORIGIN_HEADERS"
          ],
          "remoteIp": "115.164.211.111",
          "statusDetails": "response_sent_by_backend"
        },
        "logName": "projects/123456789/logs/requests",
        "receiveTimestamp": "2025-10-07T02:48:20.967052137Z",
        "resource": {
          "labels": {
            "backend_service_name": "1234567890",
            "forwarding_rule_name": "joker",
            "project_id": "123456789",
            "target_proxy_name": "1234567890",
            "url_map_name": "1234567890",
            "zone": "global"
          },
          "type": "http_load_balancer"
        },
        "severity": "INFO",
        "spanId": "1234567890",
        "timestamp": "2025-10-07T02:48:18.858256Z",
        "trace": "projects/123456789/traces/1234567890"
      }
]

I ran your command and got an error. Can you help me?

goaccess nginx.json --log-format='{ "httpRequest": { "remoteIp": "%h", "requestMethod": "%m", "requestUrl": "%U", "responseSize": "%b", "status": "%s", "userAgent": "%u" }, "jsonPayload": { "contentType": "%M", "hostname": "%v" }, "logName": "%e", "receiveTimestamp": "%dT%t.%^" }' --date-format=%Y-%m-%d --time-format=%T --date-spec=min 
Cleaning up resources...
==78697== GoAccess - version 1.9.4 - Apr  1 2025 01:15:39
==78697== Config file: /opt/homebrew/Cellar/goaccess/1.9.4/etc/goaccess/goaccess.conf
==78697== https://goaccess.io - <hello@goaccess.io>
==78697== Released under the MIT License.
==78697==
==78697== FILE: nginx.json
==78697== Parsed 10 lines producing the following errors:
==78697==
==78697== IPv4/6 is required.
==78697== IPv4/6 is required.
==78697== IPv4/6 is required.
==78697== IPv4/6 is required.
==78697== IPv4/6 is required.
==78697== IPv4/6 is required.
==78697== IPv4/6 is required.
==78697== IPv4/6 is required.
==78697== IPv4/6 is required.
==78697== IPv4/6 is required.
==78697==
==78697== Format Errors - Verify your log/date/time format

I figured it out: the file must not stay a single valid JSON array. GoAccess reads one log entry per line, so it needs one JSON object per line:

{xxx}
{xxx}
{xxx}
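
For anyone hitting the same "IPv4/6 is required" error, the conversion can be done with jq. This is a minimal sketch, assuming the export is a single top-level JSON array. The file names `export.json` and `export.ndjson` are placeholders, and the sample entries are made up for illustration:

```shell
#!/bin/bash
# Build a tiny sample export: one top-level JSON array (made-up entries).
cat > export.json <<'EOF'
[
  {"httpRequest": {"remoteIp": "192.0.2.10", "status": 200}},
  {"httpRequest": {"remoteIp": "192.0.2.11", "status": 404}}
]
EOF

# .[] emits each array element; -c prints each element compactly on one line.
jq -c '.[]' export.json > export.ndjson

# Show the result: one JSON object per line.
cat export.ndjson
```

The resulting file can then be passed to goaccess with the same --log-format as above.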

Here is a shell script that queries Google Cloud Logging with gcloud, converts the JSON result to one object per line, and runs GoAccess on it:

#!/bin/bash

# --- Variable Configuration ---

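# Start of the query window: one month ago at 00:00:00 UTC.
# Note: -v-1m is BSD/macOS date syntax.
START_TIME=$(date -v-1m -u +"%Y-%m-%dT00:00:00Z")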
END_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
MAX_ENTRIES=100

# Forwarding rule name for the load balancer
FORWARDING_RULE_NAME="foobar"

# **Resource type filter for LB access logs**
# Please select according to your LB type or use a common filter.
# Common filters include:
# resource.type="http_load_balancer"
# resource.type="https_load_balancer"
# resource.type="tcp_load_balancer"
LOG_FILTER="resource.type=\"http_load_balancer\"
            resource.labels.forwarding_rule_name=\"$FORWARDING_RULE_NAME\""





################################################################################
# Do not modify the content below
################################################################################
OUTPUT_FILE="google_cloud_logging.json"
NGINX_FILE="goaccess.json"
REPORT_FILE="report.html"



# --- Command Detection ---

# Show installation hints
show_install_hint() {
    case "$1" in
        "gcloud")
            echo ""
            echo "     brew install --cask google-cloud-sdk"
            echo "     gcloud init"
            echo "     gcloud auth login"
            ;;
        "goaccess")
            echo ""
            echo "     brew install goaccess"
            ;;
        "jq")
            echo ""
            echo "     brew install jq"
            ;;
        *)
            echo "   Please install $1 and retry"
            ;;
    esac
}

# Check if required commands exist
check_command() {
    if ! command -v "$1" &> /dev/null; then
        echo "❌ Error: Command '$1' not found"
        show_install_hint "$1"
        exit 1
    fi
}

echo "🔍 Checking required commands..."
check_command "gcloud"
check_command "goaccess"
check_command "jq"
echo "✅ All required commands are ready"


# --- Gcloud Query Command ---

echo "🚀 Starting Google Cloud Logging query..."
echo "  - Date range: $START_TIME to $END_TIME"
echo "  - Target count: $MAX_ENTRIES entries"
echo "  - Filter condition: $LOG_FILTER"

# Build complete log query expression
FULL_QUERY="$LOG_FILTER AND timestamp>=\"$START_TIME\" AND timestamp<=\"$END_TIME\""

# Execute gcloud logging read command
# --limit: Limit the number of entries returned
# --format=json: Output in JSON format for easy subsequent processing
# >: Redirect output to file
gcloud logging read "$FULL_QUERY" \
    --limit="$MAX_ENTRIES" \
    --format=json \
    --order=asc \
    > "$OUTPUT_FILE"

# --- Result Summary ---

if [ $? -eq 0 ]; then
    # gcloud writes a single JSON array, so count its elements with jq
    # instead of counting lines, which depends on how the JSON is indented.
    ACTUAL_ENTRIES=$(jq 'length' "$OUTPUT_FILE")

    echo "✅ Log export successful!"
    echo "  - File name: $OUTPUT_FILE"
    echo "  - Log entries written: $ACTUAL_ENTRIES"
    echo "  - You can now process the $OUTPUT_FILE file."
else
    echo "❌ Error: gcloud command execution failed or was interrupted."
    # Stop here so jq and goaccess do not run on an incomplete file.
    exit 1
fi


## Convert the JSON array to one JSON object per line, as GoAccess expects
jq -c '.[]' "$OUTPUT_FILE" > "$NGINX_FILE"
## Start analysis
goaccess "$NGINX_FILE" --log-format='{ "httpRequest": { "remoteIp": "%h", "requestMethod": "%m", "requestUrl": "%U", "responseSize": "%b", "status": "%s", "userAgent": "%u" }, "jsonPayload": { "contentType": "%M", "hostname": "%v" }, "logName": "%e", "receiveTimestamp": "%dT%t.%^" }' --date-format=%Y-%m-%d --time-format=%T --date-spec=min  -a -o "$REPORT_FILE"
## Open the report (macOS)
open "$REPORT_FILE"
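
Since GoAccess rejects any line without a client IP ("IPv4/6 is required"), it may help to check the converted file before generating the report, in case some entries lack one. This is a rough sketch, assuming a file with one JSON object per line. The file name `sample.json` and its entries are made up for illustration:

```shell
#!/bin/bash
# Made-up sample in one-object-per-line form; the second entry has no remoteIp.
cat > sample.json <<'EOF'
{"httpRequest":{"remoteIp":"192.0.2.10","status":200}}
{"httpRequest":{"status":502}}
{"httpRequest":{"remoteIp":"192.0.2.11","status":404}}
EOF

# select() keeps only entries whose httpRequest.remoteIp is missing or null;
# -c prints each kept entry on one line so wc -l can count them.
MISSING=$(jq -c 'select(.httpRequest.remoteIp == null)' sample.json | wc -l | tr -d ' ')
TOTAL=$(wc -l < sample.json | tr -d ' ')

echo "$MISSING of $TOTAL entries have no remoteIp"
```

If the count is not zero, those entries could be filtered out with the inverse condition (`select(.httpRequest.remoteIp != null)`) before the file is passed to goaccess.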