logstash-plugins/logstash-filter-geoip

GeoLite database auto update error

tangguojuju opened this issue · 0 comments

1. Environment introduction:

  • Logstash version: 7.17.8
  • Configuration file: Only logstash.conf has been modified, other configuration files are by default
input {
....
}
filter {
  geoip {
    source => "proxy_protocol_addr"
    target => "address"
    add_field => {"[country_name]" => "%{[address][country_name]}" } 
    ecs_compatibility => disabled 
  }
}
output {
....
}
  • Environment variable: http_proxy="http://PROXY_IP:PROXY_PORT"

2. Start as ECS task

  • MaxMind says the database is updated Tuesday and Friday (EST)
  • The task start time is 2023/11/7 7:34 (JST)
  • The database file structure at 2023/11/7 7:36 (the database is not updated) is:
[ec2-user@ip-*-*-*-*   data]$ cd plugins/filters/geoip/
[ec2-user@ip-*-*-*-* geoip]$ ll
total 12
drwxr-sr-x 2 root root 6144   Nov  7 07:36 1699310163
drwxr-sr-x 2 root root 6144 Oct 25 11:48   CC
-rw-r--r-- 1 root root  129 Nov    7 07:36 metadata.csv
[ec2-user@ip-*-*-*-* geoip]$ ll   1699310163/
total 108644
-rw-r--r-- 1 root root       55 Nov    7 07:36 COPYRIGHT.txt
-rw-r--r-- 1 root root      162 Nov    7 07:36 elastic-geoip-database-service-agreement-LICENSE.txt
-rw-r--r-- 1 root root  8259666 Nov    7 07:36 GeoLite2-ASN.mmdb
-rw------- 1 root root  4586190 Nov    7 07:36 GeoLite2-ASN.tgz
-rw-r--r-- 1 root root   66295079 Nov  7 07:36   GeoLite2-City.mmdb
-rw------- 1 root root   32084851 Nov  7 07:36 GeoLite2-City.tgz
-rw-r--r-- 1 root root      398 Nov    7 07:36 LICENSE.txt
-rw-r--r-- 1 root root      116 Nov    7 07:36 README.txt
  • The database file structure at 2023/11/8 7:36 (the database is updated) is:
[ec2-user@ip-*-*-*-*   data]$ cd plugins/filters/geoip/
[ec2-user@ip-*-*-*-* geoip]$ ll
total 16
drwxr-sr-x 2 root root 6144   Nov  8 07:36 1699310163
drwxr-sr-x 2 root root 6144   Nov  8 07:36 1699396588
drwxr-sr-x 2 root root 6144 Oct 25 11:48   CC
-rw-r--r-- 1 root root  129 Nov    8 07:36 metadata.csv
[ec2-user@ip-*-*-*-*   logstash-consumer-ip-trans-jhc]$ ll data/plugins/filters/geoip/1699310163/
total 0
[ec2-user@ip-*-*-*-*   logstash-consumer-ip-trans-jhc]$ ll data/plugins/filters/geoip/1699396588/
total 108644
-rw-r--r-- 1 root root       55 Nov    8 07:36 COPYRIGHT.txt
-rw-r--r-- 1 root root      162 Nov    8 07:36 elastic-geoip-database-service-agreement-LICENSE.txt
-rw-r--r-- 1 root root  8259666 Nov    8 07:36 GeoLite2-ASN.mmdb
-rw------- 1 root root  4586196 Nov    8 07:36 GeoLite2-ASN.tgz
-rw-r--r-- 1 root root   66295079 Nov  8 07:36   GeoLite2-City.mmdb
-rw------- 1 root root   32084858 Nov  8 07:36 GeoLite2-City.tgz
-rw-r--r-- 1 root root      398 Nov    8 07:36 LICENSE.txt
-rw-r--r-- 1 root root      116 Nov    8 07:36 README.txt

Problem description: After checking the changes in the above dates, it was proved that the new database has been downloaded and the old database itself has been deleted, but the old database directory has not been deleted (it should be deleted under normal circumstances). What is the reason for this?

3. Check the log information:

  • Error log : 2023-11-08T07:36:50.940+09:00 { 2058 rufus-scheduler intercepted an error:
  • The complete log is as follows :
2023-11-08T07:36:28.872+09:00	[2023-11-07T22:36:28,871][INFO ][logstash.filters.geoip.downloadmanager] new database version detected? true
2023-11-08T07:36:50.812+09:00	[2023-11-07T22:36:50,811][INFO ][logstash.filters.geoip.databasemanager] geoip plugin will use database /usr/share/logstash/data/plugins/filters/geoip/1699396588/GeoLite2-City.mmdb {:database_type=>"City", :pipeline_ids=>["main"]}
2023-11-08T07:36:50.940+09:00	{ 2058 rufus-scheduler intercepted an error:     <--- Error log is here
2023-11-08T07:36:50.940+09:00	2058 job:
2023-11-08T07:36:50.941+09:00	2058 Rufus::Scheduler::EveryJob "24h" {}
2023-11-08T07:36:50.941+09:00	2058 error:
2023-11-08T07:36:50.941+09:00	2058 2058
2023-11-08T07:36:50.941+09:00	2058 Errno::ENOTEMPTY
2023-11-08T07:36:50.941+09:00	2058 Directory not empty - /usr/share/logstash/data/plugins/filters/geoip/1699310163
2023-11-08T07:36:50.942+09:00	2058 org/jruby/RubyDir.java:473:in `rmdir'
2023-11-08T07:36:50.942+09:00	2058 uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1432:in `block in remove_dir1'
2023-11-08T07:36:50.942+09:00	2058 uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1448:in `platform_support'
2023-11-08T07:36:50.942+09:00	2058 uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1431:in `remove_dir1'
2023-11-08T07:36:50.942+09:00	2058 uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1424:in `remove'
2023-11-08T07:36:50.942+09:00	2058 uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:763:in `block in remove_entry'
2023-11-08T07:36:50.942+09:00	2058 uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1481:in `postorder_traverse'
2023-11-08T07:36:50.942+09:00	2058 uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:761:in `remove_entry'
2023-11-08T07:36:50.942+09:00	2058 uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:613:in `block in rm_r'
2023-11-08T07:36:50.942+09:00	2058 org/jruby/RubyArray.java:1821:in `each'
2023-11-08T07:36:50.942+09:00	2058 uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:609:in `rm_r'
2023-11-08T07:36:50.942+09:00	2058 /usr/share/logstash/x-pack/lib/filters/geoip/database_manager.rb:197:in `block in clean_up_database'
2023-11-08T07:36:50.942+09:00	2058 org/jruby/RubyArray.java:1821:in `each'
2023-11-08T07:36:50.942+09:00	2058 /usr/share/logstash/x-pack/lib/filters/geoip/database_manager.rb:195:in `clean_up_database'
2023-11-08T07:36:50.942+09:00	2058 /usr/share/logstash/x-pack/lib/filters/geoip/database_manager.rb:132:in `execute_download_job'
2023-11-08T07:36:50.942+09:00	2058 /usr/share/logstash/x-pack/lib/filters/geoip/database_manager.rb:221:in `call'
2023-11-08T07:36:50.942+09:00	2058 org/jruby/RubyMethod.java:131:in `call'
2023-11-08T07:36:50.942+09:00	2058 /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:234:in `do_call'
2023-11-08T07:36:50.942+09:00	2058 /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:258:in `do_trigger'
2023-11-08T07:36:50.942+09:00	2058 /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:300:in `block in start_work_thread'
2023-11-08T07:36:50.942+09:00	2058 /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:299:in `block in start_work_thread'
2023-11-08T07:36:50.942+09:00	2058 org/jruby/RubyKernel.java:1442:in `loop'
2023-11-08T07:36:50.942+09:00	2058 /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:289:in `block in start_work_thread'
2023-11-08T07:36:50.942+09:00	2058 tz:
2023-11-08T07:36:50.943+09:00	2058 ENV['TZ']:
2023-11-08T07:36:50.943+09:00	2058 Time.now: 2023-11-07 22:36:50 +0000
2023-11-08T07:36:50.943+09:00	2058 scheduler:
2023-11-08T07:36:50.943+09:00	2058 object_id: 2054
2023-11-08T07:36:50.943+09:00	2058 opts:
2023-11-08T07:36:50.944+09:00	2058 {:max_work_threads=>1}
2023-11-08T07:36:50.944+09:00	2058 frequency: 0.3
2023-11-08T07:36:50.944+09:00	2058 scheduler_lock: #<Rufus::Scheduler::NullLock:0x6689703b>
2023-11-08T07:36:50.945+09:00	2058 trigger_lock: #<Rufus::Scheduler::NullLock:0x6efddd4d>
2023-11-08T07:36:50.953+09:00	2058 uptime: 86422.790969 (1d22s791)
2023-11-08T07:36:50.954+09:00	2058 down?: false
2023-11-08T07:36:50.954+09:00	2058 threads: 2
2023-11-08T07:36:50.954+09:00	2058 thread: #<Thread:0x63d380d8>
2023-11-08T07:36:50.954+09:00	2058 thread_key: rufus_scheduler_2054
2023-11-08T07:36:50.955+09:00	2058 work_threads: 1
2023-11-08T07:36:50.955+09:00	2058 active: 1
2023-11-08T07:36:50.956+09:00	2058 vacant: 0
2023-11-08T07:36:50.956+09:00	2058 max_work_threads: 1
2023-11-08T07:36:50.956+09:00	2058 mutexes: {}
2023-11-08T07:36:50.960+09:00	2058 jobs: 1
2023-11-08T07:36:50.961+09:00	2058 at_jobs: 0
2023-11-08T07:36:50.963+09:00	2058 in_jobs: 0
2023-11-08T07:36:50.964+09:00	2058 every_jobs: 1
2023-11-08T07:36:50.966+09:00	2058 interval_jobs: 0
2023-11-08T07:36:50.967+09:00	2058 cron_jobs: 0
2023-11-08T07:36:50.971+09:00	2058 running_jobs: 1
2023-11-08T07:36:50.971+09:00	2058 work_queue: 0
2023-11-08T07:36:50.971+09:00	} 2058 .
2023-11-08T10:57:09.019+09:00	[2023-11-08T01:57:09,018][WARN ][deprecation.logstash.codecs.plain][main][da2e20003ed2e191044ee5cc10e73f1c2a54b163c31804c15f3d6b2fd908f8fd] Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.

4. My question:

  1. Why is the old database directory not deleted?
  2. How to solve the rufus-scheduler intercepted an error error in the log?
  3. I think the error is caused by resources being occupied by other threads when deleting the old database directory. Do I need to modify any configuration files?