openresty/stapxx

semantic error: type definition 'GCcdataVar' not found

Closed this issue · 6 comments

Hi,

@agentzh

I'm using these wonderful tools to investigate a serious problem (Lua VM crashed, reason: not enough memory), but I got the error shown in the title.

./samples/lj-gc-objs.sxx -x 3508

Found exact match for libluajit: /opt/router/openresty/luajit/lib/libluajit-5.1.so.2.1.0
semantic error: type definition 'GCcdataVar' not found in '/opt/router/openresty/luajit/lib/libluajit-5.1.so.2.1.0': operator '@cast' at stapxx-96ZDvAw9/luajit.stp:63:31
        source: @define sizeof_GCcdataVar %( &@cast(0, "GCcdataVar", "/opt/router/openresty/luajit/lib/libluajit-5.1.so.2.1.0")[1] %)
                                              ^
    in expansion of macro: operator '@sizeof_GCcdataVar' at stapxx-96ZDvAw9/luajit.stp:275:27
        source:             cdatav = cd - @sizeof_GCcdataVar
                                          ^

Pass 2: analysis failed.  [man error::pass2]

./nginx/sbin/nginx -vV

nginx version: openresty/1.9.7.4
built by gcc 4.1.2 20080704 (Red Hat 4.1.2-55)
built with OpenSSL 1.0.2g  1 Mar 2016
TLS SNI support enabled
configure arguments: --prefix=/opt/router/openresty/nginx --with-cc-opt=-O2 --add-module=../ngx_devel_kit-0.2.19 --add-module=../echo-nginx-module-0.58 --add-module=../xss-nginx-module-0.05 --add-module=../ngx_coolkit-0.2rc3 --add-module=../set-misc-nginx-module-0.30 --add-module=../form-input-nginx-module-0.11 --add-module=../encrypted-session-nginx-module-0.04 --add-module=../srcache-nginx-module-0.30 --add-module=../ngx_lua-0.10.2 --add-module=../ngx_lua_upstream-0.05 --add-module=../headers-more-nginx-module-0.29 --add-module=../array-var-nginx-module-0.05 --add-module=../memc-nginx-module-0.16 --add-module=../redis2-nginx-module-0.12 --add-module=../redis-nginx-module-0.3.7 --add-module=../rds-json-nginx-module-0.14 --add-module=../rds-csv-nginx-module-0.07 --with-ld-opt=-Wl,-rpath,/opt/router/openresty/luajit/lib --with-pcre=/mnt/compile/nginx/openresty-1.9.7.4/../pcre-8.36 --with-zlib=/mnt/compile/nginx/openresty-1.9.7.4/../zlib-1.2.8 --with-openssl=/mnt/compile/nginx/openresty-1.9.7.4/../openssl-1.0.2g --with-pcre-jit --add-module=/mnt/compile/nginx/openresty-1.9.7.4/../ngx_http_dyups_module-0.2.9+ --with-http_stub_status_module --with-http_ssl_module --with-http_gzip_static_module --with-openssl-opt=enable-tlsext

Can you give me some help with this?

Thanks.

@guanglinlv What's the version of your systemtap? Try upgrading to the latest version of systemtap and elfutils? BTW, as the last resort, you can use the gdb equivalents in the nginx-gdb-utils github repo.

@agentzh

The error occurs in both my production and development environments. Yes, I actually tried to use nginx-gdb-utils in production, but it failed with the older gdb version (7.2) there, so I went back to stapxx.

The following is my development environment, CentOS Linux release 7.1.1503. These are the latest package versions according to yum.

systemtap-runtime-2.8-10.el7.x86_64
systemtap-client-2.8-10.el7.x86_64
systemtap-2.8-10.el7.x86_64
systemtap-devel-2.8-10.el7.x86_64
elfutils-libelf-0.163-3.el7.x86_64
elfutils-libs-0.163-3.el7.x86_64
elfutils-0.163-3.el7.x86_64

BTW, do you have any suggestions about the "Lua VM crashed, reason: not enough memory" error?

I investigated it following #148, but the GC size is only 200MB. Does that mean it has not hit the hard-coded memory upper limit?

So what are the possible causes? Request or response bodies that are too large or too numerous? A memory leak?

Thank you.

@guanglinlv That error means that you hit the 1G memory limit on x86_64. Possible causes are

  1. memory leak in your Lua code, like an ever growing Lua table or Lua string.
  2. intermittent aggressive allocations of many temporary GC objects that are much faster than the GC can catch up, thus hitting the ceiling.
  3. Some other C land allocations happen to take address space in the lowest 1GB address space of the nginx worker processes, squeezing further the address space that can be used by LuaJIT.
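
To get a rough feel for cause 3, one could check how much of a worker's low address space is already mapped. The following is only a minimal sketch, assuming Linux's `/proc/<pid>/maps` format and taking the 1GB boundary from the comment above. The function name `low_mapped_bytes` is hypothetical and not part of stapxx or nginx-gdb-utils.

```python
LOW_LIMIT = 1 << 30  # 1 GiB boundary, taken from the comment above (assumption)


def low_mapped_bytes(maps_text, limit=LOW_LIMIT):
    """Sum the bytes of all mappings that lie below `limit`.

    `maps_text` is the content of /proc/<pid>/maps; each non-empty line
    starts with a hexadecimal "start-end" address range.
    """
    total = 0
    for line in maps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start_s, _, end_s = fields[0].partition("-")
        start, end = int(start_s, 16), int(end_s, 16)
        # Count only the part of the mapping that is below the limit.
        if start < limit:
            total += min(end, limit) - start
    return total


# Hypothetical usage with the worker pid from the original report:
#   with open("/proc/3508/maps") as f:
#       print(low_mapped_bytes(f.read()))
```

This counts all mappings in that range, whatever created them, so it only shows how crowded the range is. It does not say which allocator is responsible.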

But we really need to get the tools working to be sure. It seems like you could just compile the latest version of gdb from source on your machine (and install it to, say, /opt/gdb)?

@agentzh I used nginx-gdb-utils in my development environment, but I also got the same GCcdataVar error.

  • gdb error
(gdb) source luajit2
luajit20.gdb  luajit21.py   
(gdb) source luajit21.py 
(gdb) lgcstat
Python Exception <class 'gdb.error'> No type named GCcdataVar.: 
Error occurred in Python command: No type named GCcdataVar.
(gdb) 
  • systemtap/elf/gdb version
systemtap-runtime-2.8-10.el7.x86_64
systemtap-client-2.8-10.el7.x86_64
systemtap-2.8-10.el7.x86_64
systemtap-devel-2.8-10.el7.x86_64
elfutils-libelf-0.163-3.el7.x86_64
elfutils-libs-0.163-3.el7.x86_64
elfutils-0.163-3.el7.x86_64
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.

@agentzh The cause was the old gcc version. It works after recompiling with gcc 4.8. Thanks a lot.

Closing it.

@guanglinlv Ah, yes, gcc older than 4.5 generates suboptimal debuginfo.
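
For anyone hitting the same error, a quick first check is the compiler version in the `built by gcc ...` line of the `nginx -V` output. Below is a minimal sketch in Python with hypothetical helper names. It assumes libluajit was built with the same compiler as nginx, which is typical for an OpenResty bundle build but not guaranteed. The 4.5 threshold is the one mentioned above.

```python
import re

MIN_GCC = (4, 5)  # threshold taken from the comment above


def gcc_version(nginx_v_output):
    """Return the gcc version found in `nginx -V` output as a tuple of ints.

    Returns None when no "built by gcc X.Y.Z" line is present.
    """
    m = re.search(r"built by gcc (\d+(?:\.\d+)*)", nginx_v_output)
    if m is None:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


def debuginfo_suspect(nginx_v_output):
    """True if the reported gcc is older than MIN_GCC."""
    ver = gcc_version(nginx_v_output)
    return ver is not None and ver[:2] < MIN_GCC
```

With the line from the original report, `built by gcc 4.1.2 20080704 (Red Hat 4.1.2-55)`, `debuginfo_suspect` returns `True`. Note that `nginx -V` prints to stderr, so the output has to be captured from there.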