TPC-Council/HammerDB

datagenrun in autopilot-script stops data generation after a few seconds without error

Opened this issue · 9 comments

ek1220 commented

Describe the bug
Running a script that generates data with datagenrun via hammerdbcli auto stops a few seconds after starting, without any error or message.
The same commands entered interactively in hammerdbcli finish successfully.

To Reproduce
Steps to reproduce the behavior:

  1. create a file with these lines:
    dbset bm TPROC-C
    dgset warehouse 600
    dgset vu 10
    dgset directory "/SPACE/TPCC_DATA/600WH"
    print datagen
    datagenrun
  2. run the commands using './hammerdbcli auto file-datagen.tcl'
  3. In my case it stops after a few seconds and the logfile ends like this:
    Vuser 11:Opened File /SPACE/TPCC_DATA/600WH/order_line_10.tbl
    Timestamp 11 @ Fri Sep 27 04:59:08 PDT 2024
    Vuser 11:Generating Warehouse
    Timestamp 11 @ Fri Sep 27 04:59:08 PDT 2024
    Vuser 11:Generating Stock Wid=541

Expected behavior
Expected to behave the same way as other scripts using auto

HammerDB Version (please complete the following information):

  • Version: 4.11
  • Build: Release download

HammerDB Interface (please complete the following information):

  • UI: CLI

Operating System (please complete the following information):

  • Server OS: Oracle Linux 9
  • Client OS: Oracle Linux 9

Database Server (please complete the following information):

  • Database name: Oracle
  • Database Release: 19.23

Database Client (please complete the following information):

  • Database client name: NA

Yes, what is happening here is that the main thread starts the worker threads and then falls off the end of the script, which causes them to stop. The script needs the keepalive command to ensure the main thread waits until they have finished.

Thank you, that works. Unfortunately keepalive is not documented. I'm still using waittocomplete in other scripts.

I replied too early. It only works partway: the command stopped after generating around 63GB of data.
This is what I did:
./hammerdbcli auto file_datagen.tcl
This is the content of file_datagen.tcl:
dbset bm TPROC-C
dgset warehouse 1600
dgset vu 40
dgset directory "/TPCC_DATA/1600WH"
print datagen
datagenrun
keepalive

Hi, yes, it is all in the documentation: runtimer and waittocomplete were deprecated at v4.6 and made automatic:
https://www.hammerdb.com/docs4.6/ch01s01.html#d0e44

If you run them you should get a message that they are deprecated:

hammerdb>runtimer
runtimer command has been deprecated and is not required for version v4.12

hammerdb>waittocomplete
waittocomplete command has been deprecated and is not required for version v4.12

hammerdb>

The replacement, keepalive, is also documented: https://www.hammerdb.com/docs/ch09s03.html

Note that the runtimer and waittocomplete parameters have been deprecated from v4.6. For this reason an additional configuration parameter of keepalive_margin with a default value of 10 seconds increasing to 60 seconds from v4.10 has been added to generic.xml in the commandline section to modify the additional time that HammerDB will wait after completion before terminating the workload. This can be useful if for example gathering timing data for event driven scaling workloads with a large number of asynchronous clients.

So now that it is automated, it is controlled by the generic parameter keepalive_margin, which determines how long it will wait. For example, the following changes the default of 1 minute (from v4.10) to 20 minutes:

giset commandline keepalive_margin 1200
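
The margin is given in seconds, so 20 minutes becomes 1200. As a rough sketch of that arithmetic only, the shell function below converts an estimated duration in minutes into the corresponding giset line. The function name keepalive_margin_line is hypothetical and is not part of HammerDB; the output would still have to be placed in a HammerDB script by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of HammerDB): convert an estimated
# duration in minutes into a keepalive_margin value in seconds and
# print the matching giset line for use in a HammerDB script.
keepalive_margin_line() {
    minutes=$1
    seconds=$((minutes * 60))
    echo "giset commandline keepalive_margin $seconds"
}

# 20 minutes, as in the example above.
keepalive_margin_line 20
# prints: giset commandline keepalive_margin 1200
```

The printed line can then be added near the top of the script passed to hammerdbcli auto.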

Thank you, so I actually have to know the maximum time to wait in advance?
For example, 3200 warehouses take 35 minutes to finish, depending on storage.

I checked my scripts and I actually use the following for tests where I don't know when they will end:

global complete
proc wait_to_complete {} {
global complete
set complete [vucomplete]
if {!$complete} { after 5000 wait_to_complete } else { exit }
}
dbset db ora
...
buildschema
wait_to_complete
vwait forever

Is this still good practice?

No, you don't need that for the current releases. There are examples in the scripts directory for all databases that can be run with Tcl or Python, and these are the best templates for what works, e.g.:

#!/bin/tclsh
# maintainer: Pooja Jain

puts "SETTING CONFIGURATION"
dbset db maria
dbset bm TPC-C

diset connection maria_host localhost
diset connection maria_port 3306
diset connection maria_socket /tmp/mariadb.sock

set vu [ numberOfCPUs ]
set warehouse [ expr {$vu * 5} ]
diset tpcc maria_count_ware $warehouse
diset tpcc maria_num_vu $vu
diset tpcc maria_user root
diset tpcc maria_pass maria
diset tpcc maria_dbase tpcc
diset tpcc maria_storage_engine innodb
if { $warehouse >= 200 } {
    diset tpcc maria_partition true
} else {
    diset tpcc maria_partition false
}
puts "SCHEMA BUILD STARTED"
buildschema
puts "SCHEMA BUILD COMPLETED"

There are also shell and PowerShell scripts that will do a build, check, run, delete and report with one command.

export TMP=`pwd`/TMP
mkdir -p $TMP

echo "BUILD HAMMERDB SCHEMA"
echo "+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-"
./hammerdbcli auto ./scripts/tcl/maria/tprocc/maria_tprocc_buildschema.tcl 
echo "+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-"
echo "CHECK HAMMERDB SCHEMA"
echo "+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-"
./hammerdbcli auto ./scripts/tcl/maria/tprocc/maria_tprocc_checkschema.tcl
echo "+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-"
echo "RUN HAMMERDB TEST"
echo "+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-"
./hammerdbcli auto ./scripts/tcl/maria/tprocc/maria_tprocc_run.tcl 
echo "+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-"
echo "DROP HAMMERDB SCHEMA"
./hammerdbcli auto ./scripts/tcl/maria/tprocc/maria_tprocc_deleteschema.tcl
echo "+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-"
echo "HAMMERDB RESULT"
./hammerdbcli auto ./scripts/tcl/maria/tprocc/maria_tprocc_result.tcl 

Thank you, I'll adjust my scripts.
Is keepalive_margin the only way to keep the main process going? I actually found wait_to_complete quite convenient, as I did not have to estimate the time for buildschema.

waittocomplete is now the hidden command _waittocomplete and is called automatically by schema builds and TPROC-H runs. It waits until vucomplete returns true, so in the rare event that a virtual user does not complete, it can wait forever. keepalive is more for vurun, where it waits for the rampup + test duration + keepalive margin and will kill any virtual users that exceed this time.
So yes, for datagen _waittocomplete is probably better, so I will change the PR. This will keep datagen running until vucomplete returns true. You shouldn't need either now when writing scripts, as HammerDB will do it for you. (The GUI and the interactive CLI are already in the event loop, so they don't need either.)

ok, thank you.