cisco-system-traffic-generator/trex-core

ASTF mode crashes

zyddnys opened this issue · 4 comments

I am using TRex ASTF to generate TCP traffic through a switch with two queues followed by a bottleneck link. The DWRR weights of the queues are randomly reassigned every second.
After running for a while, TRex crashes.
I am using the latest branch of TRex, compiled from this repo.
TRex is launched using: sudo ./t-rex-64 -i --astf -c 20
The profile I used to generate traffic is attached here:

from trex.astf.api import *
import argparse
import os  # used by os.path.basename() in get_profile()



class Prof1():
    def __init__(self):
        pass  # tunables

    def create_profile(self):
        # ip generator
        ip_gen_c = ASTFIPGenDist(ip_range=["1.0.0.1", "1.0.0.255"], distribution="seq")
        ip_gen_s = ASTFIPGenDist(ip_range=["2.0.0.1", "2.0.255.255"], distribution="seq")
        ip_gen_c2 = ASTFIPGenDist(ip_range=["3.0.0.1", "3.0.0.255"], distribution="seq")
        ip_gen_s2 = ASTFIPGenDist(ip_range=["4.0.0.1", "4.0.255.255"], distribution="seq")
        ip_gen = ASTFIPGen(glob=ASTFIPGenGlobal(ip_offset="1.0.0.0"),
                           dist_client=ip_gen_c,
                           dist_server=ip_gen_s)
        ip_gen2 = ASTFIPGen(glob=ASTFIPGenGlobal(ip_offset="1.0.0.0"),
                            dist_client=ip_gen_c2,
                            dist_server=ip_gen_s2)

        profile = ASTFProfile(default_ip_gen=ip_gen, cap_list=[
            ASTFCapInfo(file="../avl/delay_10_http_get_0.pcap", cps=102.0,port=8080),
            ASTFCapInfo(file="../avl/delay_10_http_post_0.pcap", cps=102.0,port=8081),
            ASTFCapInfo(file="../avl/delay_10_https_0.pcap", cps=33.0),
            ASTFCapInfo(file="../avl/delay_10_http_browsing_0.pcap", cps=179.0),
            ASTFCapInfo(file="../avl/delay_10_http_browsing_0.pcap", ip_gen=ip_gen2, cps=180.0, port=9999),


            ASTFCapInfo(file="../avl/delay_10_exchange_0.pcap", cps=64.0),

            ASTFCapInfo(file="../avl/delay_10_mail_pop_0.pcap", cps=1.2),
            ASTFCapInfo(file="../avl/delay_10_mail_pop_1.pcap", cps=1.2,port=111),
            ASTFCapInfo(file="../avl/delay_10_mail_pop_2.pcap", cps=1.2,port=112),

            ASTFCapInfo(file="../avl/delay_10_oracle_0.pcap", cps=20.0),

            ASTFCapInfo(file="../avl/delay_10_rtp_160k_0.pcap", cps=0.7),
            ASTFCapInfo(file="../avl/delay_10_rtp_160k_1.pcap", cps=0.7),
            ASTFCapInfo(file="../avl/delay_10_rtp_250k_0_0.pcap", cps=0.5),
            ASTFCapInfo(file="../avl/delay_10_rtp_250k_1_0.pcap", cps=0.5),

            ASTFCapInfo(file="../avl/delay_10_smtp_0.pcap", cps=1.85),
            ASTFCapInfo(file="../avl/delay_10_smtp_1.pcap", cps=1.85,port=26),
            ASTFCapInfo(file="../avl/delay_10_smtp_2.pcap", cps=1.85,port=27),

            ASTFCapInfo(file="../avl/delay_10_video_call_0.pcap", cps=3),

            ASTFCapInfo(file="../avl/delay_10_video_call_rtp_0.pcap", cps=7.4),

            ASTFCapInfo(file="../avl/delay_10_citrix_0.pcap", cps=11.0),

            ASTFCapInfo(file="../avl/delay_10_dns_0.pcap", cps=498.0),

            ASTFCapInfo(file="../avl/delay_10_sip_0.pcap", cps=7.4),
            ASTFCapInfo(file="../avl/delay_10_rtsp_0.pcap", cps=1.2),

        ])

        return profile

    def get_profile(self, tunables, **kwargs):
        parser = argparse.ArgumentParser(description='Argparser for {}'.format(os.path.basename(__file__)), 
                                         formatter_class=argparse.ArgumentDefaultsHelpFormatter)

        args = parser.parse_args(tunables)
        return self.create_profile()


def register():
    return Prof1()

Error message is here:

WATCHDOG: task 'Trex DP core 6' has not responded for more than 1.00897 seconds - timeout is 1 seconds

*** traceback follows ***

1       0x560eb6dfca2a ./_t-rex-64(+0x2fea2a) [0x560eb6dfca2a]
2       0x7f3a3dcc5980 /lib/x86_64-linux-gnu/libpthread.so.0(+0x12980) [0x7f3a3dcc5980]
3       0x560eb70163da ./_t-rex-64(+0x5183da) [0x560eb70163da]
4       0x560eb6d79fbf tcp_pktmbuf_alloc(unsigned char, unsigned short) + 1007
5       0x560eb6d7aa7c CUdpFlow::alloc_and_build(CMbufBuffer*) + 76
6       0x560eb6d9d267 CUdpFlow::send_pkt(CMbufBuffer*) + 39
7       0x560eb6d79660 CEmulApp::next() + 1840
8       0x560eb6d98545 CEmulApp::on_bh_rx_pkts(unsigned int, rte_mbuf*) + 101
9       0x560eb6d9d6c1 CUdpFlow::on_rx_packet(rte_mbuf*, UDPHeader*, int, int) + 193
10      0x560eb6d91fb3 CFlowTable::rx_handle_packet_udp_no_flow(CTcpPerThreadCtx*, rte_mbuf*, CHashEntry<flow_key_t>*, CSimplePacketParser&, CFlowKeyTuple&, CFlowKeyFullTuple&, unsigned int, unsigned char) + 819
11      0x560eb6d78d20 CFlowTable::rx_handle_packet(CTcpPerThreadCtx*, rte_mbuf*, bool, unsigned char) + 464
12      0x560eb6daeb53 unsigned short CFlowGenListPerThread::handle_rx_pkts<false>(bool) + 563
13      0x560eb6dad1ed CFlowGenListPerThread::handle_rx_flush(CGenNode*, bool) + 333
14      0x560eb6ddacdf CNodeGenerator::handle_slow_messages(unsigned char, CGenNode*, CFlowGenListPerThread*, bool) + 399
15      0x560eb6d7d3c5 int CNodeGenerator::flush_file_realtime<24, false>(double, double, CFlowGenListPerThread*, double&) + 1829
16      0x560eb6ddadbd CNodeGenerator::flush_file(double, double, bool, CFlowGenListPerThread*, double&) + 77
17      0x560eb6ff012d TrexAstfDpCore::start_scheduler() + 1389
18      0x560eb6f27e91 TrexDpCore::start() + 49
19      0x560eb6dcf846 CFlowGenListPerThread::start(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, CPreviewMode&) + 262
20      0x560eb6c5bafe CGlobalTRex::run_in_core(unsigned char) + 448
21      0x560eb6d802c7 ./_t-rex-64(+0x2822c7) [0x560eb6d802c7]
22      0x560eb6cf73ed ./_t-rex-64(+0x1f93ed) [0x560eb6cf73ed]
23      0x7f3a3dcba6db /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f3a3dcba6db]
24      0x7f3a3cbff61f clone + 63


*** addr2line information follows ***

hhaim commented

@zyddnys I assume that there is no issue when the setup is in loopback.
I think you are hitting a timeout (lock) issue in the buffer allocation. Either reduce the rate or disable the watchdog.
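
One possible way to reduce the rate, sketched under the assumption that the profile keeps its pcap/cps pairs in a plain list rather than inlining them, is to scale every cps value by a single factor before building the ASTFCapInfo entries. The names CAPS and scale_cps are hypothetical and are not part of the TRex API; only three entries from the profile above are shown.

```python
# Hypothetical helper, not part of TRex: scale the connections-per-second
# value of every (pcap path, cps) pair by one common factor.
CAPS = [
    ("../avl/delay_10_http_get_0.pcap", 102.0),
    ("../avl/delay_10_dns_0.pcap", 498.0),
    ("../avl/delay_10_https_0.pcap", 33.0),
]


def scale_cps(caps, factor):
    """Return a new list with each cps multiplied by factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [(path, cps * factor) for path, cps in caps]


if __name__ == "__main__":
    # Halve the offered load of every template.
    for path, cps in scale_cps(CAPS, 0.5):
        print(path, cps)
```

Each scaled pair could then be passed to ASTFCapInfo(file=path, cps=cps) inside create_profile(), keeping the traffic mix the same while lowering the total load.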

How do you change the timeout or disable watchdog?

hhaim commented

Add --no-watchdog to the CLI when you start the server.
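
For the launch command from the original report, that would look roughly like the sketch below. The build_trex_cmd helper is hypothetical and only assembles the argument list; the flags themselves are the ones quoted in this thread.

```python
# Hypothetical helper: assemble the server launch command used in this issue,
# optionally appending the --no-watchdog flag.
def build_trex_cmd(cores=20, no_watchdog=True):
    """Return the t-rex-64 launch command as an argv list."""
    cmd = ["sudo", "./t-rex-64", "-i", "--astf", "-c", str(cores)]
    if no_watchdog:
        cmd.append("--no-watchdog")
    return cmd


if __name__ == "__main__":
    # Print the command line so it can be copied into a shell.
    print(" ".join(build_trex_cmd()))
```

The list form can also be handed to subprocess.run() directly if the server is started from a script.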

No more crashes now, thanks for the help