/fpga-network-stack

Scalable Network Stack for FPGAs (TCP/IP, RoCEv2)

Primary LanguageC++BSD 3-Clause "New" or "Revised" LicenseBSD-3-Clause

Scalable Network Stack supporting TCP/IP, RoCEv2, UDP/IP at 10-100Gbit/s

Prerequisites

  • Xilinx Vivado 2022.2
  • cmake 3.0 or higher

Supported boards (out of the box)

  • Xilinx VC709
  • Xilinx VCU118
  • Alpha Data ADM-PCIE-7V3

Git submodules

This repository uses git submodules, so do one of the following:

# When cloning:
git clone --recurse-submodules git@url.to/this/repo.git
# Later, if you forgot or when submodules have been updated:
git submodule update --init --recursive

Compiling HLS modules

  1. Create a build directory
mkdir build
cd build
  1. Configure build
cmake .. -DFNS_PLATFORM=xilinx_u55c_gen3x16_xdma_3_202210_1 -DFNS_DATA_WIDTH=64

All cmake options:

Name Values Desription
FNS_PLATFORM xilinx_u55c_gen3x16_xdma_3_202210_1 Target platform to build
FNS_DATA_WIDTH <8,16,32,64> Data width of the network stack in bytes
FNS_ROCE_STACK_MAX_QPS 500 Maximum number of queue pairs the RoCE stack can support
FNS_TCP_STACK_MSS #value Maximum segment size of the TCP/IP stack
FNS_TCP_STACK_FAST_RETRANSMIT_EN <0,1> Enabling TCP fast retransmit
FNS_TCP_STACK_NODELAY_EN <0,1> Toggles Nagle's Algorithm on/off
FNS_TCP_STACK_MAX_SESSIONS #value Maximum number of sessions the TCP/IP stack can support
FNS_TCP_STACK_RX_DDR_BYPASS_EN <0,1> Enabling DDR bypass on the RX path
FNS_TCP_STACK_WINDOW_SCALING_EN <0,1> Enalbing TCP Window scaling option
  1. Build HLS IP cores and install them into IP repository
make ip

For an example project including the TCP/IP stack or the RoCEv2 stack with DMA to host memory checkout our Distributed Accelerator OS DavOS.

Working with individual HLS modules

  1. Setup build directory, e.g. for the TCP module
cd hls/toe
mkdir build
cd build
cmake .. -DFNS_PLATFORM=xilinx_u55c_gen3x16_xdma_3_202210_1 -DFNS_DATA_WIDTH=64
  1. Run
make csim # C-Simulation (csim_design)
make synth # Synthesis (csynth_design)
make cosim # Co-Simulation (cosim_design)
make ip # Export IP (export_design)

Interfaces

All interfaces are using the AXI4-Stream protocol. For AXI4-Streams carrying network/data packets, we use the following definition in HLS:

template <int D>
struct net_axis {
	ap_uint<D>    data;
	ap_uint<D/8>  keep;
	ap_uint<1>    last;
};

TCP/IP

Open Connection

To open a connection the destination IP address and TCP port have to provided through the s_axis_open_conn_req interface. The TCP stack provides an answer to this request through the m_axis_open_conn_rsp interface which provides the sessionID and a boolean indicating if the connection was openend successfully.

Interface definition in HLS:

struct ipTuple {
	ap_uint<32>	ip_address;
	ap_uint<16>	ip_port;
};
struct openStatus {
	ap_uint<16>	sessionID;
	bool		success;
};

void toe(...
	hls::stream<ipTuple>& openConnReq,
	hls::stream<openStatus>& openConnRsp,
	...);

Close Connection

To close a connection the sessionID has to be provided to the s_axis_close_conn_req interface. The TCP/IP stack does not provide a notification upon completion of this request, however it is guranteeed that the connection is closed eventually.

Interface definition in HLS:

hls::stream<ap_uint<16> >& closeConnReq,

Open a TCP port to listen on

To open a port to listen on (e.g. as a server), the port number has to be provided to s_axis_listen_port_req. The port number has to be in range of active ports: 0 - 32767. The TCP stack will respond through the m_axis_listen_port_rsp interface indicating if the port was set to the listen state succesfully.

Interface definition in HLS:

hls::stream<ap_uint<16> >& listenPortReq,
hls::stream<bool>& listenPortRsp,

Receiving notifications from the TCP stack

The application using the TCP stack can receive notifications through the m_axis_notification interface. The notifications either indicate that new data is available or that a connection was closed.

Interface definition in HLS:

struct appNotification {
	ap_uint<16>			sessionID;
	ap_uint<16>			length;
	ap_uint<32>			ipAddress;
	ap_uint<16>			dstPort;
	bool				closed;
};

hls::stream<appNotification>& notification,

Receiving data

If data is available on a TCP/IP session, i.e. a notification was received. Then this data can be requested through the s_axis_rx_data_req interface. The data as well as the sessionID are then received through the m_axis_rx_data_rsp_metadata and m_axis_rx_data_rsp interface.

Interface definition in HLS:

struct appReadRequest {
	ap_uint<16> sessionID;
	ap_uint<16> length;
};

hls::stream<appReadRequest>& rxDataReq,
hls::stream<ap_uint<16> >& rxDataRspMeta,
hls::stream<net_axis<WIDTH> >& rxDataRsp,

Waveform of receiving a (data) notification, requesting data, and receiving the data:

signal tcp-rx-handshake

Transmitting data

When an application wants to transmit data on a TCP connection, it first has to check if enough buffer space is available. This check/request is done through the s_axis_tx_data_req_metadata interface. If the response through the m_axis_tx_data_rsp interface from the TCP stack is positive. The application can send the data through the s_axis_tx_data_req interface. If the response from the TCP stack is negative the application can retry by sending another request on the s_axis_tx_data_req_metadata interface.

Interface definition in HLS:

struct appTxMeta {
	ap_uint<16> sessionID;
	ap_uint<16> length;
};
struct appTxRsp {
	ap_uint<16> sessionID;
	ap_uint<16> length;
	ap_uint<30> remaining_space;
	ap_uint<2>  error;
};

hls::stream<appTxMeta>& txDataReqMeta,
hls::stream<appTxRsp>& txDataRsp,
hls::stream<net_axis<WIDTH> >& txDataReq,

Waveform of requesting a data transmit and transmitting the data. signal tcp-tx-handshake

RoCE (RDMA over Converged Ethernet)

Load Queue Pair (QP)

Before any RDMA operations can be executed the Queue Pairs have to established out-of-band (e.g. over TCP/IP) by the hosts. The host can the load the QP into the RoCE stack through the s_axis_qp_interface and s_axis_qp_conn_interface interface.

Interface definition in HLS:

typedef enum {RESET, INIT, READY_RECV, READY_SEND, SQ_ERROR, ERROR} qpState;

struct qpContext {
	qpState		newState;
	ap_uint<24> qp_num;
	ap_uint<24> remote_psn;
	ap_uint<24> local_psn;
	ap_uint<16> r_key;
	ap_uint<48> virtual_address;
};
struct ifConnReq {
	ap_uint<16> qpn;
	ap_uint<24> remote_qpn;
	ap_uint<128> remote_ip_address;
	ap_uint<16> remote_udp_port;
};

hls::stream<qpContext>&	s_axis_qp_interface,
hls::stream<ifConnReq>&	s_axis_qp_conn_interface,

Issue RDMA commands

RDMA commands can be issued to RoCE stack through the s_axis_tx_meta interface. In case the commands transmits data. This data can be either originate from the host memory as specified by the local_vaddr or can originate from the application on the FPGA. In the latter case the local_vaddr is set to 0 and the data is provided through the s_axis_tx_data interface.

Interface definition in HLS:

typedef enum {APP_READ, APP_WRITE, APP_PART, APP_POINTER, APP_READ_CONSISTENT} appOpCode;

struct txMeta {
	appOpCode 	op_code;
	ap_uint<24> qpn;
	ap_uint<48> local_vaddr;
	ap_uint<48> remote_vaddr;
	ap_uint<32> length;
};
hls::stream<txMeta>& s_axis_tx_meta,
hls::stream<net_axis<WIDTH> >& s_axis_tx_data,

Waveform of issuing a RDMA read request: signal roce-read-handshake

Waveform of issuing an RDMA write request where data on the FPGA is transmitted: signal roce-write-handshake

Publications

  • D. Sidler, G. Alonso, M. Blott, K. Karras et al., Scalable 10Gbps TCP/IP Stack Architecture for Reconfigurable Hardware, in FCCM’15, Paper, Slides

  • D. Sidler, Z. Istvan, G. Alonso, Low-Latency TCP/IP Stack for Data Center Applications, in FPL'16, Paper

  • D. Sidler, Z. Wang, M. Chiosa, A. Kulkarni, G. Alonso, StRoM: smart remote memory, in EuroSys'20, Paper

Citations

If you use the TCP/IP or RDMA stacks in your project please cite one of the following papers and/or link to the github project:

@inproceedings{DBLP:conf/fccm/SidlerABKVC15,
  author       = {David Sidler and
                  Gustavo Alonso and
                  Michaela Blott and
                  Kimon Karras and
                  Kees A. Vissers and
                  Raymond Carley},
  title        = {Scalable 10Gbps {TCP/IP} Stack Architecture for Reconfigurable Hardware},
  booktitle    = {23rd {IEEE} Annual International Symposium on Field-Programmable Custom
                  Computing Machines, {FCCM} 2015, Vancouver, BC, Canada, May 2-6, 2015},
  pages        = {36--43},
  publisher    = {{IEEE} Computer Society},
  year         = {2015},
  doi          = {10.1109/FCCM.2015.12}

@inproceedings{DBLP:conf/fpl/SidlerIA16,
  author       = {David Sidler and
                  Zsolt Istv{\'{a}}n and
                  Gustavo Alonso},
  title        = {Low-latency {TCP/IP} stack for data center applications},
  booktitle    = {26th International Conference on Field Programmable Logic and Applications,
                  {FPL} 2016, Lausanne, Switzerland, August 29 - September 2, 2016},
  pages        = {1--4},
  publisher    = {{IEEE}},
  year         = {2016},
  doi          = {10.1109/FPL.2016.7577319}
}

@inproceedings{DBLP:conf/eurosys/SidlerWCKA20,
  author       = {David Sidler and
                  Zeke Wang and
                  Monica Chiosa and
                  Amit Kulkarni and
                  Gustavo Alonso},
  title        = {StRoM: smart remote memory},
  booktitle    = {EuroSys '20: Fifteenth EuroSys Conference 2020, Heraklion, Greece,
                  April 27-30, 2020},
  pages        = {29:1--29:16},
  publisher    = {{ACM}},
  year         = {2020},
  doi          = {10.1145/3342195.3387519}
}

@PHDTHESIS{sidler2019innetworkdataprocessing,
	author = {Sidler, David},
	publisher = {ETH Zurich},
	year = {2019-09},
	copyright = {In Copyright - Non-Commercial Use Permitted},
	title = {In-Network Data Processing using FPGAs},
}

@INPROCEEDINGS{sidler2020strom,
	author = {Sidler, David and Wang, Zeke and Chiosa, Monica and Kulkarni, Amit and Alonso, Gustavo},
	booktitle = {Proceedings of the Fifteenth European Conference on Computer Systems},
	title = {StRoM: Smart Remote Memory},
	doi = {10.1145/3342195.3387519},
}

Contributors