Shared memory performance test
nmhx opened this issue · 13 comments
Hi, I ran some tests on shared memory performance.
I wrote a publisher demo and a subscriber demo that send data and record the run time, to compare shared memory performance with socket performance, but the run times are almost the same.
I select the transport mode by editing $ROS_ETC_DIR/transport_mode.yaml.
Size of each message | Number of frames | Run time (shared memory) | Run time (socket) |
---|---|---|---|
30M | 1024 | 83s | 84s |
20M | 1024 | 53.5s | 53s |
15M | 1024 | 39.4s | 44.4s |
10M | 1024 | 26.4s | 26.4s |
5M | 1024 | 13s | 13s |
Could you share the source code of your test? We will double-check the numbers.
I launch roscore, then launch the subscriber demo and the publisher demo.
message type
message DemoTime {
optional uint32 id = 2;
optional bytes data = 1;
}
publish demo
//main.cc
#include "modules/demo/demo.h"
APOLLO_MAIN(::apollo::demo::Demo);
// demo.cc
#include <chrono>
#include <thread>
#include <cstring>   // memset
#include <ctime>
#include <iostream>  // std::cout
#include <sys/time.h>
#include "modules/demo/demo.h"
namespace apollo {
namespace demo {
using apollo::common::adapter::AdapterManager;
using apollo::demo::DemoTime;
using apollo::common::Status;
#define BUFF_SIZE (1024 * 1024 * 10)
#define FRAMES_NUM 1024
void Demo::TestPublish() {
// ros::Rate rate(10);
uint32_t id = 0;
char *buf = new char[BUFF_SIZE];
memset(buf, 96, BUFF_SIZE);
struct timeval start, end;
gettimeofday(&start, NULL);
DemoTime demo_time;
while (ros::ok() && id < FRAMES_NUM) {
demo_time.set_id(id);
// buf is not NUL-terminated, so pass its length explicitly.
demo_time.set_data(buf, BUFF_SIZE);
AdapterManager::PublishDemoTime(demo_time);
ros::spinOnce();
// rate.sleep();
id++;
}
delete [] buf;
gettimeofday(&end, NULL);
// Elapsed time in milliseconds, keeping the fractional part.
double time = 1000.0 * (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1000.0;
std::cout << "publish end: time = " << time << " ms" << std::endl;
}
std::string Demo::Name() const {return "demo"; }
Status Demo::Init() {
std::cout << FLAGS_adapter_config_path << std::endl;
AdapterManager::Init(FLAGS_adapter_config_path);
return Status::OK();
}
Status Demo::Start() {
std::thread test(&Demo::TestPublish, this);
test.detach();
return Status::OK();
}
void Demo::Stop() {
timer_.stop();
}
}
}
// demo.h
#ifndef MODULES_DEMO_DEMO_H_
#define MODULES_DEMO_DEMO_H_
#include "modules/common/apollo_app.h"
#include "modules/common/macro.h"
#include "modules/demo/proto/demotime.pb.h"
#include "gflags/gflags.h"
#include "modules/common/adapters/adapter_gflags.h"
#include "modules/common/adapters/adapter_manager.h"
#include "ros/include/ros/ros.h"
#include "modules/common/monitor/monitor.h"
namespace apollo {
namespace demo {
class Demo :public apollo::common::ApolloApp {
public:
Demo():monitor_(apollo::common::monitor::MonitorMessageItem::CONTROL){}
std::string Name() const override;
apollo::common::Status Init() override;
apollo::common::Status Start() override;
void Stop() override;
virtual ~Demo() = default;
private:
void PublishDemoTime();
void TestPublish();
void OnTimer(const ros::TimerEvent &event);
ros::Timer timer_;
apollo::common::monitor::Monitor monitor_;
};
}
}
#endif
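The publish loop sends a fixed-size buffer filled by memset. Such a buffer has no terminating NUL, so any conversion to std::string needs an explicit length. A minimal sketch of two ways to do that follows; MakePayload and FromBuffer are hypothetical helpers for illustration, not Apollo or protobuf APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: build a payload of exactly `size` bytes, all `fill`.
std::string MakePayload(std::size_t size, char fill) {
  return std::string(size, fill);
}

// Hypothetical helper: copy from a raw buffer that is not NUL-terminated
// by passing the length instead of relying on a terminator.
std::string FromBuffer(const char *buf, std::size_t size) {
  assert(buf != nullptr);
  return std::string(buf, size);
}
```

With either helper the payload size is exactly what was requested, which keeps the per-frame size in the result tables meaningful.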
subscriber demo
#include <iostream>
#include <thread>
#include <chrono>
#include <ctime>
#include <signal.h>
#include <sys/time.h>  // gettimeofday
#include "modules/demo/proto/demotime.pb.h"
#include "gflags/gflags.h"
#include "modules/common/adapters/adapter_gflags.h"
#include "modules/common/adapters/adapter_manager.h"
#include "ros/include/ros/ros.h"
DEFINE_string(node_name, "DemoTime", "The demo module name in proto");
#define MAX_ID (1024 -1)
const std::string name = "testdemo";
using apollo::common::adapter::AdapterManager;
using apollo::demo::DemoTime;
struct timeval start, end;
void TestReceive(const apollo::demo::DemoTime &message) {
static uint32_t id = 0;
if ((message.id() -id) > 1 && id != 0) {
std::cout << "..............subscriber lose " << id << "..........." << std::endl;
}
id = message.id();
if (MAX_ID == id) {
gettimeofday(&end, NULL);
// Elapsed time in milliseconds, keeping the fractional part.
double time = 1000.0 * (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1000.0;
std::cout << name << " end: time = " << time << " ms" << std::endl;
}
//std::cout << message.id() << std::endl;
}
int main(int argc, char **argv) {
google::InitGoogleLogging(argv[0]);
google::ParseCommandLineFlags(&argc, &argv, true);
ros::init(argc, argv, name);
AdapterManager::Init(FLAGS_adapter_config_path);
gettimeofday(&start, NULL);
AdapterManager::SetDemoTimeCallback(&TestReceive);
ros::spin();
return 0;
}
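Both demos derive elapsed milliseconds from two gettimeofday readings. A minimal standalone sketch of that arithmetic, keeping the sub-millisecond part, might look like this. ElapsedMs is a hypothetical helper name, not part of Apollo or ROS.

```cpp
#include <sys/time.h>
#include <cassert>

// Hypothetical helper: milliseconds between two gettimeofday() readings.
// The microsecond difference may be negative; the sum is still correct.
double ElapsedMs(const timeval &start, const timeval &end) {
  assert(end.tv_sec >= start.tv_sec);  // readings are assumed to be in order
  const double sec = static_cast<double>(end.tv_sec - start.tv_sec);
  const double usec = static_cast<double>(end.tv_usec - start.tv_usec);
  return sec * 1000.0 + usec / 1000.0;
}
```

Note that this measures the wall-clock duration of the whole loop, which is a different quantity from the time a single message spends in transit.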
Shared-memory communication is meant to make message transmission more efficient, that is, to reduce the time between the publisher sending a message and the subscriber receiving it.
In the test case you gave, the publisher measures the time it takes to send 1024 frames, and the subscriber measures the time it takes to consume 1024 frames. Neither number measures how long a single message takes to travel from publisher to subscriber.
To compare shared-memory and socket communication, we recommend using the test case that apollo-platform provides, which minimizes the influence of other factors on the results.
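The difference between the two measurements can be sketched with timestamps alone. The following is a minimal illustration, not the apollo-platform test case; the Sample struct and both functions are hypothetical. Total run time spans the whole publish loop, so it can hide differences in per-message transport latency.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record: when a frame was sent and when it was received (ms).
struct Sample {
  std::uint64_t send_ms;
  std::uint64_t recv_ms;
};

// Wall-clock time from the first send to the last receive.
std::uint64_t TotalRunMs(const std::vector<Sample> &samples) {
  assert(!samples.empty());
  return samples.back().recv_ms - samples.front().send_ms;
}

// Mean per-message latency: average of (receive time - send time).
double MeanLatencyMs(const std::vector<Sample> &samples) {
  assert(!samples.empty());
  double sum = 0.0;
  for (const Sample &s : samples) {
    sum += static_cast<double>(s.recv_ms - s.send_ms);
  }
  return sum / static_cast<double>(samples.size());
}
```

With made-up samples {0,5}, {100,105}, {200,205} and {0,20}, {100,120}, {185,205}, both runs have the same total of 205 ms, while the mean latency is 5 ms in the first and 20 ms in the second.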
For your reference, the source code for the performance test follows. It is based on the official apollo-platform communication examples. Please contact us if you have any questions.
apollo-platform official communication examples location:
apollo-platform/ros/ros_tutorials/roscpp_tutorials
msg type (new file)
// apollo-platform/ros/ros_tutorials/roscpp_tutorials/msg/perf.msg
uint32 id
string data
uint64 time
talker (modified)
// apollo-platform/ros/ros_tutorials/roscpp_tutorials/talker/talker.cpp
#include "ros/ros.h"
#include <sys/time.h>
#include "std_msgs/String.h"
#include "roscpp_tutorials/perf.h"
#include <cstring>   // memset
#include <iostream>  // std::cout
#include <sstream>
#define BUFF_SIZE (1024 * 1024 * 10)
#define FRAMES_NUM 1024
int main(int argc, char **argv)
{
ros::init(argc, argv, "talker");
ros::NodeHandle n;
ros::Publisher chatter_pub = n.advertise<roscpp_tutorials::perf>("chatter", 1000);
// ros::Rate loop_rate(10);
roscpp_tutorials::perf iperf;
char *buf = new char[BUFF_SIZE];
memset(buf, 96, BUFF_SIZE);
struct timeval start;
int count = 0;
while (ros::ok() && count < FRAMES_NUM)
{
iperf.id = count;
// buf is not NUL-terminated, so pass its length explicitly.
iperf.data.assign(buf, BUFF_SIZE);
gettimeofday(&start, NULL);
iperf.time = start.tv_sec * 1000 + start.tv_usec / 1000;
chatter_pub.publish(iperf);
// ros::spinOnce();
// loop_rate.sleep();
++count;
}
delete [] buf;
std::cout << "publish end!" << std::endl;
return 0;
}
listener (modified)
// apollo-platform/ros/ros_tutorials/roscpp_tutorials/listener/listener.cpp
#include "ros/ros.h"
#include <sys/time.h>  // gettimeofday
#include <iostream>    // std::cout
#include "std_msgs/String.h"
#include "roscpp_tutorials/perf.h"
#define MAX_ID (1024 -1)
struct timeval end;
int64_t msg_count = 0;
uint64_t avg_time = 0;
void chatterCallback(const roscpp_tutorials::perf message)
{
++msg_count;
gettimeofday(&end, NULL);
if (avg_time == 0) {
avg_time = (end.tv_sec * 1000 + end.tv_usec / 1000) - message.time;
} else {
avg_time = (((end.tv_sec * 1000 + end.tv_usec / 1000) - message.time) + avg_time * (msg_count - 1)) / msg_count;
}
static uint32_t id = 0;
if ((message.id - id) > 1 && id != 0) {
std::cout << "..............subscriber lose " << id << "..........." << std::endl;
}
id = message.id;
if (id > 1000) {
std::cout << " transport avg time: " << avg_time << std::endl;
}
}
int main(int argc, char **argv)
{
ros::init(argc, argv, "listener");
ros::NodeHandle n;
ros::Subscriber sub = n.subscribe("chatter", 1000, chatterCallback);
ros::spin();
return 0;
}
After recompiling apollo-platform (bash build.sh build), you can launch roscore, then launch the subscriber demo and the publisher demo. Thank you.
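The listener keeps a running mean of the per-message latency in integer arithmetic. A minimal standalone sketch of that kind of update rule is shown here, using a hypothetical RunningMean struct. Integer division truncates at every step, so the result can sit slightly below the true mean.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical standalone running mean over latency samples (ms).
struct RunningMean {
  std::uint64_t count = 0;
  std::uint64_t mean = 0;

  // Fold one sample in: mean_n = (x_n + mean_{n-1} * (n - 1)) / n.
  void Add(std::uint64_t sample_ms) {
    ++count;
    mean = (sample_ms + mean * (count - 1)) / count;
  }

  // The mean is only meaningful after at least one sample.
  std::uint64_t Mean() const {
    assert(count > 0);
    return mean;
  }
};
```

Feeding 10, 20 and 30 gives a mean of 20. Feeding 1 and 2 gives 1, which shows the truncation.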
I used the examples that apollo-platform provides to test performance. The shared-memory run time is longer than the socket run time, but the socket transport tends to lose a few frames.
Could you tell me how shared memory performance differs from socket performance?
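Frame loss can be counted rather than only logged. The sketch below counts missing frames from the sequence ids a subscriber sees. LossCounter is a hypothetical helper, and it assumes ids start at 0 and increase by 1 per published frame, as in the demos in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: counts frames missing from a stream of increasing ids.
class LossCounter {
 public:
  // Record a received id; ids are assumed to arrive in increasing order.
  void OnFrame(std::uint32_t id) {
    if (has_last_) {
      assert(id > last_id_);
      lost_ += id - last_id_ - 1;  // ids skipped since the previous frame
    } else {
      lost_ += id;  // frames 0..id-1 were never seen
      has_last_ = true;
    }
    last_id_ = id;
  }

  std::uint64_t lost() const { return lost_; }

 private:
  bool has_last_ = false;
  std::uint32_t last_id_ = 0;
  std::uint64_t lost_ = 0;
};
```

Calling OnFrame from the subscriber callback and printing lost() at the end gives one loss figure per run, which is easier to compare across transports than scattered log lines.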
Using the example above, my test results are as follows:
Size of each message | Number of frames | Transport time (shared memory) | Transport time (socket) |
---|---|---|---|
30M | 1024 | 17ms | 1418ms |
20M | 1024 | 11ms | 262ms |
15M | 1024 | 8ms | 99ms |
10M | 1024 | 6ms | 17ms |
5M | 1024 | 3ms | 6ms |
By the way, my test environment is:
1. 4.2.0-27-generic #32~14.04.1-Ubuntu
2. 16 Intel(R) Xeon(R) CPU E5-2643 0 @ 3.30GHz
3. MemTotal: 32GB
My test environment is:
- Ubuntu 14.04
- Intel® Core™ i5-6200U CPU @ 2.30GHz × 4
- MemTotal: 8GB
I ran the demo in Docker.
Have you tested on a PC with lower specifications?
I built and ran the demo outside of Docker and got the following transport times.
Data size | Shared memory | Socket |
---|---|---|
5M | 23ms | 26ms |
10M | 47ms | 90ms |
15M | 71ms | 1461ms |
20M | 95ms | 4570ms |
But when I ran the test in Docker, I got the following transport times. I switched modes by editing $ROS_ETC_DIR/transport_mode.yaml.
Data size | Shared memory | Socket |
---|---|---|
5M | 21ms | 23ms |
10M | 45ms | 46ms |
15M | 69ms | 69ms |
20M | 92ms | 92ms |
Could you package the complete test program and send it to us? My email address is "bjtulynn@163.com". If convenient, please also describe the detailed test process. Thank you!
Closing for now. If you still have questions, you can reopen this issue at any time.