morganstanley/binlog

[INVALID] a crash from binlog

Closed this issue · 8 comments

image

Hi,
I got this crash when I try to use writer & session with TSC.
My implementation of binlog as below

class BinLoggerBinary
{
public:
	BinLoggerBinary(std::ostream& binary) :_binary(binary),
		_severity(binlog::Severity::trace),
		_writer(_session),
		_filter([this](const binlog::EventSource& source) {
		return source.severity >= this->_severity;
			})
	{
		// Create a ClockSync, that connects the TSC value to the wall clock time.
		const auto tscValue = __rdtsc();
		const binlog::ClockSync systemClockSync = binlog::systemClockSync();
		const binlog::ClockSync tscSync{
			tscValue,
			tscFrequency(),
			systemClockSync.nsSinceEpoch,
			systemClockSync.tzOffset,
			systemClockSync.tzName
		};

		// Add the created ClockSync to the metadata of the session.
		// Events consumed after this are assumed to be timestamped
		// with the clock defined by the ClockSync (this case, TSC)
		_session.setClockSync(tscSync);
	}
	std::uint64_t tscFrequency()
	{
#if defined(__linux__)
		long freq;

		// If the kernel is exporting the tsc frequency use that. There are issues
		// where cpuinfo_max_freq cannot be relied on because the BIOS may be
		// exporintg an invalid p-state (on x86) or p-states may be used to put the
		// processor in a new mode (turbo mode). Essentially, those frequencies
		// cannot always be relied upon. The same reasons apply to /proc/cpuinfo as
		// well.
		if (ReadFromFile("/sys/devices/system/cpu/cpu0/tsc_freq_khz", &freq)
			// If CPU scaling is in effect, we want to use the *maximum* frequency,
			// not whatever CPU speed some random processor happens to be using now.
			|| ReadFromFile("/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq",
				&freq)) {
			// The value is in kHz (as the file name suggests).  For example, on a
			// 2GHz warpstation, the file contains the value "2000000".
			return freq * 1000.0;
		}
		else
		{
			throw std::runtime_error("Not found tsc freq in this OS,make sure cpuinfo_max_freq exist.");
			return 0;
		}
#else
		// In NT, read MHz from the registry. If we fail to do so or we're in win9x
		// then make a crude estimate.
		DWORD data, data_size = sizeof(data);
		if (IsWindowsXPOrGreater() &&
			SUCCEEDED(
				SHGetValueA(HKEY_LOCAL_MACHINE, "HARDWARE\\DESCRIPTION\\System\\CentralProcessor\\0", "~MHz", nullptr, &data, &data_size)))
			return ((uint64_t)data *
				(uint64_t)(1000 * 1000));  // was mhz
		else
		{
			throw std::runtime_error("Not found tsc freq in this OS,make sure HARDWARE\\DESCRIPTION\\System\\CentralProcessor\\0 exist.");
			return 0;
		}
#endif
	}
	void consume()
	{
		_session.consume(_binary);
		_binary.flush();
	}

	binlog::SessionWriter& GetWriter()
	{
		return _writer;
	}

	void set_log_level(binlog::Severity level)
	{
		_severity = level;
	}
private:		
	std::ostream& _binary;
	binlog::Session _session;
	binlog::Severity _severity;
	binlog::SessionWriter _writer;		
	binlog::EventFilter _filter;		
};

image

excuse me, can you see this png of my crashed call-stack?

Hi, How do you do the actual logging? In the callstack, the QueueWriter (contained by the SessionWriter) seems to be invalid? (_queue=???, _writePos=???, ...) it seems to be the reason why the memcpy fails. Try to figure out when (and why) the queue writer dies.

Hi, How do you do the actual logging? In the callstack, the QueueWriter (contained by the SessionWriter) seems to be invalid? (_queue=???, _writePos=???, ...) it seems to be the reason why the memcpy fails. Try to figure out when (and why) the queue writer dies.

Attached is my binlog wrapper as a lib input to my project.

In my project, I used the following code to logging in a critical path of a network function:
TSC_ERROR_W(logger->GetWriter(), "isSendingIdle TRUE,thread:{},ssid:{}", threadId, _sessionID);

In main() function, I will create a binlog as below:
Util::LogManager logManager;
logger = logManager.CreateLogger("clogger", "QuoteSvr");

If I logging in an uncritical function, then program is ok. no crash.

LogManager.zip

Is your application single threaded? If not, how do you make sure only a single thread accesses logger->GetWriter() at a time? (Btw, you don't need to log threadId, the writer can keep track of that in a cheaper way: http://binlog.org/UserGuide.html#named-writers)

Is your application single threaded? If not, how do you make sure only a single thread accesses logger->GetWriter() at a time? (Btw, you don't need to log threadId, the writer can keep track of that in a cheaper way: http://binlog.org/UserGuide.html#named-writers)

Thank you, My app is multi-threaded, Should I add a lock_guard() in GetWriter() to make sure thread-safe? or should I add multiple writers in my binlog wrappers?

Is your application single threaded? If not, how do you make sure only a single thread accesses logger->GetWriter() at a time? (Btw, you don't need to log threadId, the writer can keep track of that in a cheaper way: http://binlog.org/UserGuide.html#named-writers)

Thank you, My app is multi-threaded, Should I add a lock_guard() in GetWriter() to make sure thread-safe? or should I add multiple writers in my binlog wrappers?
Do you have an example of creating multiple writers in multithreaded program?

You'll need multiple writers, one per thread. Adding a lock_guard would essentially serialize your worker threads, greatly reducing performance.