yokawasa/fluent-plugin-azure-loganalytics

Fails to Push to Azure After Period of Time

Closed this issue · 4 comments

Hello, I seem to be running into an issue which I believe may be related to the Azure Log Analytics plugin I am using. After a period of time, I start receiving the following error in my Fluentd logs:

{"time":"FT05:21:05+00:00","level":"fatal","message":"Exception occured in posting to DataCollector API: ","worker_id":0}

At that point, logs are no longer pushed to Azure Log Analytics. Have you seen this issue before, and do you know of a fix? I appreciate any help you can provide. For now, I have to keep restarting fluentd to get it pushing logs to Azure again.

fluentd 0.14.21
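
A side note on the log line above: its `time` value begins with a literal `FT`, which is consistent with the `time_format FT%T%:z` setting in the configuration posted later in this thread. The sketch below, assuming Fluentd hands `time_format` to strftime-style formatting, shows how Ruby treats that string. The date is hypothetical (only the time of day is taken from the log line), and the idea that `%FT%T%:z` was the intended format is an assumption.

```ruby
# Hypothetical date; only the time of day (05:21:05 UTC) comes from the log line.
t = Time.utc(2017, 10, 1, 5, 21, 5)

# Format exactly as written in the config: "F" and "T" carry no "%" prefix,
# so strftime copies them through as literal characters.
as_configured = t.strftime("FT%T%:z")

# Assumed intent: "%F" (date), a literal "T", then "%T" (time) and "%:z" (offset).
iso_like = t.strftime("%FT%T%:z")

puts as_configured  # FT05:21:05+00:00
puts iso_like       # 2017-10-01T05:21:05+00:00
```

This affects only how Fluentd's own log timestamps are printed; nothing in the thread ties it to the posting failure.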

@kkniffin Sorry for the delay. I haven't seen this issue before. Could you share a bit more about your environment: OS, Ruby version, and Fluentd config?

Sure, here you go. Oddly, it posts to Azure Log Analytics for a period of time, but then it crashes with that error and never seems to recover. This source and config post a significant amount of data to Azure Log Analytics, whereas I have another one that does not, and that one seems to work fine. If the plugin fails for whatever reason, is there any way to have it attempt to reconnect and recover?
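
For readers wondering what "attempt to reconnect and recover" could look like in general, here is a minimal Ruby sketch of retry with exponential backoff. It is not taken from the plugin or from Fluentd: `with_retries`, its parameters, and the simulated failure are all hypothetical, and the thread does not establish whether the plugin or Fluentd's buffer layer retries on this failing path.

```ruby
# Generic retry-with-backoff helper (hypothetical; not the plugin's code).
# Re-runs the block until it succeeds or max_attempts is reached, waiting
# base_delay * 2^(attempt - 1) seconds between tries, capped at max_delay.
def with_retries(max_attempts: 5, base_delay: 1.0, max_delay: 30.0,
                 sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    # Give up and re-raise the original error once the budget is spent.
    raise if attempt >= max_attempts
    delay = [base_delay * (2**(attempt - 1)), max_delay].min
    sleeper.call(delay)
    retry
  end
end

# Usage: a simulated post that fails twice, then succeeds.
# The sleeper is swapped for a recorder so the example runs instantly.
calls = 0
delays = []
result = with_retries(sleeper: ->(seconds) { delays << seconds }) do |attempt|
  calls += 1
  raise IOError, "simulated post failure" if attempt < 3
  "posted on attempt #{attempt}"
end

puts result          # posted on attempt 3
puts delays.inspect  # [1.0, 2.0]
```

Capping the delay keeps a long outage from pushing the wait time up without bound, and re-raising after the last attempt leaves the final decision to the caller.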

OS: Alpine release 3.5.2 in a Docker container
Ruby Version: ruby 2.3.5p376 (2017-09-14 revision 59905) [x86_64-linux-musl]
Fluentd Version: fluentd 0.14.21
Fluentd Config:

<system>
  log_level error
  <log>
    # text or json. default is text
    format json
    # Change format of log time. This affects both text and json.
    time_format FT%T%:z
  </log>
</system>

####################
###### SOURCES #####
####################

<source>
	@type syslog
	port 515
	bind 0.0.0.0
	tag raw.syslog
	include_source_host true
	message_format rfc3164
</source>

#####################
##### MANIPULATE ####
#####################

#### Add Server Received Time to Records
<filter raw.syslog.**>
        @type record_modifier
        <record>
                SyslogReceived ${Time.at(time).to_s}
        </record>
</filter>

<match raw.syslog.**>
        @type rewrite_tag_filter
	<rule>
		# Classify Palo Alto Traffic Log
		key message 
		pattern ^.*,TRAFFIC,.*$ 
		tag paloalto.traffic 
	</rule>
	<rule>
		# Classify Palo Alto Threat Traffic Log
		key message 
		pattern ^.*,THREAT,.*$ 
		tag paloalto.threat 
	</rule>
	<rule>
		# Classify Palo Alto Config Log
		key message 
		pattern ^.*,CONFIG,.*$ 
		tag paloalto.config 
	</rule>
	<rule>
		# Classify Palo Alto System Log
		key message 
		pattern ^.*,SYSTEM,.*$ 
		tag paloalto.system 
	</rule>
	<rule>
		# Classify Palo Alto HIP Match Log
		key message 
		pattern ^.*,HIP-MATCH,.*$ 
		tag paloalto.hipmatch 
	</rule>	
	<rule>
		# Classify Wildfire
		key message 
		pattern ^.*,WILDFIRE,.*$ 
		tag paloalto.wildfire 
	</rule>
	<rule>
		# Classify Auth Logs
		key message 
		pattern ^.*,AUTH,.*$ 
		tag paloalto.auth 
	</rule>
	<rule>
		# Classify UserID Logs
		key message 
		pattern ^.*,USERID,.*$ 
		tag paloalto.userid 
	</rule>
	<rule>
		# Classify everything else unmatched
		key message 
		pattern .+ 
		tag paloalto.unmatched 
	</rule>
</match>

#### Add Type to Record
<filter paloalto.**>
        @type record_modifier
        <record>
                logtype ${tag_parts[0]}-${tag_parts[1]}
        </record>
</filter>

#### Parse out Palo Alto Traffic Logs
<filter paloalto.traffic>
        @type parser
        key_name message
        reserve_data true
	format /^(?<ReceiveTime>.*?),(?<SerialNumber>.*?),(?<Type>.*?),(?<SubType>.*?),(?<FutureUse1>.*?),(?<GeneratedTime>.*?),(?<SourceIP>.*?),(?<DestinationIP>.*?),(?<NATSourceIP>.*?),(?<NATDestinationIP>.*?),(?<RuleName>.*?),(?<SourceUser>.*?),(?<DestinationUser>.*?),(?<Application>.*?),(?<VirtualSystem>.*?),(?<SourceZone>.*?),(?<DestinationZone>.*?),(?<IngressInterface>.*?),(?<EgressInterface>.*?),(?<LogForwardingProfile>.*?),(?<FutureUse2>.*?),(?<SessionID>.*?),(?<RepeatCount>.*?),(?<SourcePort>.*?),(?<DestinationPort>.*?),(?<NATSourcePort>.*?),(?<NATDestinationPort>.*?),(?<Flags>.*?),(?<Protocol>.*?),(?<Action>.*?),(?<Bytes>.*?),(?<BytesSent>.*?),(?<BytesReceived>.*?),(?<Packets>.*?),(?<StartTime>.*?),(?<ElapsedTime>.*?),(?<Category>.*?),(?<FutureUse3>.*?),(?<SequenceNumber>.*?),(?<ActionFlags>.*?),(?<SourceLocation>.*?),(?<DestinationLocation>.*?),(?<FutureUse4>.*?),(?<PacketsSent>.*?),(?<PacketsReceived>.*?),(?<SessionEndReason>.*?),(?<DeviceGroupHiearchy1>.*?),(?<DeviceGroupHiearchy2>.*?),(?<DeviceGroupHiearchy3>.*?),(?<DeviceGroupHiearchy4>.*?),(?<VirtualSystemName>.*?),(?<DeviceName>.*?),(?<ActionSource>.*?),(?<SourceVMUUID>.*?),(?<DestinationVMUUID>.*?),(?<TunnelID>.*?),(?<MonitorTag>.*?),(?<ParentSessionID>.*?),(?<ParentStartTime>.*?),(?<TunnelType>.*?)$/
        suppress_parse_error_log true
#	emit_invalid_record_to_error true
	replace_invalid_sequence true
</filter>

#### Parse out Palo Alto System Logs
<filter paloalto.system>
	@type parser
	key_name message
	reserve_data yes
	format /^(?<ReceiveTime>.*?),(?<SerialNumber>.*?),(?<Type>.*?),(?<ThreatContentType>.*?),(?<FutureUse1>.*?),(?<GeneratedTime>.*?),(?<VirtualSystem>.*?),(?<EventID>.*?),(?<Object>.*?),(?<FutureUse2>.*?),(?<FutureUse3>.*?),(?<Module>.*?),(?<Severity>.*?),(?<Description>".*?"|.*?),(?<SequenceNumber>.*?),(?<ActionFlags>.*?),(?<DeviceGroupHiearchy1>.*?),(?<DeviceGroupHiearchy2>.*?),(?<DeviceGroupHiearchy3>.*?),(?<DeviceGroupHiearchy4>.*?),(?<VirtualSystemName>.*?),(?<DeviceName>.*?)$/
        suppress_parse_error_log true
#	 emit_invalid_record_to_error true
        replace_invalid_sequence true
</filter>

#### Parse out Palo Alto USER ID Logs
<filter paloalto.userid>
        @type parser
        key_name message
        reserve_data yes
	format /^(?<ReceiveTime>.*?),(?<SerialNumber>.*?),(?<SequenceNumber>.*?),(?<ActionFlags>.*?),(?<Type>.*?),(?<ThreatContentType>.*?),(?<FutureUse1>.*?),(?<GeneratedTime>.*?),(?<DeviceGroupHiearchy1>.*?),(?<DeviceGroupHiearchy2>.*?),(?<DeviceGroupHiearchy3>.*?),(?<DeviceGroupHiearchy4>.*?),(?<VirtualSystemName>.*?),(?<DeviceName>.*?),(?<VirtualSystemID>.*?),(?<VirtualSystem>.*?),(?<SourceIP>.*?),(?<User>.*?),(?<DataSourceName>.*?),(?<EventID>.*?),(?<RepeatCount>.*?),(?<TimeOutThreshold>.*?),(?<SourcePort>.*?),(?<DestinationPort>.*?),(?<DataSource>.*?),(?<DataSourceType>.*?),(?<FutureUse2>.*?),(?<FutureUse3>.*?),(?<FactorType>.*?),(?<FactorCompletionTime>.*?),(?<FactorNumber>.*?)$/
        suppress_parse_error_log true
#        emit_invalid_record_to_error true
        replace_invalid_sequence true
</filter>

#### Parse out Palo Alto Threat Logs
<filter paloalto.threat>
        @type parser
        key_name message
        reserve_data yes
	format /^(?<ReceiveTime>.*?),(?<SerialNumber>.*?),(?<Type>.*?),(?<ThreatContentType>.*?),(?<FutureUse1>.*?),(?<GeneratedTime>.*?),(?<SourceIP>.*?),(?<DestinationIP>.*?),(?<NATSourceIP>.*?),(?<NATDestIP>.*?),(?<RuleName>.*?),(?<SourceUser>.*?),(?<DestinationUser>.*?),(?<Application>.*?),(?<VirtualSystem>.*?),(?<SourceZone>.*?),(?<DestinationZone>.*?),(?<InboundInterface>.*?),(?<OutboundInterface>.*?),(?<LogForwardingProfile>.*?),(?<FutureUse2>.*?),(?<SessionID>.*?),(?<RepeatCount>.*?),(?<SourcePort>.*?),(?<DestPort>.*?),(?<NATSourcePort>.*?),(?<NATDestPort>.*?),(?<Flags>.*?),(?<Protocol>.*?),(?<Action>.*?),(?<Misc>".*?"|.*?),(?<ThreatID>.*?),(?<Category>.*?),(?<Severity>.*?),(?<Direction>.*?),(?<SequenceNumber>.*?),(?<ActionFlags>.*?),(?<SourceLocation>.*?),(?<DestLocation>.*?),(?<FutureUse3>.*?)(?<ContentType>.*?),(?<PCAPID>.*?),(?<FileDigest>.*?),(?<Cloud>.*?),(?<URL>.*?),(?<Index>.*?),(?<UserAgent>.*?),(?<FileType>.*?),(?<XForwardingFor>.*?),(?<Referer>.*?),(?<Sender>.*?),(?<Subject>.*?),(?<Recipient>.*?),(?<ReportID>.*?),(?<DeviceGroupHiearchy1>.*?),(?<DeviceGroupHiearchy2>.*?),(?<DeviceGroupHiearchy3>.*?),(?<DeviceGroupHiearchy4>.*?),(?<VirtualSysName>.*?),(?<DeviceName>.*?),(?<FutureUse4>.*?),(?<SourceVMUUID>.*?),(?<DestVMUUID>.*?),(?<HTTPMethod>.*?),(?<TunnelID>.*?),(?<MonitorTag>.*?),(?<ParentSessionID>.*?),(?<ParentStartTime>.*?),(?<TunnelType>.*?),(?<ThreatCategory>.*?),(?<ContentVersion>.*?),(?<FutureUse5>.*?)$/
        suppress_parse_error_log true
#	 emit_invalid_record_to_error true
        replace_invalid_sequence true
</filter>

#### Parse out Palo Alto Config Logs
<filter paloalto.config>
        @type parser
        key_name message
        reserve_data yes
	format /^(?<ReceiveTime>.*?),(?<SerialNumber>.*?),(?<Type>.*?),(?<SubType>.*?),(?<FutureUse1>.*?),(?<GeneratedTime>.*?),(?<Host>.*?),(?<VirtualSystem>.*?),(?<Command>.*?),(?<Admin>.*?),(?<Client>.*?),(?<Result>.*?),(?<ConfigurationPath>.*?),(?<SequenceNumber>.*?),(?<ActionFlags>.*?),(?<BeforeChangeDetail>.*?),?(?<AfterChangeDetail>.*?),?(?<DeviceGroupHiearchy1>.*?),(?<DeviceGroupHiearchy2>.*?),(?<DeviceGroupHiearchy3>.*?),(?<DeviceGroupHiearchy4>.*?),(?<VirtualSystemName>.*?),(?<DeviceName>.*?)$/
        suppress_parse_error_log true
#        emit_invalid_record_to_error true
        replace_invalid_sequence true
</filter>

#####################
###### OUTPUTS ######
#####################

<label @ERROR>
        <match **>
                @type stdout
        </match>
</label>

<match paloalto.threat>
        @type copy
        <store>
                @type azure-loganalytics
                customer_id "#{ENV['AZURE_CUSTOMERID']}"
                shared_key "#{ENV['AZURE_SHAREDKEY']}"
                log_type PaloAltoThreat
                add_time_field true
                time_field_name LogSentTime
                time_format %s
                localtime true
                add_tag_field true
                tag_field_name paloalto_threat
        </store>
        #<store>
        #        @type stdout
        #</store>
</match>

<match paloalto.config>
	@type azure-loganalytics
        customer_id "#{ENV['AZURE_CUSTOMERID']}"
        shared_key "#{ENV['AZURE_SHAREDKEY']}"
        log_type PaloAltoConfig
        add_time_field true
        time_field_name LogSentTime
        time_format %s
        localtime true
        add_tag_field true
        tag_field_name paloalto_config
</match>

<match paloalto.system>
        @type copy
        <store>
                @type azure-loganalytics
                customer_id "#{ENV['AZURE_CUSTOMERID']}"
                shared_key "#{ENV['AZURE_SHAREDKEY']}"
                log_type PaloAltoSystem
                add_time_field true
                time_field_name LogSentTime
                time_format %s
                localtime true
                add_tag_field true
                tag_field_name paloalto_system
        </store>
        #<store>
        #        @type stdout
        #</store>
</match>

<match paloalto.traffic>
	@type azure-loganalytics
        customer_id "#{ENV['AZURE_CUSTOMERID']}"
        shared_key "#{ENV['AZURE_SHAREDKEY']}"
        log_type PaloAltoTraffic
        add_time_field true
        time_field_name LogSentTime
        time_format %s
        localtime true
        add_tag_field true
        tag_field_name paloalto_traffic
</match>

<match paloalto.userid>
	@type azure-loganalytics
        customer_id "#{ENV['AZURE_CUSTOMERID']}"
        shared_key "#{ENV['AZURE_SHAREDKEY']}"
        log_type PaloAltoUserID
        add_time_field true
        time_field_name LogSentTime
        time_format %s
        localtime true
        add_tag_field true
        tag_field_name paloalto_userid
</match>

<match paloalto.unmatched>
        @type copy
        <store>
                @type azure-loganalytics
                customer_id "#{ENV['AZURE_CUSTOMERID']}"
                shared_key "#{ENV['AZURE_SHAREDKEY']}"
                log_type PaloAltoUnmatched
                add_time_field true
                time_field_name LogSentTime
                time_format %s
                localtime true
                add_tag_field true
                tag_field_name paloalto_unmatched
                log_level debug
        </store>
        #<store>
        #        @type stdout
        #</store>
</match>
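
To make the routing in the config above easier to follow, here is a small Ruby sketch of how the `rewrite_tag_filter` rules and the `logtype` filter fit together. It assumes the rules are tried in order and the first match decides the tag, which is how rewrite_tag_filter is documented to behave. The `classify` and `logtype` helpers and the sample messages are hypothetical and are not part of Fluentd.

```ruby
# Ordered (pattern, tag) pairs mirroring the <rule> blocks in the config.
RULES = [
  [/^.*,TRAFFIC,.*$/,   "paloalto.traffic"],
  [/^.*,THREAT,.*$/,    "paloalto.threat"],
  [/^.*,CONFIG,.*$/,    "paloalto.config"],
  [/^.*,SYSTEM,.*$/,    "paloalto.system"],
  [/^.*,HIP-MATCH,.*$/, "paloalto.hipmatch"],
  [/^.*,WILDFIRE,.*$/,  "paloalto.wildfire"],
  [/^.*,AUTH,.*$/,      "paloalto.auth"],
  [/^.*,USERID,.*$/,    "paloalto.userid"],
  [/.+/,                "paloalto.unmatched"], # catch-all, must stay last
].freeze

# Return the tag of the first rule whose pattern matches, or nil if none does.
def classify(message)
  RULES.each do |pattern, tag|
    return tag if pattern =~ message
  end
  nil
end

# Mirrors `logtype ${tag_parts[0]}-${tag_parts[1]}` from the record_modifier filter.
def logtype(tag)
  parts = tag.split(".")
  "#{parts[0]}-#{parts[1]}"
end

# Hypothetical, heavily shortened sample messages.
samples = [
  "1,2017/10/01 05:21:05,001,TRAFFIC,end,rest",
  "1,2017/10/01 05:21:05,001,SYSTEM,general,rest",
  "some line without a known type",
]

samples.each do |message|
  tag = classify(message)
  puts "#{tag} -> #{logtype(tag)}"
end
```

Because the first match wins, a message containing both `,TRAFFIC,` and `,THREAT,` would be tagged `paloalto.traffic` under this ordering, and only an empty message falls through every rule.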

@kkniffin I'd like to restart this if you are still seeing the same error. Would you please share the current status of the issue?

I'm closing this, but please feel free to reopen the issue if needed.