
A network address analyzer for Elasticsearch

Primary language: Java. License: Apache-2.0.

Network Address and Path Analysis for Elasticsearch


A set of network- and path-related analyzers for better indexing and querying of network-related data in Elasticsearch.

Latest Elasticsearch version support: 6.2.2


  • network_address analyzer - outputs the parts of a network address (IPv4/MAC). For example, it'd split 127.0.0.1 to 127, 0, 0 and 1.
  • partial_network_address analyzer - acts like the network_address analyzer, but will handle anything that looks like a part of a network address.
    For example, it'd split 127.0 to 127 and 0; it'll also output 127 for 127 as an input. This analyzer is more useful for query analysis than for indexing actual documents.
  • strict_partial_network_address analyzer - acts like the partial_network_address analyzer, but will only handle input that has at least two parts of a network address. For example, it'd split 127.0 to 127 and 0, but it will output nothing for 127 as an input.
  • full_network_address analyzer - used to search for all the network addresses inside a given document.
  • path_keywords analyzer - splits path-like strings.
  • incremental_capture_group token filter - a modified version of the pattern_capture token filter that also increments the tokens' position attribute and sets token offsets.
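
The difference between the three address-splitting analyzers can be illustrated with a rough approximation in plain Python. This is only a sketch of the behaviour described in the list above for a single, isolated address string; it is not the plugin's implementation, the function names are made up for illustration, and it assumes parts are separated by "." or ":" only.

```python
import re

# Assumption: address parts are separated by '.' (IPv4) or ':' (MAC).
_SEPARATOR = re.compile(r"[.:]")


def split_parts(text):
    """Split an address-like string into its non-empty parts."""
    return [part for part in _SEPARATOR.split(text.strip()) if part]


def partial_parts(text):
    """Like partial_network_address: accept any number of parts, even one."""
    return split_parts(text)


def strict_partial_parts(text):
    """Like strict_partial_network_address: require at least two parts."""
    parts = split_parts(text)
    return parts if len(parts) >= 2 else []


print(split_parts("127.0.0.1"))
print(partial_parts("127"))
print(strict_partial_parts("127"))
print(strict_partial_parts("127.0"))
```

The real analyzers also locate addresses inside longer text and assign token positions, which this sketch does not attempt.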


  • Download the release zip matching your ES version from the releases page
  • Run bin/elasticsearch-plugin install file://<path to zip>
  • Restart Elasticsearch


After installing, you may tell ES to use the analyzers listed above in mappings and queries. To test the analyzers, the _analyze endpoint can be used:

curl -XPOST 'http://<es_host>:9200/_analyze?pretty' -H 'Content-Type: application/json' -d '{
    "analyzer": "network_address",
    "text": "AA:BB:CC:DD:EE:FF"
}'

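The same _analyze request can be prepared from Python. The sketch below uses only the standard library and builds the request without sending it; build_analyze_request is a hypothetical helper name, and the host and port 9200 are assumptions to adjust for your cluster.

```python
import json
from urllib.request import Request


def build_analyze_request(es_host, analyzer, text):
    """Build (but do not send) a POST request for the _analyze endpoint."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return Request(
        url="http://%s:9200/_analyze?pretty" % es_host,  # assumed default port
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_analyze_request("localhost", "network_address", "AA:BB:CC:DD:EE:FF")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing req to urllib.request.urlopen would send it to a running cluster and return the token list as JSON.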



To test the plugin, the _analyze endpoint can be used:

curl -XPOST 'http://localhost:9200/_analyze?pretty=1&analyzer=network_address' -d '
computer_name = "ABC"
mac_address = AA:BB:CC:DD:EE:FF
SMBInfo for
	os = Windows Server (R) 2008 Enterprise 6002 Service Pack 2
	lanman = Windows Server (R) 2008 Enterprise 6.0
	domain = WORKGROUP
	server_time = Tue Apr 24 10:35:57 2012
smb_port = 445
'



The next step is to update the mapping of your index to use the analyzer:

mapping = {
	'[your_document_type]': {
		'properties': {
			'data': {
				"type": "multi_field",
				"fields": {
					# Normal string analysis.
					"data": {
						'type': 'string',
						'term_vector': 'with_positions_offsets',
					},
					# Network address custom analysis.
					"network_address": {
						"type": "string",
						"analyzer": "network_address",
					},
				},
			},
		},
	},
}

# `es` is your Elasticsearch client instance.
es.put_mapping('[your_index]', '[your_document_type]', mapping)
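
The mapping above uses the older multi_field and string types. On Elasticsearch 5.x and later, string was replaced by text and keyword, and multi_field by the fields parameter. The following is a minimal sketch of an equivalent mapping in that style; build_mapping is a hypothetical helper, and the exact way you submit the mapping depends on your client library.

```python
import json


def build_mapping(doc_type):
    """Return a mapping with a network_address sub-field on `data`."""
    return {
        doc_type: {
            "properties": {
                "data": {
                    # Normal text analysis on the main field.
                    "type": "text",
                    "term_vector": "with_positions_offsets",
                    "fields": {
                        # Network address custom analysis, queried as
                        # data.network_address.
                        "network_address": {
                            "type": "text",
                            "analyzer": "network_address",
                        }
                    },
                }
            }
        }
    }


mapping = build_mapping("your_document_type")
print(json.dumps(mapping, indent=2))
```

The sub-field keeps the name data.network_address, so queries against that field are unaffected by the change of mapping style.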

Then, documents can be queried using the new analyzed field:

{
  "query": {
    "match": {
      "data.network_address": {
        "query": "AA:BB:CC:DD:EE:FF",
        # 'phrase' will only match the exact address (not just parts of it)
        "type": "phrase"
      }
    }
  }
}

The partial_network_address analyzer can be used at query time to search for partial addresses (note that this only affects how the query is analyzed, not how documents are indexed):

{
  "query": {
    "match": {
      "data.network_address": {
        "query": "AA:BB:CC",
        "analyzer": "partial_network_address",
        "type": "phrase"
      }
    }
  }
}

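Phrase queries like the ones above can also be built programmatically. The sketch below is a hypothetical helper (build_phrase_query is not part of the plugin) and assumes an Elasticsearch version where phrase matching is written as a match_phrase query rather than a match query with "type": "phrase".

```python
import json


def build_phrase_query(field, text, analyzer=None):
    """Build a match_phrase query body.

    `analyzer`, when given, overrides the query-time analysis only;
    indexed documents keep the analyzer set in the mapping.
    """
    params = {"query": text}
    if analyzer is not None:
        params["analyzer"] = analyzer
    return {"query": {"match_phrase": {field: params}}}


exact = build_phrase_query("data.network_address", "AA:BB:CC:DD:EE:FF")
partial = build_phrase_query(
    "data.network_address", "AA:BB:CC", analyzer="partial_network_address"
)
print(json.dumps(partial, indent=2))
```
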
The full_network_address analyzer can be used to find all the network addresses inside a given document:

network_addresses = self._es.indices.analyze(index=index_name, analyzer='full_network_address', body=document).get('tokens')
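
The token list returned by _analyze can then be reduced to plain address strings. The sketch below assumes that each token emitted by full_network_address is one complete address, which should be checked against real output; extract_addresses is a hypothetical helper, and the sample response is made up, shaped like a standard _analyze reply.

```python
def extract_addresses(analyze_response):
    """Return the token texts from an _analyze response, de-duplicated in order."""
    addresses = []
    for token in analyze_response.get("tokens", []):
        text = token.get("token")
        if text is not None and text not in addresses:
            addresses.append(text)
    return addresses


# Hypothetical response for illustration only.
sample_response = {
    "tokens": [
        {"token": "AA:BB:CC:DD:EE:FF", "start_offset": 14, "end_offset": 31,
         "type": "word", "position": 0},
        {"token": "10.0.0.5", "start_offset": 44, "end_offset": 52,
         "type": "word", "position": 1},
        {"token": "10.0.0.5", "start_offset": 80, "end_offset": 88,
         "type": "word", "position": 2},
    ]
}

print(extract_addresses(sample_response))
```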


In the plugin folder:

  • Run mvn test to run the tests.
  • Run mvn package to build the jar.

For any more questions or issues, please e-mail me.

Have fun :)