WARNING: This is the master branch. The current release v1.0.0-beta2 can be found here.

Elastic Common Schema (ECS)

The Elastic Common Schema (ECS) defines a common set of fields for ingesting data into Elasticsearch. A common schema helps you correlate data from sources like logs and metrics or IT operations analytics and security analytics.

ECS is still under development and backward compatibility is not guaranteed. Any feedback on the general structure, missing fields, or existing fields is appreciated. For contributions please read the Contributing Guide.

Versions

The master branch of this repository should never be considered an official release of ECS. You can browse official releases of ECS here.

Please note that when the README.md file and other generated files (like schema.csv and template.json) are not in agreement, the README.md should be considered the official spec. The other two files are simply provided as a convenience, and may not always be fully up to date.

In this readme

Fields
Use cases
Implementing ECS
FAQ

Fields

ECS defines these fields.

Base fields
Agent fields
Client fields
Cloud fields
Container fields
Destination fields
ECS fields
Error fields
Event fields
File fields
Geo fields
Group fields
Host fields
HTTP fields
Log fields
Network fields
Observer fields
Organization fields
Operating System fields
Process fields
Related fields
Server fields
Service fields
Source fields
URL fields
User fields
User agent fields

Base fields

The base set contains all fields which are on the top level. These fields are common across all types of events.

Field	Description	Level	Type	Example
@timestamp	Date/time when the event originated. For log events this is the date/time when the event was generated, and not when it was read. Required field for all events.	core	date	`2016-05-23T08:05:34.853Z`
tags	List of keywords used to tag each event.	core	keyword	`["production", "env2"]`
labels	Key/value pairs. Can be used to add meta information to events. Should not contain nested objects. All values are stored as keyword. Example: `docker` and `k8s` labels.	core	object	`{'application': 'foo-bar', 'env': 'production'}`
message	For log events the message field contains the log message. In other use cases the message field can be used to concatenate different values which are then freely searchable. If multiple messages exist, they can be combined into one message.	core	text	`Hello World`
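
As an illustration of the base fields, the following is a minimal sketch of building an event that carries only `@timestamp`, `tags`, `labels`, and `message`. The helper name `make_base_event` is hypothetical and not part of ECS. Rejecting nested dictionaries and converting label values to strings is one possible way to honor the "no nested objects, stored as keyword" guidance above.

```python
from datetime import datetime, timezone


def make_base_event(message, tags=None, labels=None, when=None):
    """Return a dict holding only the ECS base fields (hypothetical helper)."""
    # A naive datetime passed as `when` is interpreted as local time by astimezone().
    when = when or datetime.now(timezone.utc)
    flat_labels = {}
    for key, value in (labels or {}).items():
        # Assumption: a nested object means a dict; labels must stay flat.
        if isinstance(value, dict):
            raise ValueError(f"label {key!r} must not be a nested object")
        # Labels are stored as keyword, so every value is kept as a string.
        flat_labels[key] = str(value)
    timestamp = when.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "@timestamp": timestamp.replace("+00:00", "Z"),
        "tags": list(tags or []),
        "labels": flat_labels,
        "message": message,
    }
```

Calling `make_base_event("Hello World", tags=["production", "env2"])` would produce an event stamped with the current UTC time.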

Agent fields

The agent fields contain the data about the software entity, if any, that collects, detects, or observes events on a host, or takes measurements on a host. Examples include Beats. Agents may also run on observers. ECS agent.* fields shall be populated with details of the agent running on the host or observer where the event happened or the measurement was taken.

Field	Description	Level	Type	Example
agent.version	Version of the agent.	core	keyword	`6.0.0-rc2`
agent.name	Name of the agent. This is a name that can be given to an agent. This can be helpful if, for example, two Filebeat instances are running on the same host and a human-readable way is needed to tell which Filebeat instance the data is coming from. If no name is given, the name is often left empty.	core	keyword	`foo`
agent.type	Type of the agent. The agent type always stays the same and should be given by the agent used. In the case of Filebeat, the agent type would always be Filebeat, even if two Filebeat instances are run on the same machine.	core	keyword	`filebeat`
agent.id	Unique identifier of this agent (if one exists). Example: For Beats this would be beat.id.	core	keyword	`8a4f500d`
agent.ephemeral_id	Ephemeral identifier of this agent (if one exists). This id normally changes across restarts, but `agent.id` does not.	extended	keyword	`8a4f500f`

Examples: In the case of Beats for logs, the agent.name is filebeat. For APM, it is the agent running in the app/service. The agent information does not change if data is sent through queuing systems like Kafka, Redis, or processing systems such as Logstash or APM Server.

Client fields

A client is defined as the initiator of a network connection for events regarding sessions, connections, or bidirectional flow records. For TCP events, the client is the initiator of the TCP connection that sends the SYN packet(s). For other protocols, the client is generally the initiator or requestor in the network transaction. Some systems use the term "originator" to refer to the client in TCP connections. The client fields describe details about the system acting as the client in the network event. Client fields are usually populated in conjunction with server fields. Client fields are generally not populated for packet-level events.

Client / server representations can add semantic context to an exchange, which is helpful to visualize the data in certain situations. If your context falls in that category, you should still ensure that source and destination are filled appropriately.

Field	Description	Level	Type	Example
client.address	Some event client addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the `.address` field. Then it should be duplicated to `.ip` or `.domain`, depending on which one it is.	extended	keyword
client.ip	IP address of the client. Can be one or multiple IPv4 or IPv6 addresses.	core	ip
client.port	Port of the client.	core	long
client.mac	MAC address of the client.	core	keyword
client.domain	Client domain.	core	keyword
client.bytes	Bytes sent from the client to the server.	core	long	`184`
client.packets	Packets sent from the client to the server.	core	long	`12`
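
The `.address` guidance above (store the raw value, then duplicate it to `.ip` or `.domain`) can be sketched as follows. The function name `populate_endpoint` is hypothetical, and treating any value that starts with `/` as a unix socket path is an assumption made for this example, not an ECS rule.

```python
import ipaddress


def populate_endpoint(raw_address):
    """Return address/ip/domain sub-fields for a client, server, source, or destination."""
    # The raw value is always kept in `.address`.
    fields = {"address": raw_address}
    try:
        # ip_address() accepts IPv4 and IPv6 literals and raises ValueError otherwise.
        fields["ip"] = str(ipaddress.ip_address(raw_address))
    except ValueError:
        # Assumption: a leading "/" marks a unix socket path, which is neither IP nor domain.
        if not raw_address.startswith("/"):
            fields["domain"] = raw_address
    return fields
```

The result can be assigned to `event["client"]`, and the same helper applies unchanged to the server, source, and destination field sets.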

Cloud fields

Fields related to the cloud or infrastructure the events are coming from.

Field	Description	Level	Type	Example
cloud.provider	Name of the cloud provider. Example values are ec2, gce, or digitalocean.	extended	keyword	`ec2`
cloud.availability_zone	Availability zone in which this host is running.	extended	keyword	`us-east-1c`
cloud.region	Region in which this host is running.	extended	keyword	`us-east-1`
cloud.instance.id	Instance ID of the host machine.	extended	keyword	`i-1234567890abcdef0`
cloud.instance.name	Instance name of the host machine.	extended	keyword
cloud.machine.type	Machine type of the host machine.	extended	keyword	`t2.medium`
cloud.account.id	The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier.	extended	keyword	`666777888999`

Examples: If Metricbeat is running on an EC2 host and fetches data from its host, the cloud info contains the data about this machine. If Metricbeat runs on a remote machine outside the cloud and fetches data from a service running in the cloud, the field contains cloud data from the machine the service is running on.

Container fields

Container fields are used for meta information about the specific container that is the source of information. These fields help correlate data based on containers from any runtime.

Field	Description	Level	Type	Example
container.runtime	Runtime managing this container.	extended	keyword	`docker`
container.id	Unique container id.	core	keyword
container.image.name	Name of the image the container was built on.	extended	keyword
container.image.tag	Container image tag.	extended	keyword
container.name	Container name.	extended	keyword
container.labels	Image labels.	extended	object

Destination fields

Destination fields describe details about the destination of a packet/event. Destination fields are usually populated in conjunction with source fields.

Field	Description	Level	Type	Example
destination.address	Some event destination addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the `.address` field. Then it should be duplicated to `.ip` or `.domain`, depending on which one it is.	extended	keyword
destination.ip	IP address of the destination. Can be one or multiple IPv4 or IPv6 addresses.	core	ip
destination.port	Port of the destination.	core	long
destination.mac	MAC address of the destination.	core	keyword
destination.domain	Destination domain.	core	keyword
destination.bytes	Bytes sent from the destination to the source.	core	long	`184`
destination.packets	Packets sent from the destination to the source.	core	long	`12`

ECS fields

Meta-information specific to ECS.

Field	Description	Level	Type	Example
ecs.version	ECS version this event conforms to. `ecs.version` is a required field and must exist in all events. When querying across multiple indices -- which may conform to slightly different ECS versions -- this field lets integrations adjust to the schema version of the events. The current version is 1.0.0-beta2 .	core	keyword	`1.0.0-beta2`
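
Because `@timestamp` and `ecs.version` are both described as required, a producer may want to check for them before sending an event. The sketch below assumes events are represented as nested dictionaries; the function name `missing_required_fields` is hypothetical.

```python
ECS_VERSION = "1.0.0-beta2"


def missing_required_fields(event):
    """Return the names of required ECS fields that are absent from a nested-dict event."""
    missing = []
    if "@timestamp" not in event:
        missing.append("@timestamp")
    # ecs.version lives under the nested "ecs" object in this representation.
    if "version" not in event.get("ecs", {}):
        missing.append("ecs.version")
    return missing
```

An empty list means both required fields are present.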

Error fields

These fields can represent errors of any kind. Use them for errors that happen while fetching events or in cases where the event itself contains an error.

Field	Description	Level	Type
error.id	Unique identifier for the error.	core	keyword
error.message	Error message.	core	text
error.code	Error code describing the error.	core	keyword

Event fields

The event fields are used for context information about the log or metric event itself. A log is defined as an event containing details of something that happened. Log events must include the time at which the thing happened. Examples of log events include a process starting on a host, a network packet being sent from a source to a destination, or a network connection between a client and a server being initiated or closed. A metric is defined as an event containing one or more numerical or categorical measurements and the time at which the measurement was taken. Examples of metric events include memory pressure measured on a host, or vulnerabilities measured on a scanned host.

Field	Description	Level	Type	Example
event.id	Unique ID to describe the event.	core	keyword	`8a4f500d`
event.kind	The kind of the event. This gives information about what type of information the event contains, without being specific to the contents of the event. Examples are `event`, `state`, `alarm`. Warning: In future versions of ECS, we plan to provide a list of acceptable values for this field, please use with caution.	extended	keyword	`state`
event.category	Event category. This contains high-level information about the contents of the event. It is more generic than `event.action`, in the sense that typically a category contains multiple actions. Warning: In future versions of ECS, we plan to provide a list of acceptable values for this field, please use with caution.	core	keyword	`user-management`
event.action	The action captured by the event. This describes the information in the event. It is more specific than `event.category`. Examples are `group-add`, `process-started`, `file-created`. The value is normally defined by the implementer.	core	keyword	`user-password-change`
event.outcome	The outcome of the event. If the event describes an action, this field contains the outcome of that action. Example outcomes are `success` and `failure`. Warning: In future versions of ECS, we plan to provide a list of acceptable values for this field, please use with caution.	extended	keyword	`success`
event.type	Reserved for future usage. Please avoid using this field for user data.	core	keyword
event.module	Name of the module this data is coming from. This information is coming from the modules used in Beats or Logstash.	core	keyword	`mysql`
event.dataset	Name of the dataset. The concept of a `dataset` (fileset / metricset) is used in Beats as a subset of modules. It contains the information which is currently stored in metricset.name and metricset.module or fileset.name.	core	keyword	`stats`
event.severity	Severity describes the severity of the event. What the different severity values mean can be very different between use cases. It's up to the implementer to make sure severities are consistent across events.	core	long	`7`
event.original	Raw text message of entire event. Used to demonstrate log integrity. This field is not indexed and doc_values are disabled. It cannot be searched, but it can be retrieved from `_source`.	core	(not indexed)	`Sep 19 08:26:10 host CEF:0\|Security\| threatmanager\|1.0\|100\| worm successfully stopped\|10\|src=10.0.0.1 dst=2.1.2.2spt=1232`
event.hash	Hash (perhaps logstash fingerprint) of raw field to be able to demonstrate log integrity.	extended	keyword	`123456789012345678901234567890ABCD`
event.duration	Duration of the event in nanoseconds. If event.start and event.end are known this value should be the difference between the end and start time.	core	long
event.timezone	This field should be populated when the event's timestamp does not include timezone information already (e.g. default Syslog timestamps). It's optional otherwise. Acceptable timezone formats are: a canonical ID (e.g. "Europe/Amsterdam"), abbreviated (e.g. "EST") or an HH:mm differential (e.g. "-05:00").	extended	keyword
event.created	event.created contains the date when the event was created. This timestamp is distinct from @timestamp in that @timestamp contains the processed timestamp. For logs, the two timestamps can differ because the timestamp in the log line and the time at which the line is read (for example by Filebeat) are not identical. `@timestamp` must contain the timestamp extracted from the log line, and event.created the time at which the log line was read. The same could apply to packet capturing, where @timestamp contains the timestamp extracted from the network packet and event.created the time at which the event was created. In case the two timestamps are identical, @timestamp should be used.	core	date
event.start	event.start contains the date when the event started or when the activity was first observed.	extended	date
event.end	event.end contains the date when the event ended or when the activity was last observed.	extended	date
event.risk_score	Risk score or priority of the event (e.g. security solutions). Use your system's original value here.	core	float
event.risk_score_norm	Normalized risk score or priority of the event, on a scale of 0 to 100. This is mainly useful if you use more than one system that assigns risk scores, and you want to see a normalized value across all systems.	extended	float
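
Two of the fields above are derived values: `event.duration` is the difference between `event.end` and `event.start` in nanoseconds, and `event.risk_score_norm` maps a system-specific score onto a 0 to 100 scale. The sketch below shows one way to compute both. The function names are hypothetical, and linear min-max rescaling is an assumption, since ECS only states the target range.

```python
from datetime import datetime


def duration_ns(start, end):
    """Return end - start in nanoseconds, using integer arithmetic only."""
    delta = end - start
    # timedelta has microsecond resolution, so the result is a multiple of 1000 ns.
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


def normalize_risk(score, source_min, source_max):
    """Linearly rescale a vendor risk score to the 0-100 range (assumed method)."""
    if source_max <= source_min:
        raise ValueError("source_max must be greater than source_min")
    # Clamp first so out-of-range inputs cannot leave the 0-100 scale.
    clamped = min(max(score, source_min), source_max)
    return (clamped - source_min) * 100.0 / (source_max - source_min)
```

The original vendor score would still be stored unchanged in `event.risk_score`.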

File fields

A file is defined as a set of information that has been created on, or has existed on a filesystem. File objects can be associated with host events, network events, and/or file events (e.g., those produced by File Integrity Monitoring [FIM] products or services). File fields provide details about the affected file associated with the event or metric.

Field	Description	Level	Type	Example
file.path	Path to the file.	extended	keyword
file.target_path	Target path for symlinks.	extended	keyword
file.extension	File extension. This should allow easy filtering by file extensions.	extended	keyword	`png`
file.type	File type (file, dir, or symlink).	extended	keyword
file.device	Device that is the source of the file.	extended	keyword
file.inode	Inode representing the file in the filesystem.	extended	keyword
file.uid	The user ID (UID) or security identifier (SID) of the file owner.	extended	keyword
file.owner	File owner's username.	extended	keyword
file.gid	Primary group ID (GID) of the file.	extended	keyword
file.group	Primary group name of the file.	extended	keyword
file.mode	Mode of the file in octal representation.	extended	keyword	`416`
file.size	File size in bytes (field is only added when `type` is `file`).	extended	long
file.mtime	Last time file content was modified.	extended	date
file.ctime	Last time file metadata changed.	extended	date

Geo fields

Geo fields can carry data about a specific location related to an event or geo information derived from an IP field.

The geo fields are expected to be nested at: client.geo, destination.geo, host.geo, observer.geo, server.geo, source.geo.

Note also that the geo fields are not expected to be used directly at the top level.

Field	Description	Level	Type	Example
geo.location	Longitude and latitude.	core	geo_point	`{ "lon": -73.614830, "lat": 45.505918 }`
geo.continent_name	Name of the continent.	core	keyword	`North America`
geo.country_name	Country name.	core	keyword	`Canada`
geo.region_name	Region name.	core	keyword	`Quebec`
geo.city_name	City name.	core	keyword	`Montreal`
geo.country_iso_code	Country ISO code.	core	keyword	`CA`
geo.region_iso_code	Region ISO code.	core	keyword	`CA-QC`
geo.name	User-defined description of a location, at the level of granularity they care about. Could be the name of their data centers, the floor number (if this describes a local physical entity), or city names. Not typically used in automated geolocation.	extended	keyword	`boston-dc`

Group fields

The group fields are meant to represent groups that are relevant to the event.

The group fields are expected to be nested at: user.group.

Note also that the group fields may be used directly at the top level.

Field	Description	Level	Type	Example
group.id	Unique identifier for the group on the system/platform.	extended	keyword
group.name	Name of the group.	extended	keyword

Host fields

A host is defined as a general computing instance. ECS host.* fields should be populated with details about the host on which the event happened, or on which the measurement was taken. Host types include hardware, virtual machines, Docker containers, and Kubernetes nodes.

Field	Description	Level	Type	Example
host.hostname	Hostname of the host. It normally contains what the `hostname` command returns on the host machine.	core	keyword
host.name	Name of the host. It can contain what `hostname` returns on Unix systems, the fully qualified domain name, or a name specified by the user. The sender decides which value to use.	core	keyword
host.id	Unique host id. As hostname is not always unique, use values that are meaningful in your environment. Example: The current usage of `beat.name`.	core	keyword
host.ip	Host ip address.	core	ip
host.mac	Host mac address.	core	keyword
host.type	Type of host. For Cloud providers this can be the machine type like `t2.medium`. If vm, this could be the container, for example, or other information meaningful in your environment.	core	keyword
host.architecture	Operating system architecture.	core	keyword	`x86_64`

HTTP fields

Fields related to HTTP activity.

Field	Description	Level	Type	Example
http.request.method	Http request method. The field value must be normalized to lowercase for querying. See "Lowercase Capitalization" in the "Implementing ECS" section.	extended	keyword	`get, post, put`
http.request.body.content	The full http request body.	extended	keyword	`Hello world`
http.request.referrer	Referrer for this HTTP request.	extended	keyword	`https://blog.example.com/`
http.response.status_code	Http response status code.	extended	long	`404`
http.response.body.content	The full http response body.	extended	keyword	`Hello world`
http.version	Http version.	extended	keyword	`1.1`
http.request.bytes	Total size in bytes of the request (body and headers).	extended	long	`1437`
http.request.body.bytes	Size in bytes of the request body.	extended	long	`887`
http.response.bytes	Total size in bytes of the response (body and headers).	extended	long	`1437`
http.response.body.bytes	Size in bytes of the response body.	extended	long	`887`

Log fields

Fields which are specific to log events.

Field	Description	Level	Type	Example
log.level	Log level of the log event. Some examples are `WARN`, `ERR`, `INFO`.	core	keyword	`ERR`
log.original	This is the original log message and contains the full log message before splitting it up in multiple parts. In contrast to the `message` field which can contain an extracted part of the log message, this field contains the original, full log message. It may already have had some modifications applied, such as encoding changes or the removal of new lines, to clean up the log message. This field is not indexed and doc_values are disabled so it can't be queried but the value can be retrieved from `_source`.	core	(not indexed)	`Sep 19 08:26:10 localhost My log`

Network fields

The network is defined as the communication path over which a host or network event happens. The network.* fields should be populated with details about the network activity associated with an event.

Field	Description	Level	Type	Example
network.name	Name given by operators to sections of their network.	extended	keyword	`Guest Wifi`
network.type	In the OSI Model this would be the Network Layer. ipv4, ipv6, ipsec, pim, etc. The field value must be normalized to lowercase for querying. See "Lowercase Capitalization" in the "Implementing ECS" section.	core	keyword	`ipv4`
network.iana_number	IANA Protocol Number (https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml). Standardized list of protocols. This aligns well with NetFlow and sFlow related logs which use the IANA Protocol Number.	extended	keyword	`6`
network.transport	Same as network.iana_number, but instead using the keyword name of the transport layer (udp, tcp, ipv6-icmp, etc.). The field value must be normalized to lowercase for querying. See "Lowercase Capitalization" in the "Implementing ECS" section.	core	keyword	`tcp`
network.application	A name given to an application. This can be arbitrarily assigned for things like microservices, but also apply to things like skype, icq, facebook, twitter. This would be used in situations where the vendor or service can be decoded such as from the source/dest IP owners, ports, or wire format. The field value must be normalized to lowercase for querying. See "Lowercase Capitalization" in the "Implementing ECS" section.	extended	keyword	`aim`
network.protocol	L7 Network protocol name. ex. http, lumberjack, transport protocol. The field value must be normalized to lowercase for querying. See "Lowercase Capitalization" in the "Implementing ECS" section.	core	keyword	`http`
network.direction	Direction of the network traffic. Recommended values are: inbound, outbound, internal, external, unknown. When mapping events from a host-based monitoring context, populate this field from the host's point of view. When mapping events from a network or perimeter-based monitoring context, populate this field from the point of view of your network perimeter.	core	keyword	`inbound`
network.forwarded_ip	Host IP address when the source IP address is the proxy.	core	ip	`192.1.1.2`
network.community_id	A hash of source and destination IPs and ports, as well as the protocol used in a communication. This is a tool-agnostic standard to identify flows. Learn more at https://github.com/corelight/community-id-spec.	extended	keyword	`1:hO+sN4H+MG5MY/8hIrXPqc4ZQz0=`
network.bytes	Total bytes transferred in both directions. If `source.bytes` and `destination.bytes` are known, `network.bytes` is their sum.	core	long	`368`
network.packets	Total packets transferred in both directions. If `source.packets` and `destination.packets` are known, `network.packets` is their sum.	core	long	`24`
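
The table above contains two mechanical rules: several `network.*` values must be lowercased, and `network.bytes` / `network.packets` are the sums of the source and destination counters when both are known. A minimal sketch of applying them to a nested-dict event follows. The function name `finalize_network` is hypothetical.

```python
LOWERCASED_FIELDS = ("type", "transport", "application", "protocol")


def finalize_network(event):
    """Lowercase selected network.* values and derive the bytes/packets totals."""
    network = event.setdefault("network", {})
    for name in LOWERCASED_FIELDS:
        if isinstance(network.get(name), str):
            network[name] = network[name].lower()
    for counter in ("bytes", "packets"):
        sent = event.get("source", {}).get(counter)
        received = event.get("destination", {}).get(counter)
        # The total is only derived when both directions are known.
        if sent is not None and received is not None:
            network[counter] = sent + received
    return event
```

With `source.bytes` and `destination.bytes` both set to 184, the helper sets `network.bytes` to 368, matching the example values in the tables.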

Observer fields

An observer is defined as a special network, security, or application device used to detect, observe, or create network, security, or application-related events and metrics. This could be a custom hardware appliance or a server that has been configured to run special network, security, or application software. Examples include firewalls, intrusion detection/prevention systems, network monitoring sensors, web application firewalls, data loss prevention systems, and APM servers. The observer.* fields shall be populated with details of the system, if any, that detects, observes and/or creates a network, security, or application event or metric. Message queues and ETL components used in processing events or metrics are not considered observers in ECS.

Field	Description	Level	Type	Example
observer.mac	MAC address of the observer.	core	keyword
observer.ip	IP address of the observer.	core	ip
observer.hostname	Hostname of the observer.	core	keyword
observer.vendor	Observer vendor information.	core	keyword
observer.version	Observer version.	core	keyword
observer.serial_number	Observer serial number.	extended	keyword
observer.type	The type of the observer the data is coming from. There is no predefined list of observer types. Some examples are `forwarder`, `firewall`, `ids`, `ips`, `proxy`, `poller`, `sensor`, `APM server`.	core	keyword	`firewall`

Organization fields

The organization fields enrich data with information about the company or entity the data is associated with. These fields help you arrange or filter data stored in an index by one or multiple organizations.

Field	Description	Level	Type	Example
organization.name	Organization name.	extended	keyword
organization.id	Unique identifier for the organization.	extended	keyword

Operating System fields

The OS fields contain information about the operating system.

The os fields are expected to be nested at: host.os, observer.os, user_agent.os.

Note also that the os fields are not expected to be used directly at the top level.

Field	Description	Level	Type	Example
os.platform	Operating system platform (such as centos, ubuntu, windows).	extended	keyword	`darwin`
os.name	Operating system name, without the version.	extended	keyword	`Mac OS X`
os.full	Operating system name, including the version or code name.	extended	keyword	`Mac OS Mojave`
os.family	OS family (such as redhat, debian, freebsd, windows).	extended	keyword	`debian`
os.version	Operating system version as a raw string.	extended	keyword	`10.14.1`
os.kernel	Operating system kernel version as a raw string.	extended	keyword	`4.4.0-112-generic`

Process fields

These fields contain information about a process. These fields can help you correlate metrics information with a process id/name from a log message. The process.pid often stays in the metric itself and is copied to the global field for correlation.

Field	Description	Level	Type	Example
process.pid	Process id.	core	long
process.name	Process name. Sometimes called program name or similar.	extended	keyword	`ssh`
process.ppid	Process parent id.	extended	long
process.args	Process arguments. May be filtered to protect sensitive information.	extended	keyword	`['ssh', '-l', 'user', '10.0.0.16']`
process.executable	Absolute path to the process executable.	extended	keyword	`/usr/bin/ssh`
process.title	Process title. The proctitle, sometimes the same as the process name. Can also be different: for example a browser setting its title to the web page currently opened.	extended	keyword
process.thread.id	Thread ID.	extended	long	`4242`
process.start	The time the process started.	extended	date	`2016-05-23T08:05:34.853Z`
process.working_directory	The working directory of the process.	extended	keyword	`/home/alice`

Related fields

This field set is meant to facilitate pivoting around a piece of data. Some pieces of information can be seen in many places in ECS. To facilitate searching for them, append values to their corresponding field under related.*. A concrete example is IP addresses, which can be under host, observer, source, destination, client, server, and network.forwarded_ip. If you append all IPs to related.ip, you can then search for a given IP trivially, no matter where it appeared, by querying related.ip:a.b.c.d.

Field	Description	Level	Type	Example
related.ip	All of the IPs seen on your event.	extended	ip
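
Populating `related.ip` amounts to gathering every IP found in the field sets listed above. The sketch below assumes a nested-dict event in which each `ip` value is either a string or a list of strings. The function name `collect_related_ips` is hypothetical.

```python
IP_FIELD_SETS = ("host", "observer", "source", "destination", "client", "server")


def collect_related_ips(event):
    """Append every IP found in the event to related.ip, without duplicates."""
    found = []

    def add(value):
        if value is None:
            return
        values = value if isinstance(value, list) else [value]
        for ip in values:
            if ip not in found:
                found.append(ip)

    # Keep anything already present in related.ip, then append newly seen IPs.
    add(event.get("related", {}).get("ip"))
    for name in IP_FIELD_SETS:
        add(event.get(name, {}).get("ip"))
    add(event.get("network", {}).get("forwarded_ip"))
    if found:
        event.setdefault("related", {})["ip"] = found
    return event
```

After this step, a query on `related.ip` matches the event regardless of which field set the address originally came from.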

Server fields

A server is defined as the responder in a network connection for events regarding sessions, connections, or bidirectional flow records. For TCP events, the server is the receiver of the initial SYN packet(s) of the TCP connection. For other protocols, the server is generally the responder in the network transaction. Some systems actually use the term "responder" to refer to the server in TCP connections. The server fields describe details about the system acting as the server in the network event. Server fields are usually populated in conjunction with client fields. Server fields are generally not populated for packet-level events.

Field	Description	Level	Type	Example
server.address	Some event server addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the `.address` field. Then it should be duplicated to `.ip` or `.domain`, depending on which one it is.	extended	keyword
server.ip	IP address of the server. Can be one or multiple IPv4 or IPv6 addresses.	core	ip
server.port	Port of the server.	core	long
server.mac	MAC address of the server.	core	keyword
server.domain	Server domain.	core	keyword
server.bytes	Bytes sent from the server to the client.	core	long	`184`
server.packets	Packets sent from the server to the client.	core	long	`12`

Service fields

The service fields describe the service for or from which the data was collected. These fields help you find and correlate logs for a specific service and version.

Field	Description	Level	Type	Example
service.id	Unique identifier of the running service. This id should uniquely identify this service. This makes it possible to correlate logs and metrics for one specific service. Example: If you are experiencing issues with one redis instance, you can filter on that id to see metrics and logs for that single instance.	core	keyword	`d37e5ebfe0ae6c4972dbe9f0174a1637bb8247f6`
service.name	Name of the service data is collected from. The name of the service is normally user given. This allows two instances of the same service running on the same machine to be differentiated by their `service.name`. It also allows distributed services that run on multiple hosts to correlate the related instances based on the name. In the case of Elasticsearch the service.name could contain the cluster name. For Beats the service.name is by default a copy of the `service.type` field if no name is specified.	core	keyword	`elasticsearch-metrics`
service.type	The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`.	core	keyword	`elasticsearch`
service.state	Current state of the service.	core	keyword
service.version	Version of the service the data was collected from. This allows you to look at a data set for only a specific version of a service.	core	keyword	`3.2.4`
service.ephemeral_id	Ephemeral identifier of this service (if one exists). This id normally changes across restarts, but `service.id` does not.	extended	keyword	`8a4f500f`

Source fields

Source fields describe details about the source of a packet/event. Source fields are usually populated in conjunction with destination fields.

Field	Description	Level	Type	Example
source.address	Some event source addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the `.address` field. Then it should be duplicated to `.ip` or `.domain`, depending on which one it is.	extended	keyword
source.ip	IP address of the source. Can be one or multiple IPv4 or IPv6 addresses.	core	ip
source.port	Port of the source.	core	long
source.mac	MAC address of the source.	core	keyword
source.domain	Source domain.	core	keyword
source.bytes	Bytes sent from the source to the destination.	core	long	`184`
source.packets	Packets sent from the source to the destination.	core	long	`12`

URL fields

URL fields provide a complete URL, with scheme, host, and path.

Field	Description	Level	Type	Example
url.original	Unmodified original url as seen in the event source. Note that in network monitoring, the observed URL may be a full URL, whereas in access logs, the URL is often just represented as a path. This field is meant to represent the URL as it was observed, complete or not.	extended	keyword	`https://www.elastic.co:443/search?q=elasticsearch#top or /search?q=elasticsearch`
url.full	If full URLs are important to your use case, they should be stored in `url.full`, whether this field is reconstructed or present in the event source.	extended	keyword	`https://www.elastic.co:443/search?q=elasticsearch#top`
url.scheme	Scheme of the request, such as "https". Note: The `:` is not part of the scheme.	extended	keyword	`https`
url.domain	Domain of the request, such as "www.elastic.co". In some cases a URL may refer to an IP and/or port directly, without a domain name. In this case, the IP address would go to the `domain` field.	extended	keyword	`www.elastic.co`
url.port	Port of the request, such as 443.	extended	integer	`443`
url.path	Path of the request, such as "/search".	extended	keyword
url.query	The query field describes the query string of the request, such as "q=elasticsearch". The `?` is excluded from the query string. If a URL contains no `?`, there is no query field. If there is a `?` but no query, the query field exists with an empty string. The `exists` query can be used to differentiate between the two cases.	extended	keyword
url.fragment	Portion of the url after the `#`, such as "top". The `#` is not part of the fragment.	extended	keyword
url.username	Username of the request.	extended	keyword
url.password	Password of the request.	extended	keyword
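
As a rough illustration of how the URL fields relate to each other, the following Python sketch splits an observed URL into flat `url.*` keys. The function name `url_to_ecs` is hypothetical and not part of ECS; it relies only on the standard library `urllib.parse.urlsplit`, and it does not populate `url.full`.

```python
from urllib.parse import urlsplit


def url_to_ecs(original: str) -> dict:
    """Split a URL (or a bare path) into flat ECS-style url.* keys.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(original)
    fields = {"url.original": original}
    if parts.scheme:
        fields["url.scheme"] = parts.scheme      # the ':' is not included
    if parts.hostname:
        fields["url.domain"] = parts.hostname    # may also be an IP address
    if parts.port is not None:
        fields["url.port"] = parts.port
    if parts.path:
        fields["url.path"] = parts.path
    # urlsplit reports "" both when there is no '?' and when nothing follows
    # the '?', so look for the '?' explicitly to keep the two cases distinct.
    if "?" in original.split("#", 1)[0]:
        fields["url.query"] = parts.query
    if parts.fragment:
        fields["url.fragment"] = parts.fragment  # the '#' is not included
    if parts.username:
        fields["url.username"] = parts.username
    if parts.password:
        fields["url.password"] = parts.password
    return fields
```

With this sketch, `/search?` yields an empty `url.query`, while `/search` yields no `url.query` key at all, which mirrors the distinction described for the query field.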

User fields

The user fields describe information about the user that is relevant to the event. Fields can have one entry or multiple entries. If a user has more than one id, provide an array that includes all of them.

The user fields are expected to be nested at: client.user, destination.user, host.user, server.user, source.user.

Note also that the user fields may be used directly at the top level.

Field	Description	Level	Type	Example
user.id	One or multiple unique identifiers of the user.	core	keyword
user.name	Short name or login of the user.	core	keyword	`albert`
user.full_name	User's full name, if available.	extended	keyword	`Albert Einstein`
user.email	User email address.	extended	keyword
user.hash	Unique user hash to correlate information for a user in anonymized form. Useful if `user.id` or `user.name` contain confidential information and cannot be used.	extended	keyword

User agent fields

The user_agent fields normally come from a browser request. They often show up in web service logs coming from the parsed user agent string.

Field	Description	Level	Type	Example
user_agent.original	Unparsed version of the user_agent.	extended	keyword	`Mozilla/5.0 (iPhone; CPU iPhone OS 12_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0 Mobile/15E148 Safari/604.1`
user_agent.name	Name of the user agent.	extended	keyword	`Safari`
user_agent.version	Version of the user agent.	extended	keyword	`12.0`
user_agent.device.name	Name of the device.	extended	keyword	`iPhone`

Use cases

These are examples of how ECS fields can be used in different use cases. Most use cases contain not only ECS fields but also additional fields that are not in ECS, in order to describe the full use case. The fields that are not in ECS are in italic.

Contributions of additional use cases on top of ECS are welcome.

Reserved Section Names

ECS does not define the following field sets yet, but the following are expected in the future. Please avoid using them:

match.*
protocol.*
threat.*
vulnerability.*

Implementing ECS

Guidelines

The document MUST have the @timestamp field.
The data type defined for an ECS field MUST be used.
It SHOULD have the field ecs.version to define which version of ECS it uses.
As many fields as possible should be mapped to ECS.
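
The first and third guidelines can be checked mechanically. The sketch below is one possible way to do so in Python; `check_guidelines` and `has_ecs_version` are hypothetical names, and the check accepts `ecs.version` either as a dotted key or as a nested object. It does not verify data types.

```python
def has_ecs_version(doc: dict) -> bool:
    """Accept ecs.version as a dotted key or as a nested object."""
    if "ecs.version" in doc:
        return True
    ecs = doc.get("ecs")
    return isinstance(ecs, dict) and "version" in ecs


def check_guidelines(doc: dict) -> list:
    """Return a list of guideline problems found in one event document."""
    problems = []
    if "@timestamp" not in doc:
        problems.append("MUST: @timestamp is missing")
    if not has_ecs_version(doc):
        problems.append("SHOULD: ecs.version is missing")
    return problems
```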

Writing fields

All fields must be lower case
Combine words using underscore
No special characters except _

Naming fields

Present tense. Use present tense unless field describes historical information.
Singular or plural. Use singular and plural names properly to reflect the field content. For example, use requests_per_sec rather than request_per_sec.
General to specific. Organise the prefixes from general to specific to allow grouping fields into objects with a prefix like host.*.
Avoid repetition. Avoid stuttering of words. If part of the field name is already in the prefix, do not repeat it. Example: host.host_ip should be host.ip.
Use prefixes. Fields must be prefixed except for the base fields. For example all host fields are prefixed with host.. See dot notation in FAQ for more details.
Do not use abbreviations. (A few exceptions like ip exist.)
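
Some of the writing and naming rules above lend themselves to a simple automated check. The following Python sketch covers the lower case, underscore, and repetition rules only; `check_field_name` is a hypothetical helper, and treating digits as allowed characters is an assumption, since the rules do not mention them. Base fields such as `@timestamp` are outside its scope.

```python
import re

# Assumption: digits are allowed; the rules only exclude special characters.
SEGMENT = re.compile(r"[a-z0-9_]+")


def check_field_name(name: str) -> list:
    """Return a list of rule violations for a dotted field name."""
    problems = []
    segments = name.split(".")
    for segment in segments:
        if not SEGMENT.fullmatch(segment):
            problems.append(f"{segment!r}: use lower case letters, digits and '_' only")
    # Avoid repetition: a segment should not restate its parent prefix.
    for parent, child in zip(segments, segments[1:]):
        if child.startswith(parent + "_"):
            problems.append(f"{child!r} repeats the prefix {parent!r}")
    return problems
```

For example, `check_field_name("host.host_ip")` reports the repeated prefix, while `check_field_name("host.ip")` returns an empty list.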

Normalization

To allow for correlation across different sources, ECS must sometimes enforce normalization on field values.

Lowercase Capitalization

Some field descriptions mention they should be normalized to lowercase. Different approaches can be taken to accomplish this. The goal of requesting this is to avoid the same value appearing distinctly in aggregations, or avoid having to search for all capitalizations possible (e.g. IPV4, IPv4, ipv4).

The simplest implementation of this requirement is to lowercase the value before indexing in Elasticsearch. This can be done with a Logstash filter or an Ingest Node processor, for example. Another approach that satisfies the goal is to configure the keyword indexing of the field to use a normalizer with the lowercase filter. The normalizer leaves your data unmodified (the document still shows "IPv4", for example). However, the value in the index will be lowercase. This satisfies the requirement of predictable querying and aggregation across data sources.
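
Both approaches can be sketched in Python. The first function lowercases selected values before a document is sent for indexing. The second part builds an index body that uses the `normalizer` parameter of keyword fields with the built-in `lowercase` filter. The names `lowercase_fields`, `INDEX_BODY`, and `lowercase_normalizer` are hypothetical, `network.type` is used only as an illustration, and the typeless `mappings` layout assumes a recent Elasticsearch version.

```python
def lowercase_fields(doc: dict, fields: list) -> dict:
    """Approach 1: return a copy of doc with the listed string values lowercased."""
    cleaned = dict(doc)
    for field in fields:
        value = cleaned.get(field)
        if isinstance(value, str):
            cleaned[field] = value.lower()
    return cleaned


# Approach 2: leave the document untouched and normalize at index time.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "normalizer": {
                # Hypothetical normalizer name.
                "lowercase_normalizer": {"type": "custom", "filter": ["lowercase"]}
            }
        }
    },
    "mappings": {
        "properties": {
            "network": {
                "properties": {
                    "type": {"type": "keyword", "normalizer": "lowercase_normalizer"}
                }
            }
        }
    },
}
```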

Understanding ECS conventions

Multi-fields text indexing

Elasticsearch can index text multiple ways:

text indexing allows for full text search, or searching arbitrary words that are part of the field.
keyword indexing allows for much faster exact match filtering, prefix search, and allows for aggregations (what Kibana visualizations are built on).

By default, unless your index mapping or index template specifies otherwise (as the ECS index template does), Elasticsearch indexes a text field as text at the canonical field name, and indexes it a second time as keyword, nested in a multi-field.

Default Elasticsearch convention:

Canonical field: myfield is text
Multi-field: myfield.keyword is keyword

For monitoring use cases, keyword indexing is needed almost exclusively, with full text search on very few fields. Given this premise, ECS defaults all text indexing to keyword at the top level (with very few exceptions). Any use case that requires full text search indexing on additional fields can simply add a multi-field for full text search. Doing so does not conflict with ECS, as the canonical field name will remain keyword indexed.

ECS multi-field convention for text:

Canonical field: myfield is keyword
Multi-field: myfield.text is text
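
The two conventions can be compared side by side as mapping fragments. The sketch below writes them as Python dictionaries that mirror the JSON structure of an Elasticsearch mapping; the constant names and the `indexed_paths` helper are hypothetical and only serve to list which paths end up with which type.

```python
# Default Elasticsearch convention: text at the canonical name,
# keyword nested as a multi-field.
DEFAULT_STYLE = {
    "myfield": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }
}

# ECS convention: keyword at the canonical name,
# text nested as a multi-field where full text search is needed.
ECS_STYLE = {
    "myfield": {
        "type": "keyword",
        "fields": {"text": {"type": "text"}},
    }
}


def indexed_paths(properties: dict) -> dict:
    """Map each canonical field and multi-field path to its type."""
    paths = {}
    for name, spec in properties.items():
        paths[name] = spec["type"]
        for sub_name, sub_spec in spec.get("fields", {}).items():
            paths[f"{name}.{sub_name}"] = sub_spec["type"]
    return paths
```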

Exceptions

The only exceptions to this convention are the fields message and error.message, which are indexed for full text search only, with no multi-field. These two fields don't follow the new convention because both are widely used in Beats, and changing them was deemed too big of a breaking change.

Any future field that will be indexed for full text search in ECS will however follow the multi-field convention where text indexing is nested in the multi-field.

IDs and most codes are keywords, not integers

Despite the fact that IDs and codes (e.g. error codes) are often integers, this is not always the case. Since we want to make it possible to map as many systems and data sources to ECS as possible, we default to using the keyword type for IDs and codes.

Some specific kinds of codes are always integers, like HTTP status codes. If those have a corresponding specific field (as HTTP status does), its type can safely be an integer type. But generic fields like error.code cannot have this guarantee, and are therefore keyword.
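
A small Python sketch shows why a generic code field cannot assume integers. The sample codes are invented for illustration, and `all_integers` and `as_keyword` are hypothetical helpers: as soon as one source reports a non-numeric code, only a string representation fits every value.

```python
def all_integers(codes) -> bool:
    """Return True only if every code can be parsed as an integer."""
    for code in codes:
        try:
            int(str(code))
        except ValueError:
            return False
    return True


def as_keyword(code) -> str:
    """Represent any code as a string, suitable for a keyword field."""
    return str(code)


# Invented sample values from three imaginary sources.
codes = [404, "1045", "ECONNRESET"]
keywords = [as_keyword(code) for code in codes]
```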

FAQ

What are the benefits of using ECS?

The benefits to a user adopting these fields and names in their clusters are:

Data correlation. Ability to easily correlate data from the same or different sources, including:
- data from metrics, logs, and apm
- data from the same machines/hosts
- data from the same service
Ease of recall. Improved ability to remember commonly used field names (because there is a single set, not a set per data source)
Ease of deduction. Improved ability to deduce field names (because the field naming follows a small number of rules with few exceptions)
Reuse. Ability to re-use analysis content (searches, visualizations, dashboards, alerts, reports, and ML jobs) across multiple data sources
Future proofing. Ability to use any future Elastic-provided analysis content in your environment without modifications

What if I have fields that conflict with ECS?

The rename processor can help you resolve field conflicts. For example, imagine that you already have a field called "user", but ECS employs user as an object. You can use the rename processor at ingest time to rename your field to the matching ECS field. If your field does not match ECS, you can rename your field to user.value instead.
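
The effect of such a rename can be illustrated in plain Python. This is a stand-in for what happens to the document, not the configuration of the rename processor itself; `resolve_user_conflict` is a hypothetical name, and the choice of `name` as the default target is an assumption for the example.

```python
def resolve_user_conflict(doc: dict, target: str = "name") -> dict:
    """Move a scalar `user` value under the `user` object as user.<target>."""
    fixed = dict(doc)
    value = fixed.get("user")
    # Only scalar values conflict with the ECS object; objects are left alone.
    if value is not None and not isinstance(value, dict):
        fixed["user"] = {target: value}
    return fixed
```

Calling `resolve_user_conflict({"user": "albert"})` produces a document whose `user` key is an object, and passing `target="value"` covers the case where the field does not match an ECS field.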

What if my events have additional fields?

Events may contain fields in addition to ECS fields. These fields can follow the ECS naming and writing rules, but this is not a requirement.

Why does ECS use a dot notation instead of an underline notation?

There are two common key formats for ingesting data into Elasticsearch:

Dot notation: user.firstname: Nicolas, user.lastname: Ruflin
Underline notation: user_firstname: Nicolas, user_lastname: Ruflin

For ECS we decided to use the dot notation. Here's some background on this decision.

What is the difference between the two notations?

Ingesting user.firstname: Nicolas and user.lastname: Ruflin is identical to ingesting the following JSON:

"user": {
  "firstname": "Nicolas",
  "lastname": "Ruflin"
}

In Elasticsearch, user is represented as an object datatype. In the case of the underline notation, both are just string datatypes.
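
The equivalence between dotted keys and nested objects can be sketched in a few lines of Python. The helper name `expand_dotted` is hypothetical, and the sketch assumes that no key is used both as a value and as a prefix.

```python
def expand_dotted(flat: dict) -> dict:
    """Turn dotted keys into nested objects; other keys stay as they are."""
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Each prefix becomes an object that holds the next level.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested
```

Keys in underline notation, such as `user_firstname`, contain no dot and therefore remain plain top-level values.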

NOTE: ECS does not use nested datatypes, which are arrays of objects.

Advantages of dot notation

With dot notation, each prefix in Elasticsearch is an object. Each object can have parameters that control how fields inside the object are treated. In the context of ECS, for example, these parameters would allow you to disable dynamic property creation for certain prefixes.

Individual objects give you more flexibility on both the ingest and the event sides. In Elasticsearch, for example, you can use the remove processor to drop complete objects instead of selecting each key inside. You don't have to know ahead of time which keys will be in an object.

In Beats, you can simplify the creation of events. For example, you can treat each object as an object (or struct in Golang), which makes constructing and modifying each part of the final event easier.

Disadvantage of dot notation

In Elasticsearch, each key can only have one type. For example, if user is an object, you can't use it as a keyword type in the same index, like {"user": "nicolas ruflin"}. This restriction can be an issue in certain datasets. For the ECS data itself, this is not an issue because all fields are predefined.
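
A dataset can be screened for this kind of conflict before indexing. The Python sketch below is a simplification that looks only at top-level keys and only distinguishes objects from everything else, whereas Elasticsearch tracks a concrete type per field; `find_type_conflicts` is a hypothetical helper.

```python
def find_type_conflicts(docs) -> list:
    """Return top-level keys that appear both as an object and as a scalar."""
    seen = {}
    conflicts = set()
    for doc in docs:
        for key, value in doc.items():
            kind = "object" if isinstance(value, dict) else "scalar"
            # Remember the first kind seen for a key and compare later ones.
            if seen.setdefault(key, kind) != kind:
                conflicts.add(key)
    return sorted(conflicts)
```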

What if I already use the underline notation?

Mixing the underline notation with the ECS dot notation is not a problem. As long as there are no conflicts, they can coexist in the same document.
