[FEATURE]Add `fillnull` command to PPL
Description:
We propose adding a `fillnull` command to OpenSearch's Piped Processing Language (PPL) to provide a convenient way to handle null or missing values in query results. This feature would be similar to the `fillnull` command in Splunk's SPL, enhancing PPL's data cleaning and preparation capabilities.
Proposed Functionality:
- The `fillnull` command should allow users to replace null values with a specified value.
- It should support filling nulls for specific fields or all fields.
- The command should allow different fill values for different fields.
- It should support conditional filling based on other field values or expressions.
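The first three behaviours in this list can be modelled outside PPL to pin down the intended semantics. The Python below is only a sketch under stated assumptions, not the proposed implementation: rows are plain dicts, `None` stands in for null, and the name `fill_null` and its parameters `value`, `fields`, and `per_field` are hypothetical.

```python
from typing import Any, Iterable, Optional


def fill_null(
    rows: Iterable[dict],
    value: Any = None,
    fields: Optional[list] = None,
    per_field: Optional[dict] = None,
) -> list:
    """Return copies of rows with None replaced (hypothetical model).

    - per_field: maps field name -> fill value (a different value per field).
    - value + fields: one fill value for the listed fields only.
    - value alone: one fill value for every field present in the row.
    """
    result = []
    for row in rows:
        filled = dict(row)  # copy so the input rows are not mutated
        if per_field is not None:
            targets = per_field
        else:
            names = fields if fields is not None else list(filled)
            targets = {name: value for name in names}
        for name, fill in targets.items():
            # A named field that is absent from the row is treated like null.
            if filled.get(name) is None:
                filled[name] = fill
        result.append(filled)
    return result
```

In this model, existing non-null values such as `0`, `False`, or an empty string are left untouched, because only `None` counts as null. Whether the real command should treat empty strings as null is left open here.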
Example Usage:
... | fillnull value=0
This would replace all null values in all fields with 0.
... | fillnull value="N/A" field1, field2
This would replace null values in field1 and field2 with "N/A".
... | fillnull field1=0 field2="Unknown" field3=false
This would fill null values in different fields with different values.
... | eval new_field = if(field1 == "category1", field2, null) | fillnull value=0 new_field
This example uses `eval` to create a new field (or overwrite an existing one) based on a condition, and then uses `fillnull` to handle the null values.
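To make the row-level effect of this `eval` then `fillnull` pipeline concrete, here is a small Python model of it. It is an illustration only: rows are dicts, `None` stands in for null, and the function name `eval_then_fill` is hypothetical.

```python
def eval_then_fill(rows):
    """Model of: eval new_field = if(field1 == "category1", field2, null)
    followed by: fillnull value=0 new_field
    """
    out = []
    for row in rows:
        new_row = dict(row)  # leave the input row unchanged
        # eval step: copy field2 only when the condition holds, else null
        if row.get("field1") == "category1":
            new_row["new_field"] = row.get("field2")
        else:
            new_row["new_field"] = None
        # fillnull step: replace a null new_field with 0
        if new_row["new_field"] is None:
            new_row["new_field"] = 0
        out.append(new_row)
    return out
```

Note one consequence of this model: a row that matches the condition but has a null `field2` also ends up with `new_field` set to 0, so the filled value alone does not tell the two cases apart.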
...
| eval field1 = if(field1 == "category1", field1, null), field2 = if(field2 == "category2", field2, null)
| fillnull field1=0 field2="Unknown"
This example uses multiple `eval` expressions to handle different conditions for multiple fields, followed by `fillnull`.
Implementation Considerations:
- Ensure compatibility with existing PPL commands and syntax
- Optimize performance for large datasets with many null values
- Provide clear documentation and examples for users
- Consider type-checking or type-conversion for filled values
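As one possible reading of the type-checking consideration, the sketch below shows a strict policy in Python: a fill value must match the field's type, with integer-to-float widening as the only conversion. This policy, the function name `coerce_fill`, and the use of Python types to stand in for field types are all assumptions for illustration. The issue does not specify which conversions should be allowed.

```python
def coerce_fill(value, field_type):
    """Return value converted for a field of field_type, or raise TypeError.

    Hypothetical policy: exact type match, plus int -> float widening.
    """
    # bool is a subclass of int in Python, so handle it before other checks
    # to avoid silently filling numeric fields with True/False.
    if isinstance(value, bool):
        if field_type is bool:
            return value
        raise TypeError(f"cannot fill {field_type.__name__} field with {value!r}")
    if isinstance(value, field_type):
        return value
    if field_type is float and isinstance(value, int):
        return float(value)  # widening conversion, no information lost
    raise TypeError(f"cannot fill {field_type.__name__} field with {value!r}")
```

Under this policy, `fillnull value="N/A"` applied to a numeric field would be rejected rather than silently changing the column's type. A more permissive design could convert the whole column to strings instead.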