goatshriek/stumpless

implement format checking of param values

goatshriek opened this issue · 1 comments

The syslog RFC (RFC 5424) specifies that param values must be UTF-8 encoded strings. Checking the characters of param values will help users stay compliant with the standard, and will also provide a point of mitigation for encoding bugs and vulnerabilities.

Check out the Contributing Guidelines and the development guide for the basics on working with stumpless and submitting changes.

General Approach

There are a few details left out of the following approach, for you to fill in as you encounter them. If you find you need help, please ask here or on the project gitter and someone can help you get past the stumbling block.

There are a number of functions that allow a param value to be supplied: the constructor for params, and accessor and mutator functions for entries and elements. Since this check will need to take place in several places, you will be best served by calling a single function wherever required.

src/validate.c contains the validation functions used within the library. You will need to define a new function that performs this validation in this module, and declare it in include/private/validate.h. Review the documentation on defining new functions here.

To understand exactly what a valid UTF-8 string is, you'll need to review RFC 3629. If that seems a bit overwhelming, you can have a look at some code that already implements a UTF-8 validation check: the function TestUTF8Compliance in test/helper/utf8.cpp. You can use this code as a reference. Be sure to raise an invalid encoding error when validation fails, using the raise_invalid_encoding function. You'll also need to add a new error message: there is documentation about how to do this here.
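As a starting point, here is a minimal sketch of the byte-level rules from RFC 3629 in plain C. The name sketch_is_valid_utf8 and its signature are made up for illustration and are not part of stumpless. It returns a bool instead of calling raise_invalid_encoding, since the signature of that function is not shown in this issue, so treat it as a reference for the checks rather than a drop-in implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the stumpless API.
 * Returns true if the first `length` bytes of `str` form valid UTF-8
 * according to RFC 3629, and false otherwise.
 */
bool
sketch_is_valid_utf8( const char *str, size_t length ) {
  const unsigned char *bytes = ( const unsigned char * ) str;
  size_t i = 0;

  while( i < length ) {
    unsigned char lead = bytes[i];
    size_t extra;       /* continuation bytes expected after the lead byte */
    unsigned long min;  /* smallest code point allowed for this length */
    unsigned long cp;   /* decoded code point */

    if( lead < 0x80 ) {
      i++;              /* single-byte (ASCII) character */
      continue;
    } else if( ( lead & 0xE0 ) == 0xC0 ) {
      extra = 1; min = 0x80; cp = lead & 0x1F;
    } else if( ( lead & 0xF0 ) == 0xE0 ) {
      extra = 2; min = 0x800; cp = lead & 0x0F;
    } else if( ( lead & 0xF8 ) == 0xF0 ) {
      extra = 3; min = 0x10000; cp = lead & 0x07;
    } else {
      return false;     /* stray continuation byte or invalid lead byte */
    }

    /* i < length holds here, so this subtraction cannot underflow */
    if( extra > length - i - 1 ) {
      return false;     /* sequence is cut off by the end of the string */
    }

    for( size_t j = 1; j <= extra; j++ ) {
      unsigned char c = bytes[i + j];
      if( ( c & 0xC0 ) != 0x80 ) {
        return false;   /* not a continuation byte */
      }
      cp = ( cp << 6 ) | ( c & 0x3F );
    }

    if( cp < min ) {
      return false;     /* overlong encoding */
    }
    if( cp > 0x10FFFF ) {
      return false;     /* beyond the Unicode range */
    }
    if( cp >= 0xD800 && cp <= 0xDFFF ) {
      return false;     /* UTF-16 surrogates are not allowed in UTF-8 */
    }

    i += extra + 1;
  }

  return true;
}
```

The sketch decodes each sequence fully and then checks the resulting code point, which keeps the overlong, surrogate, and range checks in one place. The version you add to src/validate.c should follow the conventions of the other validation functions there.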

After this change, add calls to your new function wherever a param value is passed into the library. The functions that need a call are listed below.

  • stumpless_new_param (in src/param.c) This function is used by all others that create a new param, and so explicit checks are not needed in those.
  • stumpless_set_param_value (in src/param.c) This function is used by all others that modify a param value, and so explicit checks are not needed in those.
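To illustrate the shape of this change, here is a rough sketch in which a creation function and a modification function both call a single validation function before touching any state. Everything in it is a hypothetical stand-in: struct demo_param, demo_new_param, demo_set_param_value, and demo_validate_param_value are not the real stumpless types or functions, and the validator is only a placeholder that rejects bytes which can never appear in UTF-8.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in; the real struct stumpless_param is different. */
struct demo_param {
  char *value;
};

/*
 * Placeholder check: rejects only bytes that can never appear in UTF-8
 * (0xC0, 0xC1, and 0xF5 through 0xFF). A real check must implement
 * RFC 3629 in full.
 */
static bool
demo_validate_param_value( const char *value ) {
  const unsigned char *p;

  for( p = ( const unsigned char * ) value; *p != '\0'; p++ ) {
    if( *p == 0xC0 || *p == 0xC1 || *p >= 0xF5 ) {
      return false;
    }
  }

  return true;
}

/* Returns a heap-allocated copy of value, or NULL on allocation failure. */
static char *
demo_copy_value( const char *value ) {
  size_t size = strlen( value ) + 1;
  char *copy = malloc( size );

  if( copy ) {
    memcpy( copy, value, size );
  }

  return copy;
}

/* Creation entry point: validate before allocating anything. */
struct demo_param *
demo_new_param( const char *value ) {
  struct demo_param *param;

  if( !value || !demo_validate_param_value( value ) ) {
    return NULL;
  }

  param = malloc( sizeof( *param ) );
  if( !param ) {
    return NULL;
  }

  param->value = demo_copy_value( value );
  if( !param->value ) {
    free( param );
    return NULL;
  }

  return param;
}

/* Modification entry point: a rejected value leaves the param unchanged. */
struct demo_param *
demo_set_param_value( struct demo_param *param, const char *value ) {
  char *copy;

  if( !param || !value || !demo_validate_param_value( value ) ) {
    return NULL;
  }

  copy = demo_copy_value( value );
  if( !copy ) {
    return NULL;
  }

  free( param->value );
  param->value = copy;
  return param;
}
```

Validating before any allocation or assignment means a rejected value leaves the existing param untouched, which also makes the failure case easy to test.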

Update the documentation for both of these functions (creation and modification) to specify that the param value is restricted to UTF-8. The relevant documentation is in doxygen comments in the include/stumpless/param.h header file. Add a note about the restriction to the @param descriptions of the param value parameters.

Don't forget to add tests for your new functionality! These should go alongside the tests for the functions that have param values. Find the existing tests for these functions as starting points, and add new tests that make sure that param values with valid UTF-8 are accepted, but non-UTF-8 strings are rejected.

@goatshriek
I submitted a PR for this issue. Could I ask you to review the PR #326?