An Nginx module for rule-based request filtering, rate-limiting, and access control for many Nginx servers sharing rules and data. It's loosely inspired by iptables -- well, not syntactically, or schematically, or even conceptually. I guess iptables is more like a spiritual predecessor, and Wafflex is its own animal designed for doing stuff to HTTP requests rather than network packets.
Rules are simple, non-nesting, non-Turing-complete decisions to conditionally perform some actions on an HTTP request. Basically, one rule is one if-then-else or switch statement. Rules are organized in rule lists.
Rules can make use of limiters, which are pre-configured limits applied to named counters used for rate-limiting and flag checks.
A rule list is a sequence of rules that keeps executing rules until the end of the list, unless a rule executes a final action. Rule lists are stored in phase tables.
A phase table specifies lists of rule lists that are run at different phases in the HTTP request-response cycle (after headers, after request completion, after redirect, etc.). Each request phase can have one or more rule lists, which are executed in-order.
Finally, all of the above is stored in a rule set.
The rule set is the root-level structure that stores the phase table, named rule lists, named rules, and limits for limiters.
{ //rule set
"limits": {
"limiterName1": /*limiter*/,
"limiterName2": /*limiter*/,
/*...*/
},
"phases": /*phase table*/,
"lists": { //named lists
"listName1": /*rule list*/,
"listName2": /*rule list*/,
/*...*/
},
"rules": { //named rules
"ruleName1": /*rule*/,
"ruleName2": /*rule*/,
/*...*/
}
}
A rule set MUST have a phase table, but MAY omit named rules, limiters, and lists. Rule lists and named rules MUST be uniquely named and can be referenced by name in the phase table. Limiters MUST also be uniquely named.
The phase table keeps track of rule lists associated with HTTP request/response phases.
{
"connect": [ /*rule-list*/, /*rule-list*/, /*...*/ ],
"headers": [ /*rule-list*/, /*rule-list*/, /*...*/ ],
"request": [ /*rule-list*/, /*rule-list*/, /*...*/ ]
}
The possible HTTP request phases are:
connect
: After a TCP connection is established
tls-connect
: After TLS handshake is performed and TLS session is established
headers
: After a request's headers are processed
body-data
: After part (or all) of a request's body is read
request
: After the entire request is processed
proxy-response
: After an upstream response is received
response-headers
: After response headers are generated, but before they are sent to the client.
response-data
: After response headers and body are generated, but before they are sent to the client.
response
: After response headers and body are generated, but before they are sent to the client.
Note that initially, a subset of these phases will be implemented (`request`, `response`, and maybe a few more).
Rule lists can be defined inside the phase table, or they can be referenced by name.
An ordered list of rules.
//short-form:
[ /*rule*/, /*rule*/, /*...*/ ]
//long-form
{
"name": "my-rules", //optional globally unique list name
"rules": [ /*rule*/, /*rule*/, /*...*/ ]
}
Lists declared in the rule set's `lists` hash have their names set to their hash keys:
{ //rule set
"lists": {
"myList": [ /*rules*/ ]
},
/*...*/
}
//this creates a named list:
{ // rule list
"name": "myList",
"rules": [ /*rules*/ ]
}
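The key-to-name mapping above can be sketched as a small normalization step. This is only an illustration, not the module's implementation: `normalize_lists` is a hypothetical helper, and rejecting a long-form entry whose explicit name differs from its hash key is an assumption the text does not state.

```python
def normalize_lists(rule_set):
    """Return {name: long-form list} for every entry in the rule set's "lists" hash."""
    named = {}
    for name, entry in rule_set.get("lists", {}).items():
        if isinstance(entry, list):
            entry = {"rules": entry}      # short form: a bare array of rules
        else:
            entry = dict(entry)           # long form: copy so the input is untouched
        # Assumption: an explicit name that disagrees with the hash key is an error
        if entry.get("name", name) != name:
            raise ValueError(f"list {name!r} declares a different name: {entry['name']!r}")
        entry["name"] = name              # the hash key becomes the list name
        named[name] = entry
    return named
```

For example, `normalize_lists({"lists": {"myList": ["r1"]}})` yields a single long-form list named `myList`.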
All list names MUST be unique to the rule set.
Lists could also be unordered, meaning rule execution order is up to the module. Unordered lists can be heavily optimized through adaptive reordering and partial condition evaluation trees. (This is an advanced execution engine feature, and will not be part of the MVP.)
{
"rules": [ /*rule*/, /*rule*/, /*...*/ ],
"unordered": true
}
Rules can be defined inside rule lists, and named rules can be referenced by name:
{ //rule list
"rules": [ "ruleName1", "ruleName2", {/*rule definition*/} ]
}
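A rough sketch of how name references might be resolved while a phase runs, assuming a rule set shaped like the structures above. `run_phase` and the `run_rule` callback are hypothetical names; the callback stands in for rule execution and is assumed to return `True` when the rule executed a final action.

```python
def run_phase(rule_set, phase, run_rule):
    """Run every rule list of a phase in order; stop at the first final action."""
    for entry in rule_set["phases"].get(phase, []):
        if isinstance(entry, str):
            entry = rule_set["lists"][entry]      # rule list referenced by name
        # A list is either short-form (array of rules) or long-form ({"rules": [...]})
        rules = entry if isinstance(entry, list) else entry["rules"]
        for rule in rules:
            if isinstance(rule, str):
                rule = rule_set["rules"][rule]    # rule referenced by name
            if run_rule(rule):
                return True                       # final action: no more rules for this request
    return False
```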
Rules are defined in JSON.
//example rule
{
"name": "ratelimit",
"info": "This rule rate-limits by ip",
"if": {"#limit-break" : {
"name": "rate-limit",
"key": "$request_real_ip"
}},
"then": [
{"#tag": "slowdown"},
{"#reject": {"status": 403, "body": "Slow down!!!"}}
]
}
Rules have 3 forms: conditional (`if`, `if-any`, or `if-all`), `switch`, and `do`:
{
"if": /*condition*/,
"then": /*actions*/,
"else": /*actions*/ //optional
}
{
"if-any": [ /*condition*/, /*condition*/, /*...*/ ],
"then": /*actions*/,
"else": /*actions*/ //optional
}
{
"if-all": [ /*condition*/, /*condition*/, /*...*/ ],
"then": /*actions*/,
"else": /*actions*/ //optional
}
`if-any` executes until the first `true` condition (think of it as `if(condition1 || condition2 || ...)`).
`if-all` executes until the first `false` condition (think of it as `if(condition1 && condition2 && ...)`).
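The short-circuit behaviour of the conditional forms can be illustrated with a minimal sketch. `rule_matches` and the `evaluate` callback are hypothetical names; `evaluate` stands in for whatever evaluates a single condition.

```python
def rule_matches(rule, evaluate):
    """Decide whether a conditional rule's "then" branch should run."""
    if "if" in rule:
        return evaluate(rule["if"])
    if "if-any" in rule:
        # any() stops evaluating at the first true condition
        return any(evaluate(c) for c in rule["if-any"])
    if "if-all" in rule:
        # all() stops evaluating at the first false condition
        return all(evaluate(c) for c in rule["if-all"])
    raise ValueError("rule has no if, if-any, or if-all")
```

Because the conditions are passed as a generator, conditions after the deciding one are never evaluated, which matters for conditions with side effects such as `#limit-break`.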
{
"switch": [
[ /*condition*/, /*actions*/],
[ /*condition*/, /*actions*/],
/*...*/
]
}
{
"do": /*actions*/
}
Rules can also have additional properties, such as `key`, `name`, `track-stats`, and `log`, which will be discussed later.
Actions are things done to a request, like adding or removing a header, redirecting to an upstream server, rejecting a request with a 400 status code, or closing the connection. An action MAY end the processing of rules for the current request; this is called a final action.
/*actions*/ [ /*action*/, /*action*/, /*...*/ ] || /*action*/
/*action*/ "#action-name" || {"#action-name": /*params*/}
/*params*/ { /*...*/ } || [ /*...*/ ] || string || number
Arrays of actions are executed in-order until the end of the array, even in the presence of one or more final actions. Strings in action parameters are interpolated.
//examples
//array of actions:
[ {"#tag": "processed"}, "#accept" ]
[ "#reject", {"#tag": "rejected"} ] //will be tagged 'rejected' even though '#reject' is a final action.
//single action
"#reject"
{"#reject": 404}
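As a sketch of the execution rule above (every action in an array runs, even after a final one), assuming a hypothetical `perform` callback that applies one action and returns `True` if that action is final:

```python
def run_actions(actions, perform):
    """Run a single action or an array of actions; return True if any was final."""
    if not isinstance(actions, list):
        actions = [actions]          # a lone action is treated as a one-item array
    final = False
    for action in actions:
        # Keep going after a final action; finality only stops later rules
        if perform(action):
            final = True
    return final
```

The return value tells the caller whether to stop processing further rules for the request.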
Conditions are statements that evaluate to `true` or `false`. They have the same form as an action:
/*condition*/ "#condition-name" || {"#condition-name": /*arguments*/}
/*arguments*/ { /*...*/ } || [ /*...*/ ] || string || number
As with actions, strings in condition arguments are interpolated.
//single condition
"#true" //always true
{"#match": ["$realip_remote_addr", "127.0.0.1"]} // true if request client's ip is 127.0.0.1
Limiters are used to track a rate, and whether that rate exceeds some threshold. Limiters are defined outside of rules, rule lists, and phase tables.
//limiter syntax
{
"name": "limiter-name", //globally unique
"info": "optional limiter description",
"interval": /*time-interval*/,
"limit": number, //maximum value for that interval
"sync-steps": integer (default 4), //optional
"burst": "burst-limiter-name", //optional
"burst-expire": /*time-interval*/ //optional
}
/*time-interval*/ number (seconds) || nginx-time-range ("10s" /*10 seconds*/ || "1h" /*1 hour*/ || "5d" /*5 days*/ etc)
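A minimal sketch of converting a time-interval to seconds. `interval_seconds` is a hypothetical helper, and it deliberately handles only a simplified subset: a plain number, or a single integer followed by one of `s`, `m`, `h`, or `d`. Nginx itself accepts more units and combined values, which are not covered here.

```python
import re

# Seconds per unit; only a subset of the suffixes Nginx accepts
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def interval_seconds(value):
    """Convert a time-interval (seconds as a number, or a string like "5d") to seconds."""
    if isinstance(value, (int, float)):
        return float(value)
    m = re.fullmatch(r"(\d+)([smhd])", value.strip())
    if m is None:
        raise ValueError(f"unsupported time interval: {value!r}")
    return float(int(m.group(1)) * _UNITS[m.group(2)])
```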
All limiters must have names unique to the rule set.
Limiters are used to perform rate limit checks on numeric counters associated with a key specified in a rule. This counter is incremented during the execution of rules, and decreases linearly at a rate of `limit`/`interval` until it reaches 0. The counter values are shared and synchronized between Nginx servers.
A limiter is checked and the corresponding counter auto-incremented with the `#limit-break` condition.
{ //limit
"name": "request-rate",
"interval": 60, //seconds
"limit": 100
}
{ //rule
"if": {"#limit-break": {
"name": "request-rate", //limiter name
"key": "$request_real_ip", //key for current limiter rate value
//increment: 1 //increment current limit counter value by 1 by default
}},
"then": "#reject"
}
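The linear decay of a counter can be sketched for a single process, ignoring synchronization between servers. `DecayingCounter` is a hypothetical class, timestamps are passed in explicitly as seconds, and treating "broken" as strictly greater than the limit is an assumption.

```python
class DecayingCounter:
    """Counter that drains linearly at limit/interval units per second."""

    def __init__(self, limit, interval):
        self.limit = limit
        self.rate = limit / interval   # units drained per second
        self.value = 0.0
        self.updated = 0.0             # time of the last update, in seconds

    def _drain(self, now):
        elapsed = now - self.updated
        self.value = max(0.0, self.value - elapsed * self.rate)  # never below 0
        self.updated = now

    def limit_break(self, now, increment=1):
        """Drain, add `increment`, and report whether the limit is exceeded."""
        self._drain(now)
        self.value += increment
        return self.value > self.limit   # assumption: "broken" means strictly above
```

With `limit=100` and `interval=60`, a counter sitting at 100 drains back to 0 after 60 seconds without further requests.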
Each limiter has a limit and a counter, which is stored for that limiter at a specified key. Differently named limiters keep their own separate counters, even at the same key:
{ //limit
"name": "request-rate",
"interval": 60, //seconds
"limit": 100
}
{ //another limit
"name": "weekly-allowance",
"interval": "7d", //7 days
"limit": 1000000 // a million requests a week
}
{ //rule
"if-any": [
{"#limit-break": {
"name": "request-rate", //limiter name
"key": "$request_real_ip" //key for current limiter rate value
}},
{"#limit-break": {
"name": "weekly-allowance",
"key": "$request_real_ip" // same key as above, but tracking a different limiter's counter
}}
],
"then": "#reject"
}
Because it is common to use the same key for multiple limiter counters, the default key can be specified at the rule level:
{ //limit
"name": "request-rate",
"interval": 60, //seconds
"limit": 100
}
{ //another limit
"name": "weekly-allowance",
"interval": "7d", //7 days
"limit": 1000000 // a million requests a week
}
{ //rule
"key": "$request_real_ip", //default key for limiter counters
"if-any": [
{"#limit-break": "request-rate"}, //uses default rule-level key
{"#limit-break": "weekly-allowance"} //same as above
],
"then": "#reject"
}
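The fallback to the rule-level key can be sketched as an argument-expansion step. `limiter_args` is a hypothetical helper, and raising an error when no key is available at either level is an assumption.

```python
def limiter_args(rule, condition_args):
    """Expand #limit-break arguments into a full {name, key, increment} dict."""
    if isinstance(condition_args, str):
        condition_args = {"name": condition_args}     # short form is just the limiter name
    key = condition_args.get("key", rule.get("key"))  # fall back to the rule-level key
    if key is None:
        raise ValueError("no key given and the rule has no default key")
    return {
        "name": condition_args["name"],
        "key": key,
        "increment": condition_args.get("increment", 1),  # 1 is the documented default
    }
```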
By default, the `#limit-break` condition auto-increments the numeric value stored at the key specified by `key`. To test a limiter without incrementing the counter, use `#limit-check`, or explicitly set the `increment` argument for `#limit-break` to 0. A limiter counter can also be incremented as an action with `#limit-increment`.
Two limiters can be combined to create a limiter with a burst rate:
{ //limit
"name": "burstable-rate-limit",
"interval": 60, //seconds
"limit": 100,
"burst": "burst-limit", //name of the burst limiter
"burst-expire": "1h"
}
{ //burst rate
"name": "burst-limit",
"interval": "1h",
"limit": 100000 // burst at 100K/hour
}
{ // rule using the bursty limit
"key": "$request_real_ip", //default key, needed by the short-form condition
"if": {"#limit-break": "burstable-rate-limit"},
"then": "#reject"
}
Here, `burstable-rate-limit` has a burst rate defined by `burst-limit`. This means `burstable-rate-limit` starts rate-limiting only after `burst-limit` has been exceeded. In the above case, `burstable-rate-limit` (100/minute) is active only after `burst-limit` (100K/hour) is reached. `burstable-rate-limit` remains in effect as long as `burst-limit` is exceeded, or for `burst-expire` amount of time after `burst-limit` is first exceeded, whichever is longer.
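One possible reading of the burst behaviour, as a sketch only. `burstable_limit_enforced` is a hypothetical function; it assumes the caller already knows whether the burst limiter is currently exceeded and when it was first exceeded (in seconds).

```python
def burstable_limit_enforced(now, burst_exceeded, burst_first_exceeded_at, burst_expire):
    """Return True while the base limit should be enforced.

    burst_first_exceeded_at is None if the burst limit has not been exceeded yet.
    """
    if burst_exceeded:
        return True                   # burst limit is currently exceeded
    if burst_first_exceeded_at is None:
        return False                  # still inside the burst allowance
    # Otherwise stay enforced until burst-expire has elapsed since the first breach
    return now - burst_first_exceeded_at < burst_expire
```

The two `True` branches together give the "whichever is longer" behaviour: enforcement lasts while the burst limit is exceeded, and at least until `burst-expire` has passed.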
Limiter counters can be shared between Nginx instances. This allows for distributed rate tracking among a collection of servers. By default, counters are shared every time the value increases by 25% of the `limit`.
The `sync-steps` value controls how often counters are shared between Nginx instances. Between 0 and `limit`, the counter value will be shared `sync-steps` times. Put another way, the counter is shared every time its value increases by `limit`/`sync-steps`. Some examples:
sync-steps: 0
: counter value is never shared with other Nginx instances.
sync-steps: 1
: counter value is shared when the limit is reached, doubled, tripled, etc.
sync-steps: 4
: counter value is shared every time it increases by `limit`/4, i.e. every 25% of the `limit`.
sync-steps: 100
: counter value is shared every time it increases by 1% of `limit`, or 100 times between 0 and `limit`, between `limit` and 2 * `limit`, and so on.
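The sharing thresholds can be sketched as a check on whether an increment crossed a multiple of `limit`/`sync-steps`. `should_share` is a hypothetical function, and interpreting "increases by" as crossing fixed multiples (rather than distance since the last share) is an assumption.

```python
def should_share(previous, current, limit, sync_steps):
    """Return True if the counter crossed a sharing threshold going from previous to current."""
    if sync_steps <= 0:
        return False                  # sync-steps: 0 means never share
    step = limit / sync_steps         # share every limit/sync-steps of increase
    return int(current // step) > int(previous // step)
```

With `limit=100` and `sync_steps=4`, the counter is shared when it reaches 25, 50, 75, 100, 125, and so on.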
Limiters with a `limit` of 1 can behave like shared boolean flags with an expiration time.
{//limit
"name": "ip-ban",
"limit": 1,
"interval": "1d"
//ban flag that stays active for 1 day
}
{ //ban the ip when the request has a Ban-Me: 1 header
"if": {"#match": ["$http_ban_me", "1"]},
"then": {"#flag": {"name": "ip-ban", "key": "$request_real_ip"}} //flag this guy as banned
}
{ //check if banned
"if": {"#flag-check": {"name": "ip-ban", "key": "$request_real_ip"}},
"then": "#reject" // you're banned, dude. go home.
}
`#flag-check` is the same as `#limit-check`, and `#flag` is the same as `#limit-increment`.
Strings passed to conditions and actions are automatically interpolated using the Nginx string interpolation syntax, with Nginx request variables.
"static string"
"$http_host" //evaluates to the value of the Host header
"foo/bar/${request_real_ip}:$url" //compound interpolated string
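To make the two variable forms in the examples above concrete, here is a small sketch of the substitution. It is not how Nginx implements interpolation: `interpolate` is a hypothetical function, variables come from a plain dict, and replacing unknown variables with an empty string is an assumption.

```python
import re

# Matches ${name} (group 1) or $name (group 2)
_VAR = re.compile(r"\$(?:\{(\w+)\}|(\w+))")

def interpolate(template, variables):
    """Replace $name and ${name} with values from `variables`; unknown names become ""."""
    def substitute(match):
        name = match.group(1) or match.group(2)
        return str(variables.get(name, ""))
    return _VAR.sub(substitute, template)
```

The braced form is what lets a variable sit directly next to other text, as in `${request_real_ip}:`.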
There's some room for heavy optimization in reusing interpolated strings, especially when used as keys, or if they are going to be hashed. Optimizations may not be present in MVP.
`#match` matches strings for equality.
{"#match": [ "string1", "string2", /*...*/ ]} //must all be the same to be true
`#match-regex` matches a string against a regular expression.
{"#match-regex": [ "string", "/regular_expression.*/"]}
Note that if the regular expression has any interpolation variables in it, it cannot be optimized and will need to be recompiled on every request.
`#limit-break` increments a limiter counter and checks whether the limit has been broken (exceeded).
{"#limit-break": {
"name": "limit-name",
"key": "limiter-counter-key",
"increment": 1 //default value
}}
If `key` is absent, the rule's default `key` is used.
`#limit-check` is the same as `#limit-break`, but with `increment: 0`.
`#flag-check` is the same as `#limit-check`. Useful for limiters with `limit: 1` that are being used as flags.
`#tag-check` checks if a tag has been set for the request by the `#tag` action.
{"#tag-check": "tag-name"}
`#match-subrequest` performs a non-blocking subrequest to an Nginx location, and compares the response to a set of matching conditions.
(NOT included for MVP)
{"#match-subrequest": [/*subrequest*/, /*match_conditions*/]}
/*subrequest*/
"/subrequest_path" //HTTP GET subrequest to this path
{
"path": "/subrequest_path",
"method": "GET", // default
"forward-headers": true, //default: use headers from incoming request
"set-headers": {"Header-Name": "$http_header_name", "Other-Header-Name": "foobar"}
}
/*match_conditions*/
"ok" // response code 200-299
number // match response code number
{
"code": number || [ code1, code2, code3 ],
"headers": {
"Header-Name": "match-Header-Value"
},
"body": "match-body-value"
}
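A sketch of how the three match-condition forms might be checked against a response. Since this condition is not part of the MVP, everything here is illustrative: `response_matches` is a hypothetical function, the response is a plain dict with `code`, `headers`, and `body`, and exact string equality for headers and body is an assumption.

```python
def response_matches(response, cond):
    """Check a subrequest response dict against "ok", a status code, or an object."""
    if cond == "ok":
        return 200 <= response["code"] <= 299
    if isinstance(cond, int):
        return response["code"] == cond
    codes = cond.get("code")
    if codes is not None:
        if isinstance(codes, int):
            codes = [codes]               # a single code behaves like a one-item list
        if response["code"] not in codes:
            return False
    for name, expected in cond.get("headers", {}).items():
        if response.get("headers", {}).get(name) != expected:
            return False
    if "body" in cond and response.get("body") != cond["body"]:
        return False
    return True
```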
`#true` is always true. Useful as a default switch condition.
`#false` is always false. Might be useful when writing rules.
`#reject` rejects a request. This is a final action, after which no more rules will be processed for this request.
//short-form
"#reject"
//long-form
{"#reject": {
"status": number, //HTTP response status code, default 403
"body": "response_body" //HTTP response body, optional
}}
`#accept` accepts a request. This is a final action, after which no more rules will be processed for this request.
//short form
"#accept"
`#proxy-pass` passes the request to an upstream proxy. See the Nginx `proxy_pass` directive.
This is a final action, after which no more rules will be processed for this request.
{"#proxy-pass": "http://hostname-or-upstream-name/path"}
`#proxy-set-header` sets proxy headers. See the Nginx `proxy_set_header` directive.
//set one or many headers
{"#proxy-set-header": {
"Proxy-Header-Name":"$http_header_name",
"X-Real-IP": "$request_real_ip"
}}
`#limit-increment` increments a limiter counter.
//short form (if rule-wide default key is provided)
{"#limit-increment": "limiter-name"}
//long-form
{"#limit-increment": {
"name": "limiter-name",
"key": "limit-counter-key", //can be omitted if rule-level key given
"increment": 1 //default
}}
`#flag` increments a flag (a limiter with `limit: 1`) by 1. Same as `#limit-increment`.
//short form (if rule-wide default key is provided)
{"#flag": "flag-name"}
//long-form
{"#flag": {
"name": "flag-name",
"key": "flag-key", //can be omitted if rule-level key given
"increment": 1 //default
}}
`#limit-reset` sets a limiter counter back to 0.
//short form (if rule-wide default key is provided)
{"#limit-reset": "limit-name"}
//long-form
{"#limit-reset": {
"name": "limit-name",
"key": "limit-counter-key" //can be omitted if rule-level key given
}}
Set flag back to 0. Same as #limit-reset
`#tag` sets a string tag for the request, which adds a `RoF-Tag-<tagname>: 1` header to the request.
{"#tag": "tag-name"}
`#tag-reset` deletes the given string tag for the request, which also removes the `RoF-Tag-<tagname>: 1` header from the request. If the tag is already absent, it does nothing.
{"#tag-reset": "tag-name"}
`#subrequest` performs a non-blocking subrequest to an Nginx location.
(NOT included for MVP)
{"#subrequest": "/subrequest_path"}
{"#subrequest": {
"path": "/subrequest_path",
"forward-headers": true, //default: use headers from incoming request
"set-headers": {"Header-Name": "$http_header_name", "Other-Header-Name": "foobar"}
}
}
Rule sets can be loaded from Redis, a file, or Consul on Nginx startup. (Consul may be omitted for MVP). During runtime, rule sets, lists, limits, and counters are stored on a Redis server or cluster. (Cluster support may be omitted for MVP). Rule sets, lists, limits, and rules can be created, updated, and deleted at runtime via a command-line tool which runs updating commands on the Redis server. In turn, Redis notifies all the Nginx servers, and data is updated as-needed.
Aggregate rule execution information will be available for logging through variables available upon request completion:
$rof_request_time
: time to run all rules
$rof_request_rules
: number of rules processed
$rof_request_rules_percent
: percent of rules processed
$rof_request_final_rule
: name/id of rule that either accepted or rejected the request (empty if neither)
$rof_request_final_list
: name/id of list that contained the final rule (empty if none)
$rof_request_final_phase
: name of phase in the phase table
Individual rule statistics are not collected by default, but can be enabled on specific rules with the `track-stats` property:
{
"if": /*condition*/,
"then": /*action*/,
"track-stats": true
}
With `track-stats` enabled, runtime statistics for the rule will be collected, including totals for the number of times it was executed and the number of times a request was rejected via the rule. This data can be accessed via an Nginx location akin to the `stub_status` directive.
Logging of a specific rule's processing sequence can be enabled with the `log` property:
{
"if": /*condition*/,
"then": /*action*/,
"log": true
}
This will log each step of the rule's execution to a separate Wafflex log using the standard Nginx error log format, with the text being presented in a human-readable format.
Additionally, the logging of requests matching some pattern may be enabled in a manner to be determined later. (Not part of MVP)