VMOD file

Varnish Module for reading files that may be updated at intervals

Manual section

3

SYNOPSIS

import file;

# File reader object
new <obj> = file.reader(STRING name [, STRING path] [, DURATION ttl])
STRING <obj>.get()
VOID <obj>.synth()
BLOB <obj>.blob()
BOOL <obj>.error()
STRING <obj>.errmsg()
BOOL <obj>.deleted()
BYTES <obj>.size()
TIME <obj>.mtime()
DURATION <obj>.next_check()

# VMOD version
STRING file.version()

DESCRIPTION

VMOD file is a Varnish module for reading the contents of a file and caching its contents, returning the contents for use in the Varnish Configuration Language (VCL), and checking if the file has changed after specified time intervals elapse.

VMOD std, provided with the Varnish distribution, has the function std.fileread()_, which reads the contents of a file exactly once on the first invocation, caches the contents, and returns the cached contents on every subsequent invocation. The cache is static, so the cached file contents do not change when VCL is reloaded, or at any other time until Varnish stops. This minimizes file I/O during request/response transactions, which is incurred only on the first invocation. But it means that changed file contents do not become available in VCL with std.fileread()_ unless Varnish is re-started.

VMOD file provides a reader object, which caches file contents during the invocation of vcl_init (hence at VCL load time). The object is provided by default with a time interval, for which the concept "TTL" (time to live) is re-applied. When a TTL is set, the file is periodically checked for changes in a background thread, every time the TTL elapses. If the file has changed, then the file cache is reloaded with its new contents, which then become available for subsequent accesses via the reader object in VCL:

import file;

sub vcl_init {
  # Cache the contents of a file, and check it for changes
  # every 30 seconds.
  new rdr = file.reader("/path/to/myfile", ttl=30s);
}

sub vcl_recv {
  # Read the cached contents of the file. If the file was found
  # to have changed, then the new contents are returned by .get().
  set req.http.Myfile = rdr.get();
}

The content cache takes the form of a memory-mapping of the file, see mmap(2). This has some consequences that are discussed in the sections File deletion and file updates and LIMITATIONS below.

Since the update checks run in the background, the file I/O that the checks require is not incurred during any request/response transaction.

When a VCL instance transitions to the cold state, the file update checks for any of the instance's objects are suspended (see vcl.state_ in varnish-cli(7)). When it transitions back to the warm state (which also happens during an invocation of vcl.use_ if the VCL had previously been cold), then the files are immediately checked for changes, updating the cached contents if necessary, and the update checks in the background resume at the TTL interval.

File deletion and file updates

POSIX mandates that mmap(2) adds a reference for the file, which is not removed until the file is unmapped. In particular, it is not removed when the file is deleted -- the mapping continues to access the file's contents, even after deletion. (In that case, the file is not physically removed, but is no longer accessible by name in the filesystem.)

For this reason, if an update check finds that the file has been deleted, it is not considered an error, provided that the file has already been mapped. (It is an error if the file does not exist at initialization.) The file is considered unchanged, and the cached contents remain valid, at least until the next check. The .deleted()_ method of the reader object can be used in VCL to detect this situation.

POSIX leaves unspecified whether changes in the underlying file immediately become visible in the memory mapping. On a system like Linux, changes are immediately visible, and hence will be reflected immediately be the VMOD. While this may seem ideal for getting fast updates in VCL, it is in fact problematic:

  • File writes are not atomic; so the VMOD may return partial and inconsistent contents for the file.
  • If the changed file is longer than the originally mapped file, the portion that is longer than the original file is not mapped. Contents returned by the VMOD will appear truncated.

For these reasons, this is a reliable method to update a file:

  • Delete the file
  • Write the new contents to a new file of the same name (same path location)

This is the only method for updating files that the VMOD supports.

After the deletion step, the previously cached contents remain valid. When the next update check detects the change performed by the second step, the new contents are mapped, and become available in their correct form via the VMOD.

Other means of updating the file might "happen" to work, some of the time. But if not, it is not considered a bug of the VMOD. The VMOD works as designed only if the two-step procedure for updating files is followed.

new xreader = file.reader(STRING name, STRING path, DURATION ttl, BOOL log_checks)

new xreader = file.reader(
   STRING name,
   STRING path="/usr/local/etc/varnish:/usr/local/share/varnish/vcl:/etc/varnish:/usr/share/varnish/vcl",
   DURATION ttl=120,
   BOOL log_checks=0
)

Create an object to read and cache the contents of the file named name, and optionally check the file for changes at the interval ttl. name MAY NOT be the empty string. If ttl is set to 0s, then no periodic checks are performed. ttl MAY NOT be < 0s. By default, ttl is 120 seconds.

If name denotes an absolute path (beginning with /), then the file at that path is read. Otherwise, the file is searched for in the directories given in the colon-separated string path. The file MUST fulfill the following conditions:

  • The file MUST be accessible to the owner of the Varnish child process.
  • The process owner MUST have read permissions on the file.
  • The file MUST be a regular file, or a symbolic link pointing to a regular file.

If any of these are not true of name, or if no such file is found on the path, then the VCL load fails with an error message.

The default value of path combines the default values of the varnishd parameter vcl_path_ for development builds (installed in /usr/local) and production deployments (installed in /usr), with the development directories first. path MAY NOT be the empty string.

If there is an error finding or reading the file, then the VCL load fails with a message describing the error. If the read succeeds, then the file contents are cached, and are available via the reader object's methods.

If initialization succeeds and ttl > 0s, then update checks begin at that interval. A file is considered to have changed if any of its stat(2) fields mtime, dev or ino change. As discussed above, the file is considered unchanged if the update check finds the the file has been deleted, provided that it has already been mapped; then the previously cached contents continue to be valid. If the file has changed when a check is performed, it is re-read and the new contents are cached, for access via the object's methods.

If an error is encountered when a check attempts to re-read the file, then subsequent method calls attempting to access the contents invoke VCL failure (see ERRORS below), with the VCL_Error message in the Varnish log describing the error.

Checks continue at the ttl interval, regardless of any error. If the next update check after an error succeeds (because the problem has been fixed in the meantime), then the new contents are cached, and object methods can access the contents successfully.

If log_checks is true (default false), then the activity of update checks is logged in the Varnish log using the tag Debug (see vsl(7)). By default, Debug logs are filtered from the Varnish log; to see them, add Debug to the varnishd parameter vsl_mask_, for example by invoking varnishd with -p vsl_mask=+Debug. Since update checks do not happen during any request/response transaction, they are logged with pseudo-XID 0, and are only visible when the log is read with raw grouping, for example by invoking varnishlog(1) with -g raw.

Regardless of the value of log_checks, errors encountered during update checks are logged with the tag Error, also with XID 0 (and hence visible in raw grouping). A message is always written to the log with the Debug tag (using XID 0) if an update check finds that the file has been deleted, but is already mapped (and hence is considered unchanged).

Examples:

sub vcl_init {
  # A reader for the file at the absolute path, using default
  # ttl=120s.
  new foo = file.reader("/path/to/foo");

  # A reader for the file on the default search path, with
  # update checks every five minutes.
  new synth_body = file.reader("synth_body.html", ttl=300s);

  # A reader for the file on the given search path, with
  # default TTL, and logging for update checks.
  new bar = file.reader("bar", path="/var/run/d1:/var/run/d2",
                         log_checks=true);

  # A reader for the file with no update checks.
  new baz = file.reader("baz", ttl=0s);
}

STRING xreader.get()

Return the contents of the file specified in the constructor, as currently cached. If the most recent update check encountered an error, then VCL failure is invoked (see ERRORS).

Example:

sub vcl_deliver {
  set resp.http.Foo = foo.get();
}

Take care if you use .get() to set a header, as in the example, that the file contents do not end in a newline. If so, then the newline appears after the header contents, resulting in an empty line after the header. Since an empty line separates the headers from the body in an HTTP message, this is very likely to result in an invalid message.

VOID xreader.synth()

Generate a synthetic response body from the file contents. This method may only be called in vcl_synth or vcl_backend_error. Invokes VCL failure if the most recent update check encountered an error, or if invoked in any other VCL subroutine besides the two that are permitted.

Example:

sub vcl_synth {
  synth_body.synth();
}

sub vcl_backend_error {
  synth_body.synth();
}

BLOB xreader.blob()

Return the file's contents as a BLOB. Invokes VCL failure if the most recent update check encountered an error.

Example:

import blob;

# Set the backend response body to the hex-encoded contents of
# the file. Also works for resp.body in vcl_synth.
sub vcl_backend_error {
  set beresp.body = blob.encode(HEX, blob=synth_body.blob());
}

BOOL xreader.error()

Return true if and only if an error condition was determined the last time the file was checked. This is a way to avoid VCL failure in error conditions.

Example:

if (rdr.error()) {
  call do_file_error_handling;
}

STRING xreader.errmsg()

Return the error message for any error condition determined the last time the file was checked, or a message indicating that there was no error.

Example:

import std;

if (rdr.error()) {
  std.log("rdr error: " + rdr.errmsg());
  call do_file_error_handling;
}

BOOL xreader.deleted()

Return true if and only if the file was found to have been deleted the last time the file was checked.

As discussed in File deletion and file updates above, this is not an error condition, if the file had been previously mapped. Then the previously cached contents continue to be valid.

Example:

import std;

if (rdr.deleted()) {
  std.log("file deleted, continuing with the current cached contents");
}

BYTES xreader.size()

Return the size of the file as currently cached. Invokes VCL failure if the most recent update check encountered an error.

Example:

# Use the cached synth body if non-empty, otherwise use the standard
# Varnish Guru Meditation.
if (synth_body.size() > 0B) {
  synth_body.synth();
}

TIME xreader.mtime()

Return the modification time of the file determined when it was mostly recently checked. Invokes VCL failure if the most recent update check encountered an error.

Example:

# A VCL TIME is converted to a string as an HTTP date, and hence is
# suitable for the Last-Modified header.
set resp.http.Last-Modified = rdr.mtime();

DURATION xreader.next_check()

Return the time remaining until the next check will be performed.

Example:

import std;

# Set the downstream caching TTL to the time remaining until the
# next update check.
set resp.http.Cache-Control = "public, max-age="
  + std.integer(duration=rdr.next_check());

STRING version()

Return the version string for this VMOD.

Example:

std.log("Using VMOD file version: " + file.version());

ERRORS

Methods that access a file's cached contents invoke VCL failure if there was an error during the most recent update check, just as if return(fail) had been invoked in VCL. This means that:

  • If the error occurs during vcl_init (on the initial read of the file), then the VCL load fails with an error message.
  • If the error occurs during any other subroutine besides vcl_synth, then a VCL_Error message describing the problem is written to the log, and control is immediately directed to vcl_synth. In vcl_synth, the response status (resp.status) is set to 503, and the reason string (resp.reason) is set to "VCL failed".
  • If the error happens during vcl_synth, then the VCL_Error message is written, vcl_synth is aborted. The response line "503 VCL failed" is set, but the client may just see connection reset.

The reader.error()_ may be used to detect errors, for example to implement different error handling in VCL.

Errors that may be encountered on the initial read or update checks include:

  • The file cannot be opened for read. This is what will happen for typical file errors: the Varnish process cannot access the file, or the process owner does not have read permissions.
  • The file does not exist at initialization time. As discussed above, this is not an error for an update check, if the file has already been mapped.
  • The file is neither a regular file nor a symbolic link that points to a regular file.
  • Any of the internal calls to map the file fail.

REQUIREMENTS

The VMOD currently requires the Varnish master branch, and is compatible with Varnish version 6.3.0.

INSTALLATION

See INSTALL.rst in the source repository.

LIMITATIONS

Cached file contents (mapped with mmap(2)) consume virtual memory space. This can become a burden if large files are cached, and/or if they are cached by many VMOD objects in many VCL instances.

File caches are unmapped, and timers are deleted, when the VMOD's reader objects are finalized. This happens when the vcl.discard_ command is used to unload VCL instances. While it is not uncommon for Varnish admins to neglect vcl.discard, it can become a resource leak if too many obsolete VCL instances that use VMOD file are allowed to accumulate. Consider implementing a housekeeping procedure to clean up old VCLs.

If the file unmappings and timer deletions fail during object finalization, error messages are written to the Varnish log using the tag Error (visible with raw grouping). While these errors are unlikely, if they do happen, they may be indications of resource leaks. Consider monitoring the log for such errors.

Log messages from the VMOD begin with the prefix vmod file. A VSL query can be used to craft a varnishlog(1) invocation that filters out the VMOD's messages:

varnishlog -g raw -q 'Debug ~ "^vmod file" or Error ~ "^vmod file"'

It is platform-dependent whether file I/O is incurred during the first request/response transactions that read file contents, or whether at least some of the I/O work is done at initialization, and after file contents are newly mapped by an update check. The VMOD provides a hint that the mapped file contents may be used imminently (using posix_madvise(3) with WILLNEED); the kernel may respond by reading ahead in the file mapping. But that decision is left to the kernel.

SEE ALSO

Copyright (c) 2019 UPLEX Nils Goroll Systemoptimierung
All rights reserved

Author: Geoffrey Simmons <geoffrey.simmons@uplex.de>

See LICENSE