/unix-in-lisp

Mount Unix system into Common Lisp image

Primary LanguageCommon Lisp

Unix in Lisp - mount Unix into Common Lisp image

Unix in Lisp is currently a usable Lisp (not POSIX!) shell for Unix. The distinctive feature of Unix in Lisp is that rather than creating a sub-language for Unix operations, Unix concepts are directly/shallowly embedded into Lisp (Unix commands become Lisp macros, Unix file become Lisp variables, Unix streams become lazy Lisp sequences, etc).

The fact that Unix in Lisp is Lisp, rather than an interpreter written in Lisp, makes it possible to leverage existing tools for Lisp. One instance is the Unix in SLIME listener, which inherits completion, interactive debugger and multiple listeners etc from SLIME itself. Another instance is that existing CL library such as sequence helper functions from serapeum works out of the box on Unix in Lisp process streams.

Table of Contents

Quick Start

Currently, only SBCL is supported. Clone this repository into your ~/quicklisp/local-projects/. Unix in Lisp is currently at alpha stage and will receive frequent changes. It’s recommended to use Ultralisp to install dependencies, to make sure various bug fixes to the upstream are available which Unix in Lisp relies on. If you haven’t done so, (ql-dist:install-dist "http://dist.ultralisp.org/" :prompt nil). Before first-time use, run (ql:update-dist "ultralisp") and (ql:quickload "unix-in-lisp") to install all dependencies. It’s also advised to (ql:update-dist "ultralisp") and git pull this repo regularly to get updates and bug fixes.

It’s recommended to load unix-in-slime.el for better SLIME integration. To load it, evaluate (require 'unix-in-slime "~/quicklisp/local-projects/unix-in-lisp/unix-in-slime") in emacs. You may want to add this line to your init.el. Then M-x unix-in-sime to start a listener, and have fun!

unix-in-slime installs hacks to the host Lisp environment by calling (unix-in-lisp:install) on startup. To undo hacks done to the host environment and unmount Unix FS packages, run (unix-in-lisp:uninstall).

Examples

(Some print-outs are omitted)

Counting number of files

/Users/kchan> (cd quicklisp/local-projects/unix-in-lisp)
/Users/kchan/quicklisp/local-projects/unix-in-lisp> (pipe (wc -l) (ls))
       9

But why not the Lisp way as well!

/Users/kchan/quicklisp/local-projects/unix-in-lisp> (length (package-exports ./))
9

For more examples, see TUTORIAL.org

Documentation

File System Mapping

Directories are mapped as Unix FS packages. A Unix FS packages is any Common Lisp package whose package name designate an absolute path name (usually when it starts with a slash).

The exported symbols of a Unix FS package should one-to-one correspond to files in the mapped directory. Exceptions to this one-to-one correspondence:

  • Because of the limit of file system change tracking, the package structure in the Common Lisp image may diverge from the Unix FS state.
    • Currently, the state of a Unix FS package is synchronized when calling mount-directory. By default, remount-current-directory is added to *post-command-hook*, which does the obvious thing.

Each of these exported symbols has a global symbol macro binding, so that they can be read/write like Lisp variables. Access to the symbol gives the list of lines of the underlying file, and setting it with a list designator of lines cause them to be written to the file.

unix-user /Users/kchan/> .bashrc
("export CC=\"clang\"" "export PS1='$(hostname):$(pwd) $(whoami)\\$ '" ...)

Note that there are no corresponding symbol for a non-existent file. To write or create a file that you are not sure whether it already exists, it’s recommended to use defile macro, which will ensure the file exists and creates the corresponding symbol.

unix-user /Users/kchan/> (defile iota.txt (iota 10))
/Users/kchan/iota.txt
unix-user /Users/kchan/> iota.txt
("0" "1" "2" "3" "4" "5" "6" "7" "8" "9")

In the above example, if iota.txt does not exist and I use setq instead of defile, an internal symbol named IOTA.TXT will be created in UNIX-USER package instead and I will write to its value cell, rather than /Users/kchan/iota.txt on the file system.

Command and Process Mapping

Unix in Lisp manages jobs in the unit of Effective processes. Theses include regular Unix processes represented by simple-process, and pipeline’s which are consisted of any number of UNIX processes and Lisp function stages.

Simple commands

When Unix in Lisp maps a directory, files are checked for execution permission and executable ones are mapped as Common Lisp macros. These macros implicitly quasiquotes their arguments. The arguments are converted to strings using literal-to-string, then passed to the corresponding executable.

Examples of using macros mapped from Unix commands

/Users/kchan/some-documents> (cat ,@(ls))
;; This cats together all files under current directory.

You can also set up redirections (and maybe other process creation settings in the future) via supplying keyword arguments. These arguments are not implicitly quasiquoted and are evaluated.

/Users/kchan/some-documents> (ls :output *terminal-io*)
;; This outputs to *terminal-io*, which usually goes into *inferior-lisp* buffer.
/Users/kchan/some-documents> (ls :error :output)
;; This redirect stderr of ls command to its stdout, like 2>&1 in posix shell

Like you have discovered in (cat ,@(ls)), effective processes can be used like Lisp sequences – they designate the sequence of their output lines.

Pipeline

Pipelines are created via the pipe macro:

/Users/kchan/quicklisp/local-projects/unix-in-lisp> (pipe (wc -l) (ls))
       9

Under the hood, except the first stage, each stage of the pipeline is passed :input <result-of-previous-pipeline-stage> as an additional argument. Alternatively, if there are arguments _, they are substituted with the result of the previous stage. You can mix Lisp functions and values with Unix commands. Using Lisp value as the first input stage is easy enough:

/Users/kchan> (pipe (iota 10) (wc))
      10      10      20

The _ extension make it easy to add Lisp functions to the mix:

/Users/kchan> (pipe (ls) (filter (lambda (s) (> (length s) 10)) _) (wc -l))
      47

The above counts the number of file with filename longer than 10 under my home directory.

Interactive Use

Inside a unix-in-slime listener, if the primary value of evaluation is an effective process and it has avaliable input/output streams, unix-in-slime automatically “connect” it to the listener, i.e. I/O of the listener is redirected to the process, similar to foreground processes in POSIX shell:

/Users/kchan> (python3 -i)
Python 3.8.9 (default, Apr 13 2022, 08:48:07)
[Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> print("Hello world!")
Hello world!
>>> ; No values
/Users/kchan>

Attention: use C-u RET to signal EOF in unix-in-slime, similar to Ctrl+D in POSIX shells. You can interrupt evaluation via C-c C-c like usual, after which you will be provided a few restarts:

  1. BACKGROUND puts the job in background (accessible via unix-in-lisp:*jobs*)
  2. ABORT terminates the current job (via SIGTERM for Unix processes)

Attention: You have to use -i flag to start Python REPL, because Unix in Lisp currently talk to all processes using pipe rather than pseudo tty. Without -i, Python will start itself into non-interactive mode. Other REPLs may need respective flags.

When using Unix in Lisp outside unix-in-slime, use (unix-in-lisp:repl-connect <process>) to achieve the same thing.

unix-in-lisp:*jobs* keeps a list of running effective processes:

unix-in-lisp> *jobs*
(#<simple-process python3 (running) {1005BFFCF3}>)

Note that because unix-in-slime listener connects a job automatically if it is the primary value of evaluation, you can use e.g.

unix-in-lisp> (nth 0 *jobs*)

to resume from a background job.

unix-in-lisp:repl-connect connects a process exclusively in at most one listener. If a process is already connected in other listener, it will do nothing and the effective process object will be printed like normal. In fact, many Unix in Lisp operations (including repl-connect and pipe) takes exclusive access of input/output stream of processes (by setting the respective slots to nil during their course of operation).

Environment

Unix environment variables are mapped to special (dynamic-scope) Lisp variables.

/Users/kchan> $logname
"kchan"

You can set them or dynamically bind them

/Users/kchan> (setf $test "42")
"42"
/Users/kchan> (pipe '("echo $TEST") (bash))
42
nil
/Users/kchan> (let (($test "override")) (pipe '("echo $TEST") (bash)))
override
nil

The above works with the help of a reader macro defined on $, which registers the following symbol as an environment variable. If you want to use Unix in Lisp environment variables without our readtable, you need to use function unix-in-lisp:ensure-env-var to register the symbol first. Consult its docstring for more information.

Unix in Lisp keeps its own idea of a Unix environment, and pass to subprocesses created by it (e.g. via the macros it created from Unix commands). Other Lisp facilities (e.g. uiop:run-program) does not know this, and usually inherit the “real” Unix environment of the Lisp process instead. To remedy this, Unix in Lisp provides function unix-in-lisp:synchronize-env-to-unix which copies the environment Unix in Lisp manages to the “real” Unix environment of the Lisp process. This is by default run in *post-command-hook*, and you may want to call them before using other Lisp facilities that spawns Unix subprocesses.

Scripting (blazingly fast start up)

The recommended way to write scripts is to create executable files (say do-stuff.sh) with contents like

#!/usr/env/bin sbcl --script
(asdf:require-system "<dependency>")
(asdf:require-system "unix-in-lisp")
(unix-in-lisp:setup)
<do-stuff>

The benefit of the above approach is that it is blazingly fast when started from within Unix in Lisp (via e.g. (do-stuff.sh)), because Unix in Lisp has a Fast loading command mechanism, which can execute the script within Unix in Lisp image without starting subprocess if it detects a Lisp shebang. The essence of writing fast startup script is:

  1. Use #!/usr/env/bin sbcl --script shebang. Currently it has to be an exact match.
  2. Use asdf:require-system. This avoids scanning the ASDF registry directory tree for modification, which wastes significant time!

On my machine, a hello world using the above approach run in 0.5ms, while Python 3 uses 30ms!

Unix in SLIME

The above documentations have been assuming you are using the unix-in-slime listener. Here we document some additional aspects of unix-in-slime.

Unix in Lisp assumes a dedicated swank server for unix-in-slime listeners (and potentially other front-ends in the future). M-x unix-in-slime will start one on unix-in-slime-default-port (4010 by default) if none already exists in the Unix in Lisp image. The server handles multiple connections, so you can safely start multiple unix-in-slime listeners simultaneously, like how you must have lived with multiple terminal windows.

Performance: fast!

A quite unintended achievement is that unix-in-slime is a very fast shell for Emacs. In fact, a simple (pipe "time for i in {0..99999}; do echo line $i; done" (sh)) benchmark takes 0.83s in unix-in-slime, and takes 2.93s in vterm. unix-in-slime is more than 3 times faster than one of the fastest Emacs terminal emulator (partly written in C)! Of course, this is not a head-to-head comparison because vterm is a terminal emulator while unix-in-slime is a shell, but I did frequent experience fast command outputs choking Emacs and it’s good to know unix-in-slime is pretty good at handle these. I think the reason is that SLIME’s swank server does some very Emacs-specific tuning, e.g. limiting network packet rate because it knows Emacs choke on a flood of them, which also benefits us when we use it as a shell.

Completion

If you have configured completion for SLIME, completion works out of the box for unix-in-slime. Note that we automagically get “filename completion”, because they are mapped as symbols, and we have symbol completion at home! Currently there’s one quirk: filenames are always completed to their fully resolved path (with .. . ~ components resolved), because that’s what corresponds to symbols. I’d say it’s either a bug or a feature depending on who you ask, I’m leaving it like that for now.

Package system structure

Unix in Lisp defines and populates a number of packages during unix-in-lisp:install. First, unix-in-lisp:path is created according to $PATH environment variable. Then, unix-in-lisp.common is ensured to re-export unix-in-lisp.path, and also export symbols corresponding to environment variables. Packages that wish to make use of Unix in Lisp functionalities should use unix-in-lisp.common, and potentially shadowing import some of its symbols. Any other usage of packages created by Unix in Lisp is less safe, including using or importing symbols from the Unix FS packages, particularly because invoking unix-in-lisp:uninstall deletes them.

The Unix in SLIME listener by default starts in unix-user package, which uses unix-in-lisp.common and other utility packages. This causes all listeners to share the same package by default, but you can also create new packages and switch listeners to it. Note that we do not support current directory by using its corresponding Unix FS package. Instead, a reader hook (to sb-impl::%intern) is installed that replace symbols denoting relative path with a new “effective” uninterned symbol that merges bindings from the original symbol and the mounted symbols according to the relative path under current directory (*default-pathname-defaults*). Similar to Unix, our redirection never shadows existing global function bindings, to avoid unintentionally execute files under current directory.