xmlsh - XML Shell: Philosophy

Revision [205]

This is an old revision of Philosophy made by DavidLee on 2008-06-01 05:37:42.

Philosophy

In the opinion of this author, the Unix "shell" is one of the most elegant, practical and powerful command and scripting languages ever written. It has stood up to 30 years of use and abuse, and while it has been copied, cloned, rewritten, duplicated, "improved" and otherwise mimicked in many variants of Unix and other operating systems, the core design, functionality, and syntax have not improved significantly, nor needed to. Those who grew up using "modern" windowing systems or primitive CLIs (Command Line Interpreters) may not fully appreciate the elegance and functionality of the Unix shell (originally the Bourne shell, /bin/sh) and its variations.

For this discussion the term "Unix Shell" refers to any of the Unix shells derived from the Bourne Shell (/bin/sh), and in particular to the Bourne Shell itself, as it embodies the earliest and cleanest design. Some offshoots such as the "C Shell" (csh), in an attempt to add features, lose so much of the original design philosophy that I consider them not in the same 'camp'. Other languages, such as Perl, which were derived from, or at least inspired by, the Unix Shell, are so different that I consider them in their own category as full-fledged programming languages.

Unix Shell Design

It is beyond the scope of this document to describe the design of the Unix Shell in full detail. The following is a rough overview of the major design points which make the Unix Shell elegant, unique and powerful, and which implement its design philosophy. It is these design points and this philosophy which xmlsh humbly attempts to borrow from the Unix shell. An important consideration and distinguishing factor of a "shell" as opposed to a CLI or a programming language is that a shell is designed to optimize two very different use cases simultaneously. This is very difficult to do, and the difficulty usually fragments computer languages into different categories.

For this discussion, I make a distinction between a shell and a login shell.
A quick "litmus test" of a "login shell" vs a "shell" is: would you configure that program to be your 'login shell' on a Unix system (the first program that runs when you log in)? I know many people boast from time to time (myself included) that their favorite program du jour is so powerful it could be a login shell (common examples being Perl and Emacs), but beyond the occasional playful experimentation, this never happens.

On the other hand, a shell may have a somewhat lower bar if it's designed for a subset of tasks, and may not be designed for, or appropriate for, everything.

In any case, all shells have the following use cases in mind.

* Command Line Interpreter

A shell must function as a Command Line Interpreter. That is, it must function well as a primary interface for humans operating a computer using text (a keyboard). Early shells ran on "Line Printers" and Teletypes (TTY), and were later modified to work well with "Terminals" (dubbed "Glass TTYs"). A human types a "command" on the keyboard, and the CLI interprets that command and invokes the necessary process. Dedicated CLIs typically perform this job very well, implementing various sorts of line-oriented editing features and simplifications to allow a short command to execute any pre-existing process (program) on the computer.

* Scripting

A shell should also function as a scripting language. That is, often the process one wants to describe to a computer is more complex than can be described by typing in a line or two without mistakes. CLIs often implement scripting by allowing multiple lines (identical to what you would have typed in) to be stored in a file and the file executed with one command. This is certainly useful, but it is a tiny step. Usually flow control (if/while/goto), variables, and other more exotic features are needed or desired in order to describe a process.

Although most users think of the shell as an interactive command interpreter, it is really a programming language in which each statement runs a command. Because it must satisfy both the interactive and programming aspects of command execution, it is a strange language, shaped as much by history as by design.

– Brian Kernighan & Rob Pike, "The UNIX Programming Environment", Prentice-Hall (1984).

There are many shells and CLIs that manage both of these use cases, but very rarely are both achieved equally well. One hurdle is that typically the CLI and scripting cases are not entirely interoperable. That is, you can have commands in a "script" which behave differently when run interactively, or simply are not supported interactively. This makes writing scripts much more difficult than necessary because you can't manually "try out" something intended for a script on the command line. Sometimes this is intentional, as the shell authors' design philosophy is that scripts are inherently different from interactive use. Other times it is due to fundamental limitations or design of the operating system. Since shells typically have evolved as the first layer above the OS (and in fact that's why the term "shell" is used), they often reflect the design philosophy of the operating system itself more than their own design.

The Unix Shell adds some additional fundamental design philosophy which is key to its elegance and power.

* The TTY is a file.

In the Unix Shell, the "console" or the "TTY" is considered identical to a file. That is, everything you can do in a script you can do interactively, and vice versa. This is fundamentally a reflection of the design of the Unix operating system, where the console is the same "type of thing" as a file and for the most part is indistinguishable from one: both are a "file" and are accessed with the same core system calls. While there are a few things you can do to a disk file that you can't do to a TTY, and vice versa, both the OS and the Unix Shell attempt to hide these inconsistencies as much as possible. The end result is that pretty much anything you can put into a "script" you can type interactively and it will have exactly the same effect. This merging of the "Scripting" and "CLI" use cases is extremely powerful, and makes authoring, debugging and using scripts a much simpler and more natural process than on systems where scripts and the CLI behave differently.
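As a minimal sketch of this point (not taken from the original text), the same two commands can be handed to the shell as a command string, as a script file, or on standard input, which could just as well be a terminal. The variable names and the temporary file are illustrative assumptions.

```shell
# The same two commands, delivered to the shell three different ways.
commands='x=3
echo "x is $x"'

# 1. As a command string, much like typing at a prompt.
from_string=$(sh -c "$commands")

# 2. From a script file (a temporary file, used only for this sketch).
script=$(mktemp)
printf '%s\n' "$commands" > "$script"
from_file=$(sh "$script")

# 3. From standard input, which could equally be a terminal.
from_stdin=$(printf '%s\n' "$commands" | sh)

rm -f "$script"

# All three print the same line.
echo "$from_string"
echo "$from_file"
echo "$from_stdin"
```

The shell does not care where its commands come from, which is what lets you try out a line by hand before putting it in a script.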

* Tools and Pipes

The single most distinctive and powerful feature of the Unix shell is the concept of tools. All shells prior to the Unix Shell, and most shells since, are based on Monolithic Processes.
That is, a single process (program, or executable) is designed to do a complete job all by itself. The purpose of both the CLI and scripts is to set up the parameters for a monolithic process, such as its input and output files and options, execute the Big Process, then clean up. Sometimes very large tasks are created by stringing together multiple monolithic processes to do many things in order, for example, format a document then send it to a printer. But no matter how many monolithic processes you string together in series, the complexity of what can be done is linear. That is, nothing new is created out of the combination of processes. At best a long tedious sequential task is turned into a single command (script), but it's still fundamentally a sequence of monolithic processes that had to be created beforehand, and they can do nothing more than was intended by the original programmer.

The Unix Shell is designed differently at a very fundamental level. While certainly there are monolithic programs that do only one thing, a core philosophy is that you can have small programs that do very little, but that can interoperate with other programs seamlessly to create new programs. I like to think of this as "Lego". Using the toy analogy, the monolithic model is like having a toy box which has fully complete toys in it, like a car, an airplane, a stuffed bear. These toys were pre-built by the toy builders to do only one thing and do it well. The tools and pipes model is like having Lego, where each piece is both specialized and universal. With Lego you can build your own car or airplane, or space ship or anything else, not limited to what was pre-built.
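To make the "Lego" idea concrete, here is a small sketch, assuming only the standard tr, sort, uniq, head and awk tools. None of these tools knows anything about word frequencies, yet snapped together they form a new program that finds the most frequent word in its input. The function name top_word is a hypothetical name chosen for this illustration.

```shell
# Hypothetical helper: print the most frequent word read from stdin.
top_word() {
    tr -cs 'A-Za-z' '\n' |   # split the input into one word per line
    tr 'A-Z' 'a-z' |         # fold everything to lower case
    sort |                   # bring identical words together
    uniq -c |                # count each group of identical words
    sort -rn |               # order by count, highest first
    head -n 1 |              # keep only the top entry
    awk '{ print $2 }'       # drop the count, keep the word
}

result=$(printf 'the cat and the hat and the bat\n' | top_word)
echo "$result"
```

Swapping or removing a single stage (for example, dropping the head stage to list every word) yields a different program without writing any new tool.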

To implement this design, the core set of Unix tools (and future tools are encouraged to be written similarly) are specifically built to interoperate. Like Lego pieces, they naturally have fitting 'connectors'. Again, this is a natural reflection of the underlying operating system (Unix). There are three types of unnamed connection points. On start-up a process is given the "command line" as an array of strings. At run-time it has access to three universally implemented streams (stdin, stdout, stderr). On exit, a process returns an integer exit code. In addition, processes have access to named connection points (namely files).
The Unix Shell elegantly manages these connection points in simple and expressive ways. Input streams can be connected to output streams via pipes (|). Output streams can be connected to command line arguments. Exit values can be connected to command line arguments. With the philosophy of small interconnected tools, even flow control was originally implemented as separate command processes (although later optimized as built-in commands). Furthermore, groupings of operations can be connected hierarchically to atomic operations or to other groups, indefinitely. This connectivity allows both linear and hierarchical construction, and hence the complexity (how many different things you can do) available from a set of primitives is no longer linear; it is exponential.
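The connection points described above can be sketched in a few lines of shell. This is an illustration under stated assumptions, not part of the original text: has_lines is a hypothetical helper that reads stdin, reports on stdout, complains on stderr, and signals success or failure through its exit code.

```shell
# Hypothetical helper: succeeds if stdin contains at least one line.
has_lines() {
    n=$(wc -l | tr -d ' ')       # read all of stdin, strip any padding
    if [ "$n" -gt 0 ]; then
        echo "lines: $n"         # normal result goes to stdout
        return 0
    else
        echo "no input" >&2      # diagnostics go to stderr
        return 1
    fi
}

# Pipe: the stdout of printf becomes the stdin of has_lines.
piped=$(printf 'a\nb\n' | has_lines)

# Output stream to command line argument, via $( ).
as_arg=$(echo "report: $(printf 'a\nb\nc\n' | has_lines)")

# Exit value to command line argument, via $?.
status=0
printf '' | has_lines 2>/dev/null || status=$?

echo "$piped"
echo "$as_arg"
echo "exit code: $status"
```

Each connector (arguments, streams, exit code) is handled by a different piece of shell syntax, but they compose freely, as the nested $( ) inside the echo argument shows.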

xmlsh Philosophy

In Unix (and the Unix shells) the primary complex data type is the stream. A stream is a sequence of bytes that has a beginning and an end and can be read sequentially, or sometimes, but not always, as a whole. All streams are treated uniformly. Examples of streams are pipes, files and terminals. The "|" operator in the Unix Shells connects two streams such that the output from one is joined to the input of the other. The "<" and ">" operators redirect streams from and to files. The back-tick (or later $() ) operators convert a stream into a single string, typically to present as an argument.

Examples (using only ls and wc)


# Count the number of files in the current directory
$ ls | wc -l

# Count the number of lines in a single file
$ wc -l < filename

# Count the number of lines in all files in the current directory
$ wc -l $(ls)

These examples work as-is in xmlsh, but they can also be expressed using purely XML-oriented syntax.
For comparison, using purely xmlsh-specific commands, the above examples are:

# count the files output by xls
xls | xpath 'count(//file)'
# count the number of elements in an xml file
xquery 'count(//*)' < xmfile
# count the number of files in the current directory
set $<(xls)>
echo <[count($_[1]//file)]>

The Stream is essentially the fundamental object of the Unix Shell design. One could argue that Files are the fundamental object (be they data or process files), and that streams are simply an interface to files. This is not entirely correct, as streams can exist without files (pipes). The argument is resolved by considering pipes as a special type of file (an unnamed file). Either viewpoint is valid, but for this discussion I consider Streams the primary objects and Files a mapping of Name to Stream provided by the operating system. In either case, the fundamental data object represented in streams or files is the "byte". That is, streams (and files) are sequences (or collections) of bytes. A subset of the stream is the text stream. A majority of Unix tools presume that their streams are not only made up of bytes but that those bytes are encoded in ASCII and represent text. A further specialization is that the streams are not only ASCII text but are organized in lines separated by the ASCII LF character. In the above examples, both the "ls" and "wc" commands assume that the streams they deal with (input and output) are streams of ASCII text in lines. "wc", for example, fails miserably to produce meaningful results if given an encoding other than ASCII, or a binary file. A PDF or an XML file may contain essentially the same content as a "text file", yet it does not work with wc.
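A small sketch of why line- and byte-oriented counting says little about structured data. The XML snippets and variable names below are made up for illustration. The same document serialized two ways yields different line counts, and a single non-ASCII character (here "é" written as its two UTF-8 bytes) is counted as two bytes.

```shell
# The same XML content, serialized compactly and pretty-printed.
compact='<files><file/><file/></files>'
pretty='<files>
  <file/>
  <file/>
</files>'

# wc sees lines, not elements, so the counts differ.
compact_lines=$(printf '%s\n' "$compact" | wc -l | tr -d ' ')
pretty_lines=$(printf '%s\n' "$pretty" | wc -l | tr -d ' ')

# One character, two bytes: the UTF-8 encoding of "é" is C3 A9.
e_acute_bytes=$(printf '\303\251' | wc -c | tr -d ' ')

echo "compact: $compact_lines line(s)"
echo "pretty:  $pretty_lines line(s)"
echo "bytes in one character: $e_acute_bytes"
```

Neither line count reflects the fact that both documents contain the same three elements, which is the kind of question the xpath and xquery examples earlier answer directly.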

This has served very well for several generations of data processing needs. While data types have evolved to be more and more complex, they have all implemented serialization formats which can be represented as a sequence of bytes. This representation (a sequence of bytes) is so fundamental to modern computers that it's unlikely to change anytime soon. However, ASCII text, and especially ASCII text in lines, which used to be the format for the majority of data, is increasingly rare. More and more data has a structure more complex than can be easily represented as "lines of text". Furthermore, ASCII, or even fixed single-byte (8-bit) characters, is no longer universally used as the encoding for text. Increasingly common is Unicode in its various encodings such as UTF-8 or UTF-16, or 'ASCII piggybacked' encodings such as HTML-encoded text. More and more, the core base of Unix tools, and in fact the core of the data types used by the Unix shells, simply don't work correctly or completely with the data we use daily. We find ourselves today (2007) in a transition period of data processing formats where the fundamental axioms are being challenged. In particular, the underlying data model used for the Unix Shells is becoming obsolete. This threatens the entire philosophy and usability of the Unix shell.

The philosophy behind the "XML Shell" (xmlsh) is an attempt to salvage the elegant design and philosophy of the Unix Shell, but replace or supplement the fundamental types with new types that can be a robust foundation of a new generation of data processing. That is, replacing the concept of "File" with "XML Document", and "Stream" with "XML Infoset", but otherwise reusing the same core design philosophy and data model.

Furthermore, the implementation of the "XML Shell" is a concession that such a fundamental shift of primitive data types is not going to happen overnight. Hence xmlsh needs to interoperate cleanly with existing shells and operating systems. That is, xmlsh is not intended as a replacement for existing Unix shells (yet), but rather as a supplement to both the shell and the common toolsets, designed to work as cleanly, simply and elegantly with XML data as the Unix shell works with text data.