Tcl Simplifies Kernel Programming

Tagged with Regular Expressions

By Cameron Laird and Kathryn Soraiz

September is kernel month here at Linux Developer Network. While that might seem to be at an opposite pole from the "lightweight," "agile" programming in which this column specializes, there are actually many connections between kernel work and scripting. We present tcl-fuse as one definite example.

First, as usual, a bit of context: kernel work is hard. Everyone knows that. At a minimum, in order to make any changes, you must maintain sources (several million lines of them) and a working C toolchain, know your way around C (and possibly assembler), regenerate the executable kernel, and reboot--and if those changes interfere with reboot, you've got big problems!

Ecosystem of Agility

More precisely, that's the way it was ten to fifteen years ago, when much of the folklore by which the Linux world still lives first emerged. Practical kernel development has in fact changed quite a bit since then, as other articles appearing this month illustrate.

Maybe it's best to say that kernel work has greater range now. While there still are situations that involve the traditional hours-long regenerate-everything-and-reboot cycle, there are also ways to operate at the kernel level that are so lightweight you can see results in seconds:

  • The boundaries between kernel and "userland" have rationalized somewhat; as we'll illustrate below, there are cases where a domain that used to be in the kernel has moved at least partly outside.
  • At a deeper level, the introduction in roughly mid-1995 of loadable kernel modules to Linux has radically altered practices for device drivers and other kinds of kernel development.
  • Increasingly affordable hardware has had an impact: fifteen years ago, a typical programmer might compile and exercise experimental kernels on the same machine. Now it's unremarkable to have a couple of physical hosts, and removable media with enough capacity to walk an entire kernel image from one to the other.
  • Perhaps most dramatically, the proliferation of virtual-machine technologies and large mass storage means that a programmer can keep several distinct logical machines on his desk, all communicating at (near-)bus speeds.

All these changes have made Linux kernel work quicker for experts, and more forgiving and approachable for newcomers.

Complications

It's still done in C, though, and sometimes rather obscure C. Consider this evidence: "Filesystem in Userspace," or FUSE, is a loadable kernel module that emphasizes simplicity. In the words of the FUSE project's home page, "Implementing a filesystem is simple, a hello world filesystem is less than a 100 lines long." Even before consideration of the use of FUSE, to which we'll turn in a moment, think what this means. A lot can go wrong in a hundred lines; it's widely believed that production software averages between one and thirty errors per thousand lines of code. Even at the optimistic end of that range, where any given line has a 99.9% chance of being correct, a 94-line hello.c has only about a 91% a priori chance of being error-free (0.999 raised to the 94th power is roughly 0.91). This source moreover includes a struct initializer, which many C programmers learn only later in their studies.
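The 91% figure can be checked with a few lines of arithmetic. The sketch below is ours rather than anything from the FUSE project: it assumes that each line is wrong independently of the others, with a probability equal to the quoted error rate divided by one thousand. The function name chance_error_free is purely illustrative.

```python
def chance_error_free(lines, errors_per_kloc):
    """Probability that `lines` lines contain no errors at all.

    Assumption (ours): every line is wrong independently, with
    probability errors_per_kloc / 1000.
    """
    per_line_error = errors_per_kloc / 1000.0
    return (1.0 - per_line_error) ** lines


if __name__ == "__main__":
    # Walk through the low end, a middle value, and the high end
    # of the "one to thirty errors per thousand lines" range.
    for rate in (1, 10, 30):
        chance = chance_error_free(94, rate)
        print(f"{rate:>2} errors per thousand lines: {chance:.1%}")
```

At one error per thousand lines the result is about 91%, matching the figure above; the chance drops quickly as the assumed error rate climbs toward the upper end of the range.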

We're fond of both C in general and the FUSE project in particular, and don't detail these facts as criticism. They're useful context, however, for understanding the place of tcl-fuse. In this package, hello.tcl is 74 lines long--but it includes code that demonstrates use of "helloworld" to instantiate and monitor a filesystem. It's commonplace to choose the brevity and expressivity of high-level languages such as PHP or Ruby for Web applications; tcl-fuse shows how the same advantages in productivity are available for at least some aspects of kernel programming.

Suppose you're an advanced filesystem hacker working on a new, mission-critical, and highly specialized implementation. Even in the most extreme case--say, you have engineering requirements that the filesystem definition must be compiled into the kernel, rather than loaded after boot-up--you can eliminate days of compilation cycles through use of tcl-fuse as a prototyping tool. Run-time performance is surprisingly close to what a conventionally compiled model achieves, and tcl-fuse has all the capabilities of FUSE itself. The productivity boost of "kernel scripting" can be as dramatic as the one we gained earlier through use of virtual machines, for instance.

Virtual Everything

Even if your involvement in filesystem development is less than a full-time profession, tcl-fuse might benefit you. Filesystems are, of course, a key concept in Linux computing and information systems more generally. For historical reasons, though, Unix didn't apply the concept as universally as, for example, the alternative Plan 9 operating system. Think for a moment about an FTP archive; while it looks like a filesystem from the outside, it requires a special FTP client to do anything useful with it. You can't simply cp and ls it from the command line, and, even more to the point, more interesting programs, such as a multimedia player, can't automatically access its contents.

A "virtual file system" (VFS) changes all that. A VFS such as AVFS exposes FTP archives, zip files, floppies, ssh-accessible remote systems, and a variety of other "media" as filesystems that ordinary programs can use. FUSE spun off from AVFS "to implement a fully functional filesystem in a userspace program... [with] no need to patch or recompile the kernel."
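To make the idea concrete, here is a minimal sketch of what "exposing a zip file as a filesystem" means at the level of individual operations. It is not the AVFS, FUSE, or tcl-fuse API: the class ZipAsFilesystem and the helper build_sample_archive are hypothetical names of our own, and the method names readdir and read merely echo common filesystem operations. The sketch relies only on Python's standard zipfile module and answers requests in memory; a real VFS would wire equivalent handlers into the operating system so that cp and ls work unchanged.

```python
import io
import zipfile


class ZipAsFilesystem:
    """Hypothetical illustration: answer filesystem-style requests
    (list a directory, read a file) from the contents of a zip archive."""

    def __init__(self, archive_bytes):
        self._zip = zipfile.ZipFile(io.BytesIO(archive_bytes))

    def readdir(self, path="/"):
        """Return the names directly under `path`, much as ls would."""
        prefix = path.strip("/")
        prefix = prefix + "/" if prefix else ""
        entries = set()
        for name in self._zip.namelist():
            if name.startswith(prefix) and name != prefix:
                # Keep only the first path component below `path`.
                entries.add(name[len(prefix):].split("/", 1)[0])
        return sorted(entries)

    def read(self, path, size=-1, offset=0):
        """Return up to `size` bytes of the file at `path`, from `offset`."""
        data = self._zip.read(path.lstrip("/"))
        end = len(data) if size < 0 else offset + size
        return data[offset:end]


def build_sample_archive():
    """Build a small in-memory zip archive to experiment with."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("hello.txt", "Hello, world!\n")
        archive.writestr("docs/readme.txt", "A file inside a directory.\n")
    return buffer.getvalue()


if __name__ == "__main__":
    fs = ZipAsFilesystem(build_sample_archive())
    print(fs.readdir("/"))
    print(fs.read("/hello.txt"))
```

The point of the design is that callers never see the archive format; they ask only for directory listings and byte ranges, which is what lets unmodified programs work with unusual "media."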

[Figure: FUSE structure]

The result is successful enough to serve as the basis for several more "retail"-oriented projects, such as ClamFS, an "anti-virus protected file system".

Tcl has its own VFS facilities, so FUSE intrigued Tcl expert Colin McCormack enough to launch a preliminary version of tcl-fuse in early 2005. Then, just in the last few months, Alexandros Stergiakis brought tcl-fuse to a nice Version 1.0 as a Google Summer of Code™ project. The result is sufficiently complete and sophisticated to make every existing Tcl VFS, including ones for CVS, SQLite, and WebDAV, immediately available as a FUSE VFS--and all the features are fully scriptable!

We'll return in a few months with more details on how to program with tcl-fuse, and perhaps comparisons with related projects based on Perl and Python. For now, our aim is simply to open the topic of the place of high-level languages in kernel development, show that useful work in this area is already real, and hint at the exciting possibilities this enables. Think what programmable filesystems can do for you!

Kathryn and Cameron run their own consultancy, Phaseit, Inc., specializing in high-reliability and high-performance applications managed by high-level languages. They write about high-level languages and related topics in their "Regular Expressions" columns, and generally treat kernel modifications as a last resort.

 

Copyright © 2008 Linux Foundation. All rights reserved.
LSB is a trademark of the Linux Foundation. Linux is a registered trademark of Linus Torvalds