Scripting Languages Play Role in LSB

This month's theme at Linux Developer Network is LSB: Linux Standard Base. While LSB generally has focused on C and C++, in fact it bears on several other programming languages. Let's look at what LSB 4.0 offers, and what it means for your own programming.

LSB is a standard for programming interfaces, as its home page on this site describes. LSB helps close the gap between "It works (at least on my personal desktop)!" and "Here's a well-behaved product ready for delivery to customers."

"Challenges and Variances"

That gap is greater than many junior programmers realize. Even if we restrict attention to Linux 2.6, for instance, and well-coded 32-bit portable C, different Linux distributions expose subtly but significantly different run-time environments: libraries aren't all at the same release, configuration files homed in /etc on one platform appear in /opt on another, and so on. LSB is a framework for resolving what the project labels "challenges and variances" so that developers can have confidence their applications will work properly on any supported platform. LSB addresses eight distinct binary images, including flavors of x86, PowerPC, Itanium, and IBM's zSeries.

Language support beyond C/C++ ranges over three broad categories:

  1. Release 4.0 of LSB introduces support for Java on a "trial-use" basis;
  2. LSB leverages POSIX whenever possible, so LSB largely adopts POSIX standards for sh, sed, awk, yacc, make, and so on. The correspondence is close, but not exact: internationalized regular expressions in awk, for instance, differ slightly between POSIX and LSB;
  3. Perl, Python, and Tcl, the three most prominent open-source application scripting languages for Unix platforms during the 1990s, all show up in LSB.

Later this month, LDN will detail Java's role in LSB. This article, therefore, focuses on the second and third of these categories.

Small Niches

The languages POSIX specifies are now generally used only for special purposes and rarely used to implement stand-alone, productized end-user applications. Exceptions exist: Keith Winston mentions that he has "worked on a 12,000-line awk application used to adjudicate dental claims," for instance. As impressive as such feats are, they constitute a tiny minority of current application work. Few programmers use them as general-purpose development languages in the way they do C, Java, or Perl.

POSIX tools are important to application development in 2008 in a couple of regards:

  • They often show up as "helpers" in special-purpose report generation or installation. Even large programs coded in C, for example, occasionally "call out" to a tiny subordinate sed process for such chores as transformation of a server log. C has abundant libraries to code such a task natively; sed is designed for just such uses, though, and is far more succinct in the cases where it applies. The virtue of LSB in this situation is that it provides a guarantee about the availability of these domain-specific tools. From an architectural standpoint, it's perfectly legitimate to assign small parts of a Java or C/C++ project to be coded in awk or sed;
  • sh constitutes a special case. It's certainly used as in the previous paragraph, especially during installation: plenty of applications coded entirely in C or Java rely on an install.sh or similar for proper configuration and initialization. Beyond this, though, sh has grown through the years to become a marginal general-purpose programming language. It also varies particularly widely between installations: while distributions and particular sites generally leave other language executables alone, it's common to symlink /bin/sh or /usr/bin/sh to dash, bash, pdksh, or others.
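
To make the "helper" role in the first bullet concrete, here is a minimal sketch of the kind of chore a larger program might hand to sed or awk. The two log lines are invented for illustration, and real server logs vary in layout; the field number used for the status code is an assumption that holds only for this sample format.

```shell
#!/bin/sh
# Two made-up access-log lines; real log formats differ from site to site.
log='192.168.0.7 - - [12/Nov/2008:10:15:32 +0000] "GET /index.html HTTP/1.1" 200 512
10.0.0.3 - - [12/Nov/2008:10:15:40 +0000] "GET /missing HTTP/1.1" 404 0'

# sed: mask the last octet of the client address at the start of each line.
masked=$(printf '%s\n' "$log" |
    sed 's/^\([0-9]*\.[0-9]*\.[0-9]*\)\.[0-9]*/\1.xxx/')
printf '%s\n' "$masked"

# awk: count requests per status code (field 9 in this sample layout).
summary=$(printf '%s\n' "$log" |
    awk '{ count[$9]++ } END { for (s in count) print s, count[s] }' |
    sort)
printf '%s\n' "$summary"
```

Each transformation is a single line of sed or awk; the equivalent C would need a regular-expression library, buffer handling, and a small table for the counts.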

LSB Application Checker

LSB 4.0 helps with the latter situation: it includes a tool which checks yourscript.sh for compatibility among the leading shells. Note, by the way, that "sh" in an LSB context is always the "Unix shell script"; don't confuse it with the entirely different Sh metaprogramming language and library.

LSB is a big and ambitious project, to the point of "heaviness"; it's designed to solve thorny problems that arise for independent software vendors (ISVs) and in other large-scale organizations. With just a few minutes' preparation, though, you can try out a few pieces of LSB for yourself.

Among the pieces downloadable for the first LSB 4.0 beta is the "Linux App Checker." "Getting Started" explains the use of the App Checker; don't expect to follow it exactly, though, because it hasn't kept up with software: some files install to different locations than the documentation claims, at least for some distributions. Moreover, "Getting Started" emphasizes use of the graphical user interface (GUI) with large projects. We can simplify matters, though, to get a feel for LSB's capabilities:

First, install LSB. While LSB packages its artifacts as RPMs, they also behave well on Debian-based distributions. Create /tmp/1.sh with the lines

          #!/bin/sh

          x[0]="Hello"
          echo ${x[0]}

Next, execute /opt/lsb/app-checker/utils/run_tests.pl --f /tmp/1.sh (the exact path depends on the distribution). After a couple of seconds, this will generate a directory of results in /var/opt/lsb/app-checker/results, where you'll find a test_log.html that reports, "List of Problems ... /tmp/1.sh: 3: 'x[0]=Hello' isn't included ..." In English, the application checker is trying to alert you that array references like x[0] are "bashisms" not available to LSB's standardized /bin/sh.
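
One common way to avoid this particular bashism, sketched here as an illustration rather than as anything the checker itself recommends, is to use the positional parameters as the script's single list. The variable names first and word are our own.

```shell
#!/bin/sh
# POSIX sh has no arrays, but "set --" loads the positional parameters,
# which behave like one list per script (or per function).
set -- "Hello" "World"

# $1 plays the role that ${x[0]} played in the bash-only version.
first=$1
echo "$first"

# "$@" expands to every element, each kept as a separate word.
for word in "$@"; do
    echo "item: $word"
done
```

The trade-off is that a script gets only one such list at a time, so this works best for small scripts; anything that needs several arrays is probably a sign that sh is the wrong tool.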

This is valuable. While we certainly don't insist that all our own shell scripts be LSB-compliant, it's enlightening to be able at such low cost to learn which ones are, and the specific divergences of those that aren't.

Other Scripting Choices

sh isn't the only choice for those who develop for LSB in high-level languages. The 3.2 release of LSB from January 2008 introduced a mandatory specification, "LSB-Languages," along with two new checkers, lsb-appchk-perl and lsb-appchk-python, to support it. These work just as lsb-appchk-shell does, so the previous illustration with /tmp/1.sh is easily adapted to Perl- or Python-based programming.

LSB 3.2 requires that Perl 5.8.8 or later and Python 2.4.2 or later be installed by default. The 4.0beta specification leaves these version requirements unchanged.
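
For a quick, unofficial look at whether a particular machine meets those minimums, something like the following sketch may help. It is not part of LSB and is no substitute for the app checkers. The helper names version_at_least and report are our own, the comparison assumes version strings made of purely numeric, dot-separated fields, and the interpreter is assumed to be reachable as plain "python", which is not true on every system.

```shell
#!/bin/sh
# version_at_least HAVE WANT
# Hypothetical helper: succeeds when dotted version HAVE >= WANT.
# Compares at most three numeric fields; missing fields count as 0.
version_at_least() {
    have=$1 want=$2
    old_ifs=$IFS
    IFS=.
    set -- $have
    h1=${1:-0} h2=${2:-0} h3=${3:-0}
    set -- $want
    w1=${1:-0} w2=${2:-0} w3=${3:-0}
    IFS=$old_ifs
    [ "$h1" -gt "$w1" ] && return 0
    [ "$h1" -lt "$w1" ] && return 1
    [ "$h2" -gt "$w2" ] && return 0
    [ "$h2" -lt "$w2" ] && return 1
    [ "$h3" -ge "$w3" ]
}

# report NAME FOUND MINIMUM: print a one-line verdict.
report() {
    if [ -z "$2" ]; then
        echo "$1: not found"
    elif version_at_least "$2" "$3"; then
        echo "$1 $2: meets minimum $3"
    else
        echo "$1 $2: older than minimum $3"
    fi
}

# Ask each interpreter for its own version; empty if it is missing.
perl_v=$(perl -e 'printf "%vd", $^V' 2>/dev/null)
python_v=$(python -c 'import sys; print(".".join(map(str, sys.version_info[:3])))' 2>/dev/null)

report perl "$perl_v" 5.8.8
report python "$python_v" 2.4.2
```

The field-by-field comparison matters because a plain string comparison would rank 5.10.1 below 5.8.8.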

As mentioned above, it was common in the '90s to regard Perl, Python, and Tcl roughly as peers; all three were within an order of magnitude of each other in popularity, capability, installed base, portability, and so on. LSB-Languages now mentions only Perl and Python, though; what place in LSB is there for Tcl?

An interesting one, in fact. In mid-2007, during public discussion of what would become LSB-Languages, LSB Chairman Mats Wichmann noted that, "We get... comment[s] that LSB needs to specify ... (much less frequently) other interpreted languages such as TCL, Ruby, PHP." Because demand for them was comparatively light, PHP, Ruby, and Tcl simply are not part of LSB-Languages.

That's not the end of the story, though. While Tcl in particular can't boast an app-checker, it does play a role in LSB. Expect is a tool that was used early in LSB's history for test automation. While it no longer appears in this role--LSB now has its own test framework--LSB continues to use Expect to help construct packages.

Therefore, although LSB doesn't specify Expect, it's safe to assume that any platform which supports LSB provides Expect.

What does this have to do with Tcl? Many Expect users know Expect only as a special-purpose tool for certain automations having to do with testing, package construction, network management, or related areas. Expect indeed has a distinguished history in these roles, and it's easy to understand its identification with them. What many frequent users of Expect don't know, though, is that it is a strict superset of the Tcl general-purpose high-level language. If you have Expect, you necessarily have all of Tcl, and can program anything base Tcl makes possible.
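
A small experiment shows the point. The sketch below writes a few lines of plain Tcl, using none of Expect's own commands such as spawn or send, and hands them to the expect executable. The temporary file name and the shout procedure are our own inventions, and the script simply reports the fact if expect isn't installed.

```shell
#!/bin/sh
# Write a tiny pure-Tcl program to a temporary file (name is arbitrary).
script=${TMPDIR:-/tmp}/tcl_demo.$$
cat > "$script" <<'EOF'
# Plain Tcl only: a procedure, a string operation, and output.
proc shout {text} {
    return [string toupper $text]
}
puts [shout "hello from tcl"]
EOF

# Run it with expect, if available; expect interprets the file as Tcl.
if command -v expect >/dev/null 2>&1; then
    expect "$script"
    status=$?
else
    echo "expect is not installed on this system"
    status=0
fi

rm -f "$script"
```

Where expect is present, the upper-cased greeting appears, produced entirely by base Tcl commands.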

For an ISV, then, Tcl certainly doesn't have the same standing within LSB as Perl and Python; there's no definite specification of Tcl's precise cross-distribution capabilities as there is with Perl and Python. At the same time, given LSB's dependence on Expect, there is a guarantee that a recent Tcl is available on each LSB-compliant system. Although an ISV can't appchk its Tcl-based programs, it still can count on Tcl's standardization with LSB in a way Ruby and PHP don't yet enjoy.

In practical cases, many Tcl applications depend on extensions--the Tk one for GUIs, for example--that LSB doesn't address at all. Vendors and other programming teams will have to go outside LSB to certify availability of these resources. This is frequently the case for real-world applications, though: even those written in C or Perl or Python frequently depend on libraries outside the scope of LSB.

Summary

LSB serves more than just C programmers; it also helps provide a predictable development environment for those who work in C++, sh, Perl, Python, and even Tcl.

Kathryn and Cameron run their own consultancy, Phaseit, Inc., specializing in high-reliability and high-performance applications managed by high-level languages. They write about high-level languages and related topics in their "Regular Expressions" columns.

 
