Week 47: Classifying a Linux Knowledge Base

I was talking to a new freelance writer for LDN this afternoon when the subject came up of how I decide what topics to cover.

This is a valid question for most commercial publications, because writers understand that editors have to focus content on a specific audience. What will pique the readers' interest? CNET News or Jupitermedia will not feature too many stories on Linux, for example, because they believe (rightly or wrongly) that not many of their readers will be interested in a lot of Linux stories. There is a common misconception that "mainstream" media sites will avoid topics at the behest of their advertisers. Typically this is not the case: sheer numbers drive the choice of content, which in turn drives the amount and type of advertising sold.

But I found myself in the happy position of being able to tell this writer that since LDN is not a commercial entity, we are not constrained by such parameters. While we certainly welcome all the traffic we can handle, the idea here is not just to inform readers once and then move on, but to build something more permanent: a knowledge base. Because of that mission, my selection criteria are far simpler: if it's about developing with or for Linux and it's not redundant information, we want it for LDN.

While my editorial selection process is greatly simplified on LDN, it does present me with the daunting task of trying to organize this growing storehouse of information. When I went looking for information on classifying documentation in the Linux space, it soon became apparent that there really hasn't been a big effort to create a global classification system for documentation.

This is a big deal for anyone who wants to provide users with an organized way of finding Linux information. Like me. I surfed around the major distributions' and desktop environments' web sites this week to see how they did it, and it seems everyone uses a different way to organize their knowledge. In the age of Google, this might not seem like such a problem, since a good keyword search can usually dig down to the document a user needs. It is a significant hurdle, however, as more ISVs and defecting Windows developers come into the Linux world and start looking for specs, API documentation, and manuals that will help them figure out how to get started.

Which leaves me in the position of coming up with a classification system for LDN's knowledge base. It will be a system that I hope other sites might adopt, too. A universal classification system would enable developers to quickly find the documents they need. It would also greatly help project owners identify gaps in their project's documentation. If Project A has Specs X, Y, and Z, then the owners of Project B might ask themselves why they are missing Spec Y, then fill that gap.
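The gap-finding idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the `inventory` mapping and the `find_gaps` helper are hypothetical names, and the labels simply mirror the Project A and Project B example above.

```python
# Hypothetical inventory: project name -> set of document labels it provides.
# The labels only work for comparison if every project uses the same ones.
inventory = {
    "Project A": {"Spec X", "Spec Y", "Spec Z"},
    "Project B": {"Spec X", "Spec Z"},
}


def find_gaps(inventory):
    """Return, for each project, the labels that other projects have but it lacks."""
    # Union of every label used by any project.
    all_labels = set().union(*inventory.values())
    return {
        name: sorted(all_labels - labels)
        for name, labels in inventory.items()
    }


gaps = find_gaps(inventory)
print(gaps)  # {'Project A': [], 'Project B': ['Spec Y']}
```

The comparison is only meaningful when projects share a common set of labels, which is the point of having a classification system in the first place.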

For those of you who might be reading this and thinking "oh, no, not another standard," I don't think such a classification system has to be that rigorous. It just has to have a common set of labels that any project can adapt to fit its needs.

What would such a classification system look like? I'm still pondering that question, because this is a big taxonomy that needs to be built. I think at the "kingdom" level there need to be two groups of documents: descriptive documents (specs, etc.) and implementation documents (manuals, etc.). The rest will come as I think about it some more.
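As a rough sketch of that top-level split, here is one way it might look in Python. Everything in it is an assumption made for illustration: the `Kingdom` enum, the `KINGDOM_BY_TYPE` table, and the `classify` helper are hypothetical, and only the "spec" and "manual" entries come from the two groups described above.

```python
from enum import Enum


class Kingdom(Enum):
    """The two proposed top-level groups of documents."""
    DESCRIPTIVE = "descriptive"        # specs, etc.
    IMPLEMENTATION = "implementation"  # manuals, etc.


# Hypothetical lookup table: document type -> kingdom.
# More types would be added as the rest of the taxonomy takes shape.
KINGDOM_BY_TYPE = {
    "spec": Kingdom.DESCRIPTIVE,
    "manual": Kingdom.IMPLEMENTATION,
}


def classify(doc_type):
    """Return the kingdom for a document type, or None if it is not yet classified."""
    return KINGDOM_BY_TYPE.get(doc_type.strip().lower())


print(classify("Spec"))    # Kingdom.DESCRIPTIVE
print(classify("manual"))  # Kingdom.IMPLEMENTATION
print(classify("howto"))   # None
```

Returning `None` for unknown types leaves room for documents that do not fit either group yet, rather than forcing a guess.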

But is this indeed needed? Or should I just free tag everything? Constructive input, as always, is welcome!

Tagged with knowledge base | ldn | General
Information Architecture - Submitted by Jed Cousin on Wed, 11/26/2008 - 03:26.

An official documentation structure would definitely be useful. It might be best to use more of a rolling-release model rather than set major.minor.micro versioning of any taxonomy. Definitely utilize free tagging and bits of short, descriptive metadata and see what type of taxonomic structure develops from these more informal methods.

I'd be willing to wager that any evolved (and evolving) taxonomy would not only help determine the granularity of classification required of a more formal, vetted system, but would also indicate which areas of documentation and organization are lagging behind the rest of the library.

As an aside, MSDN and ADC typically cannot or do not include hardware specifications and technical references alongside their software docs. If LDN can obtain consent from the hardware manufacturers themselves, LDN could even document hardware, from formal specs to errata to practical information from developers in the trenches. Some firms do offer documentation that bridges the hardware-software divide, but it tends to be for a small range of gear running walled-garden platforms. Given the liberty and ubiquity of the Linux ecosystem, the Linux Foundation, as a primary steward of that system, has a unique opportunity to recast a rather typical knowledge base into a leading-edge nexus of information and empowerment.

"Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away." --Antoine de Saint Exupéry

Re: Information Architecture - Submitted by Brian Proffitt on Wed, 11/26/2008 - 15:29.

I think the identification of holes in our knowledge will be one of the most useful things, if not the most useful, to come out of this kind of organization.

And, oh yes, any hardware specs we can get will be right in there, too.

Brian Proffitt
Community Manager
Linux Developer Network

Re: Information Architecture - Submitted by Jed Cousin on Wed, 11/26/2008 - 17:12.

Are there plans to keep any knowledge base strictly in an online format, or to have it derived from (or converted to) an intermediate documentation format (e.g., LaTeX or OpenDocument Text)?

Of late, I've been mucking about with Doxygen as a tool for investigating unfamiliar code bases. It isn't what I would consider ideal, but I've been impressed by Doxy's ability to ruthlessly cross-reference relevant material. I mean, how cool would it be to be able to drill down from libc's stdout into the different hardware specs for textual output devices and then move back up into source or the spec? Or to be able to examine past and present bug reports relevant to the specification, software, and/or hardware a developer is investigating? (Though I understand that the goal of LSB is to provide compliance requirements without necessarily requiring extant source code to be used to obtain LSB compliance.)

Adding graphical navigation and contextual information about relationships between entries would be an even bigger win. Graphviz has its place, but VTK or the interactive InfoVis Toolkit could make it all the easier to bring new developers up to speed.

Copyright © 2008 Linux Foundation. All rights reserved.
LSB is a trademark of the Linux Foundation. Linux is a registered trademark of Linus Torvalds