An Introduction to

Configuration Data Base Management

 

Bob Goeke

Last Revised 22 July 2013

 

 

Overview

 

At the most basic level, it's really easy.  You put all new documents, hopefully in both native and PDF formats, in a directory called

 

      in_basket/

 

and then issue the command

 

            stuffit

 

which files the documents away in the project's data repository.

 

You then edit – in plain ASCII text – a file

 

      parts.idb

 

adding, subtracting, changing the information therein; then execute

 

            report

 

and you're done.  A new web page containing the project's Configuration Data Base has been generated, complete with links to all available documents in the repository.

 

The system is flexible, which results in a fair assortment of options and "features," and depends on a certain consistency in naming conventions; we have been using essentially the same management system since the late 1970s at CSR/MKI with relatively small tweaks over the years.

Naming Conventions

 

Project File Cabinets and Workspace

 

As a matter of convenience we keep the document repository entirely within a web directory structure.  Assume for the moment that we have several projects, one of which is called "CRaTER."  We create the following directories in web space

 

      /projects/crater/

      /projects/crater/file_cabinet/

 

wherein will reside the project's top-level engineering web page (beyond the scope of this memo, but just a page of HTML with a lot of links), the web-displayed Configuration Data Base, and all of the controlled documentation past and present.

 

We currently keep all of our project data base working directories under

 

            /nfs/snebulos/h1/database/

 

For example we have a project directory called "CRaTER."  Then, by convention, we establish a directory

 

      ../CRaTER/in_basket/

 

and the following links

 

      ../CRaTER/data/         -> /nfs/snebulos/h1/www/dbout/

      ../CRaTER/file_cabinet/ -> /nfs/snebulos/h1/www/projects/crater/file_cabinet/

 

All the action occurs in the project directory, and in what follows we assume you are in this work space.  All of the (mostly Perl but some C) programs that do the work are located at

 

            /nfs/acis/a1/database/bin

 

and this location needs to be in your PATH, or at least explicitly referenced in any scripts you use.  Online documentation for these programs is located at

 

            http://snebulos.mit.edu/projects/db-software/

 

Project Name and Document Nomenclature

 

As a remnant of old MIL-STD rules for document numbering, each project within CSR/MKI is given a two-digit number used as a prefix to all controlled document numbers.  A list of all numbers assigned here over the years is on the engineering web site in the file

 

http://snebulos.mit.edu/projects/NUMBERS.txt

 

If the first record in the data base contains the string "Project: ##", that number will be used as the project number in subsequent processing.  The prefix for the CRaTER project, for instance, is "32."

 

The assignment of numbers to drawings and documents proceeds in a hierarchical fashion.  The numbering system follows this form: aa-bbccdd.eeff

Drawing size is indicated by prefixing the drawing size letter to the number. A T indicates a text document. A numeric prefix is also used to give the number of sheets (e.g., 2B). Controlled purchased parts are indicated by a > in this column.

Revisions are shown by a letter in the Rev column. Initial releases are designated as A. Pre-released revisions are indicated by a number starting with 01. In the case of documents owned by other organizations, the revision letter is followed by an m, indicating that this is a "mirrored" copy.

The most salient point for the current discussion is that the document number is the primary index (the revision letter is the secondary index) upon which the management software depends both for sorting the data and linking the data base to the actual documents.

 

For the documents to be correctly filed (see below), they must be named in a manner which directly follows from the rules given above.  Using the template given above the actual data files must be named thus:

                        bbccdd_eeff_rZ.suf

where we have discarded the Project ID aa and replaced the periods with underscores.  Z is the revision letter, and suf is the suffix normally attributed to the document type; viz: PDF.
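
As an illustration only, the renaming rule can be written out in a few lines of Python.  This is a sketch and not part of the db-software tools: the function name controlled_filename is hypothetical, each letter in the template is assumed to stand for one decimal digit, and the document number 32-010203.0401 is made up for the example.

```python
import re

# Assumed shape of a document number: aa-bbccdd.eeff, one digit per letter.
DOC_NUMBER = re.compile(r"^(\d{2})-(\d{6})\.(\d{4})$")


def controlled_filename(doc_number, revision, suffix):
    """Build bbccdd_eeff_rZ.suf from aa-bbccdd.eeff, a revision, and a suffix."""
    match = DOC_NUMBER.match(doc_number)
    if match is None:
        raise ValueError("not of the form aa-bbccdd.eeff: %r" % doc_number)
    _project, body, tail = match.groups()  # the project prefix aa is discarded
    return "%s_%s_r%s.%s" % (body, tail, revision, suffix)


print(controlled_filename("32-010203.0401", "A", "pdf"))
```

The revision is passed through unchanged, so a pre-release number such as 01 is handled the same way as a release letter.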

Filing Documents and Drawings

Handling the Source Documentation

 

The configuration manager receives, by whatever means, new documents (or revisions to old ones; they are treated the same) that are placed in the directory

 

      in_basket/

 

and filed by the script

 

            stuffit

 

which is a simple shell script calling

 

            dbputaway

 

The underlying Perl program goes through a number of steps:

·      Checks that the original is a known document type

·      Checks that the name of the document, reflecting its number in the data base, is well formed (as described above)

·      Checks for the presence of a PDF version of the document (this is what the primary number in the data base is linked to)

·      Checks for the existence of the correct directory structure within the file cabinet to file the document; if any directory in the tree is missing, it will be created

·      Checks to see that a file of the same number and revision does not already exist

·      Creates, if necessary, a directory under the main project web page, which will contain all of the related documents

·      Writes the file non-destructively into the data base and sets permissions to read-only

If a document does not have a matching PDF version, or a document of the same name already exists in the file cabinet, the program will fail.  One can invoke a "-f" flag to force execution by skipping those checks.  (In the case of a pre-existing file, it will be removed with an "rm -f" prior to the new file being written.)
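
The checks that can stop a filing run might be sketched as follows.  This is not the dbputaway source: the function name filing_problems is hypothetical, the name pattern assumes one digit per letter of the template, and the file cabinet is treated as a single flat directory, whereas the real one is a tree whose layout is not described here.

```python
import re
import tempfile
from pathlib import Path

# Assumed well-formed stem: bbccdd_eeff_rZ with digits and an alphanumeric revision.
WELL_FORMED = re.compile(r"^\d{6}_\d{4}_r[0-9A-Za-z]+$")


def filing_problems(in_basket, cabinet, force=False):
    """Return the reasons filing would fail; an empty list means all clear."""
    problems = []
    entries = sorted(p for p in in_basket.iterdir() if p.is_file())
    for doc in entries:
        if not WELL_FORMED.match(doc.stem):
            problems.append(doc.name + ": name is not well formed")
            continue
        if force:
            continue  # the force flag skips only the two checks below
        has_pdf = any(p.stem == doc.stem and p.suffix.lower() == ".pdf"
                      for p in entries)
        if not has_pdf:
            problems.append(doc.name + ": no matching PDF")
        if (cabinet / doc.name).exists():
            problems.append(doc.name + ": already in the file cabinet")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    basket = Path(tmp, "in_basket")
    cabinet = Path(tmp, "file_cabinet")
    basket.mkdir()
    cabinet.mkdir()
    (basket / "010203_0401_rA.doc").write_text("native only")
    (basket / "010203_0402_rA.doc").write_text("native")
    (basket / "010203_0402_rA.pdf").write_text("pdf")
    (basket / "notes.txt").write_text("stray file")
    demo_problems = filing_problems(basket, cabinet)
    demo_forced = filing_problems(basket, cabinet, force=True)
    print(demo_problems)
    print(demo_forced)
```

In this sketch a badly formed name is reported even when forcing, since the text above says the flag skips only the PDF and pre-existing-file checks.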

 

Updating the Project Data Base

 

Changes to a project data base are made with an ASCII text editor modifying the file

 

      parts.idb

 

In this file each record is composed of a series of fields in the form {name value} that may appear in any order and on the same or multiple lines.  Records are separated from each other by one or more blank lines.  As with fields, the records may occur in any order.  There is no provision for comment lines within this file.

 

The "name" of a field is a series of non-whitespace characters, the first of which is, by definition, a separator character; it must be unique and the same for all field names within the file.  Whitespace terminates the "name."  The "value," whose existence is optional, is composed of all text following the name and ends either at the start of the next "name," a blank line (which would mark the end of the current record), or the end of the file.  Note that the separator character cannot be escaped and it cannot appear anywhere in the file other than at the beginning of a field name.
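
To make these rules concrete, here is a minimal reader for text in this format.  It is a sketch, not the parser used by dbnormal: the function name parse_idb is hypothetical, the separator is assumed to be the first non-blank character of the text, and the separator "%" and the field names number, rev and title in the sample are invented for the example.

```python
import re


def parse_idb(text):
    """Split text into records, each a list of (name, value) pairs."""
    stripped = text.strip()
    if not stripped:
        return []
    sep = stripped[0]  # assumption: the first character is the separator
    records = []
    # One or more blank lines separate records.
    for block in re.split(r"\n\s*\n", stripped):
        fields = []
        # Every separator starts a new field name; it cannot be escaped.
        for chunk in block.split(sep)[1:]:
            if not chunk or chunk[0].isspace():
                continue  # a separator with no name attached
            parts = chunk.split(None, 1)  # whitespace terminates the name
            value = parts[1].strip() if len(parts) > 1 else ""
            fields.append((parts[0], value))
        if fields:
            records.append(fields)
    return records


sample = (
    "%number 32-010203.0401 %rev A\n"
    "%title Top assembly\n"
    "\n\n"
    "%number 32-010203.0402\n"
    "%rev 01 %title\n"
)
parsed = parse_idb(sample)
for record in parsed:
    print(record)
```

Names are stored here without the leading separator, and a field with no value comes back as an empty string.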

 

The first record in parts.idb must contain a comma-separated string of all valid field names (less the separation character):

            Field_names: name1,name2,...,nameN

This is used purely as a typo-check in subsequent processing.

 

Having edited the parts.idb file, we now invoke a shell script

 

            report

 

which first tests whether the parts.idb file has a modification date more recent than that of the parts.cdb file.  If so, the program

 

            dbnormal

 

is invoked to analyze the "input-to-the-data-base" parts.idb file for syntax errors according to the rules specified in

 

      parts.sdb
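
The modification-date test that gates this step is the familiar make-style comparison.  A sketch in Python, with the hypothetical function name needs_rebuild and the added assumption that a missing target also counts as needing a rebuild:

```python
import os
import tempfile


def needs_rebuild(source, target):
    """True when source was modified more recently than target."""
    if not os.path.exists(target):
        return True  # assumption: nothing built yet means rebuild
    return os.path.getmtime(source) > os.path.getmtime(target)


with tempfile.TemporaryDirectory() as tmp:
    idb = os.path.join(tmp, "parts.idb")
    cdb = os.path.join(tmp, "parts.cdb")
    for path in (idb, cdb):
        with open(path, "w") as handle:
            handle.write("")
    os.utime(cdb, (1000, 1000))
    os.utime(idb, (2000, 2000))  # parts.idb edited after parts.cdb was written
    edited = needs_rebuild(idb, cdb)
    os.utime(idb, (500, 500))    # parts.idb older than parts.cdb
    unchanged = needs_rebuild(idb, cdb)
    print(edited, unchanged)
```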

 

Upon success the records are sorted, using as keys the contents of the first two fields in each record, and the "compressed-data-base"

 

      parts.cdb

 

is written with each record contained on a single line and excess whitespace removed.  (This file is the master data base source and only the Configuration Manager has write access to it.) Finally, the file

 

      parts.idb

 

is rebuilt according to the template contained in the "formatted-data-base" file

 

      parts.fdb
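
The sort-and-compress step can be pictured with the following sketch.  It does not reproduce the actual parts.cdb layout, which is not specified here: the function name to_cdb_lines, the "%" separator, and the sample field names and values are all assumptions made for illustration.

```python
def to_cdb_lines(records, sep="%"):
    """Sort records on their first two field values; emit one line per record."""
    ordered = sorted(records,
                     key=lambda rec: tuple(value for _name, value in rec[:2]))
    lines = []
    for rec in ordered:
        parts = []
        for name, value in rec:
            text = " ".join(value.split())  # collapse excess whitespace
            parts.append(sep + name + (" " + text if text else ""))
        lines.append(" ".join(parts))
    return lines


records = [
    [("number", "32-010203.0402"), ("rev", "A"), ("title", "Bracket,\n   left")],
    [("number", "32-010203.0401"), ("rev", "B"), ("title", "Top assembly")],
    [("number", "32-010203.0401"), ("rev", "A"), ("title", "Top   assembly")],
]
cdb_lines = to_cdb_lines(records)
for line in cdb_lines:
    print(line)
```

Because the key is simply whatever appears in the first two fields of each record, a record whose fields have been reordered will sort on the wrong values.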

 

For logging and security purposes, the parts.cdb file is entered into an RCS file each time it is (re)generated.   The revision number is forced to be of the form YY.MMDD – which implies, of course, that we only maintain the latest revision processed on a given date; but that seems adequate.
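
The date-to-revision mapping is simple enough to state as code.  A sketch, with the hypothetical function name rcs_revision; how the string is handed to RCS is not shown:

```python
from datetime import date


def rcs_revision(day):
    """Format a date as YY.MMDD."""
    return day.strftime("%y.%m%d")


print(rcs_revision(date(2013, 7, 22)))  # prints 13.0722
```

Two runs on the same day yield the same string, which is why only the latest revision processed on a given date survives.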

 

The building of an arbitrary output file from the data contained in a *.cdb file is handled by

 

            dbprint

 

which applies a template given in an *.fdb file to each record provided on its STDIN.  (Invoke dbprint with a "-h" flag to get an explanation of the full syntax.)  The template contains arbitrary text plus escaped field names which are replaced on output with the data base values associated with those field names.  There are two important considerations to note when rebuilding parts.idb:

 

·      The first character in the *.cdb file is, by definition, the separator character used to distinguish a field name from any other text; therefore the first character in parts.fdb must be this separator character.

·      The first two field names occurring in the *.cdb file are used to sort the contents; therefore the first two field names present in parts.fdb must be chosen appropriately.

 

There is a bug (in the Unix sense) here.  Because the sort preceding the generation of the new *.cdb file happens before the *.idb file is rebuilt, editing an *.idb file in such a way as to change the order of the first two fields in any record will cause the sort to do interesting, and usually undesirable, things.  "Touch"ing the *.idb file and rerunning report, however, will make things right.

 

Publishing a Project Data Base

 

The Project Data Base is normally published as an HTML document on the web using

 

            dbreport

 

The complexities with which dbreport deals don't really have anything to do with the HTML formatting – that could have been handled using an appropriate format file together with dbprint – but, rather, with adding useful links to the display.

 

As each record is added to the document, a search is made in web space to see if the record exists – see stuffit above.  If so, a link is added to the field value of the secondary index.  If a displayable form of the record (PDF and TXT as default; file parts.pdb as an alternate source) exists, a link is added to the field value of the primary index.  Tab spacing in the "plain" HTML mode is controlled by the contents of parts.tdb, which is passed as a string of command line arguments to tabset, which is used in the output pipe.

 

In parallel with generating an HTML display of the data base contents, a tab-separated-value file is generated as well.  This is suitable for importing into spreadsheet programs.
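
For readers who want a comparable export from their own scripts, a tab-separated file is easy to produce with Python's csv module.  This is a sketch and not the dbreport output format: the function name records_to_tsv, the header row, and the sample field names and values are assumptions.

```python
import csv
import io


def records_to_tsv(records, field_names):
    """Render records (dicts) as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(field_names)
    for rec in records:
        # A field missing from a record becomes an empty cell.
        writer.writerow([rec.get(name, "") for name in field_names])
    return buffer.getvalue()


tsv_text = records_to_tsv(
    [{"number": "32-010203.0401", "rev": "A", "title": "Top assembly"},
     {"number": "32-010203.0402", "rev": "01"}],
    ["number", "rev", "title"],
)
print(tsv_text, end="")
```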

 

In many small ways, dbreport is optimized for dealing with documents having hierarchical primary numbers and revision letters.  Other applications will probably do better by writing their own scripts that invoke dbprint in appropriate inventive ways.