Output

This section describes the output location, the filename format, and the format of the records in the output files.

Record Format

Each output file consists of one or more records, where each record contains a subject, a relation, a column, a flag, a date, an optional time, and a value. The subject can be configured to be either the relation's description or a concatenation of the relation and column names. The flag describes the type of change that was applied to the database: update, insert, delete, or create. A detailed description of each flag is given below. The time field is left blank if the modified data belongs to a daily time series. Records can be output either as a pipe-delimited file or as XML.

Pipe Delimited Sample Output

The following is a sample pipe-delimited record:

TopRelation:Equities:IBM_TopColumn:Price:Close|IBM|Close|create|20070707||1.300000
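As a rough illustration of how a consumer might read such a record, the sketch below splits one line into the seven fields listed under Record Format. The Record type and the parse_record function are hypothetical names introduced here, and the sketch assumes that field values never contain a literal pipe character, which this section does not state.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    """One pipe-delimited record (hypothetical container type)."""
    subject: str
    relation: str
    column: str
    flag: str
    date: str
    time: Optional[str]  # None when the time field is blank
    value: str


def parse_record(line: str) -> Record:
    # Assumption: fields never contain a literal "|".
    fields = line.rstrip("\r\n").split("|")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    subject, relation, column, flag, date, time, value = fields
    # A blank time field is mapped to None.
    return Record(subject, relation, column, flag, date, time or None, value)


sample = ("TopRelation:Equities:IBM_TopColumn:Price:Close"
          "|IBM|Close|create|20070707||1.300000")
record = parse_record(sample)
```

The value is kept as a string so that the consumer can decide how to handle numeric conversion.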

XML Sample Output

The following shows the XML sample output:

<?xml version="1.0"?>
<answerset>
  <mimdata>
    <description>INTC</description>
    <relation>INTC</relation>
    <column>Close</column>
    <mimrow action="update">
      <date>20070810</date>
      <value>106.0000 104.5000</value>
    </mimrow>
  </mimdata>
</answerset>
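The XML form can be read with any standard XML parser. The following is a minimal sketch using Python's xml.etree.ElementTree. It assumes that the complete document closes the answerset element and that element names are exactly those in the sample; the iter_rows function is a hypothetical helper, and the value text is returned as-is because this section does not define its internal layout.

```python
import xml.etree.ElementTree as ET

# The sample from this section, with the answerset element closed (assumption).
xml_text = (
    '<?xml version="1.0"?>'
    "<answerset><mimdata>"
    "<description>INTC</description><relation>INTC</relation>"
    "<column>Close</column>"
    '<mimrow action="update"><date>20070810</date>'
    "<value>106.0000 104.5000</value></mimrow>"
    "</mimdata></answerset>"
)


def iter_rows(text):
    """Yield one dict per mimrow element (hypothetical helper)."""
    root = ET.fromstring(text)
    for mimdata in root.iter("mimdata"):
        relation = mimdata.findtext("relation")
        column = mimdata.findtext("column")
        for row in mimdata.iter("mimrow"):
            yield {
                "relation": relation,
                "column": column,
                "action": row.get("action"),    # the change flag
                "date": row.findtext("date"),
                "value": row.findtext("value"),  # raw text, not interpreted
            }


rows = list(iter_rows(xml_text))
```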

Update, Insert, Delete and Create

Update, insert, delete, and create are defined as:

  • insert - This flag indicates that this data point is a new value in an already existing time series.

  • update - This flag indicates that this data point is replacing a previously existing data point in the time series.

  • delete - This flag indicates that a NaN is replacing the previous non-NaN data point, thereby deleting it.

  • create - This flag indicates that this is a new data point for a new time series.

    The difference between an insert and a create is that an insert is a new data point for an already existing series, whereas create is the first data point for a new time series.
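One way to read the four definitions above is as a decision on whether the series already existed and what the previous value was. The sketch below encodes that reading; it is not the publisher's actual logic. The classify_change function and its parameters are hypothetical, and treating a previous NaN as "no previous data point" is an assumption that this section does not confirm.

```python
import math
from typing import Optional


def classify_change(series_exists: bool,
                    old: Optional[float],
                    new: float) -> str:
    """Return the flag for a change (hypothetical helper).

    old is None when the series had no data point at this date.
    """
    if not series_exists:
        # First data point of a new time series.
        return "create"
    if old is None or math.isnan(old):
        # Assumption: a previous NaN counts as "no previous data point".
        return "insert"
    if math.isnan(new):
        # A NaN replaces a previous non-NaN data point.
        return "delete"
    # A value replaces a previous non-NaN data point.
    return "update"
```

For example, classify_change(True, 106.0, 104.5) yields "update", while the same call with series_exists set to False yields "create".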

Spool Directory

The spool directory is the location where the publisher writes the output files. The spool directory must be on the same file system as $LIMHOME/pub2/tmp. (Incomplete files are written to the tmp directory and are moved atomically from tmp to spool only once the file is complete. The move operation is atomic only if both directories are on the same file system.) The base spool directory contains one subdirectory for every subscriber. Each subscriber directory contains a full, a high, and a low directory, and the output files reside within these three directories. The meaning of the full, high, and low directories is described below.

Example layout:

spool/
spool/spock -- subscriber
spool/spock/full -- all data is published to the full directory.
spool/spock/high -- data on the high bandwidth channel is published here.
spool/spock/low -- data on the low bandwidth channel is published here.
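The same-file-system requirement follows from the usual write-then-rename pattern: a rename within one file system is atomic on POSIX systems, so a reader of the spool directory never sees a partially written file. The sketch below illustrates the pattern in general terms and is not the publisher's own code. The publish function is a hypothetical name, and temporary directories stand in for $LIMHOME/pub2/tmp and the spool directory.

```python
import os
import tempfile


def publish(tmp_dir: str, spool_dir: str, filename: str, payload: str) -> str:
    """Write a file in tmp_dir, then rename it into spool_dir (hypothetical)."""
    tmp_path = os.path.join(tmp_dir, filename)
    final_path = os.path.join(spool_dir, filename)
    with open(tmp_path, "w") as handle:
        handle.write(payload)
        handle.flush()
        os.fsync(handle.fileno())  # make sure the data is on disk first
    # os.replace renames the file; across file systems it typically
    # fails with OSError instead of moving the file.
    os.replace(tmp_path, final_path)
    return final_path


# Stand-in directories for the demonstration (assumption).
base = tempfile.mkdtemp()
tmp_dir = os.path.join(base, "tmp")
spool_dir = os.path.join(base, "spool", "spock", "full")
os.makedirs(tmp_dir)
os.makedirs(spool_dir)

published = publish(tmp_dir, spool_dir, "test_20070901_113035.dat",
                    "subject|IBM|Close|create|20070707||1.300000\n")
```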

Full, High, Low Directories

The full directory contains files that are published on the full band. This directory will contain every delta for the subscribed data. The high and low directories contain files that are published on the high and low bands, respectively.

See the section called “Throttles and Date Thresholds” to configure the thresholds for the full, high and low directories.

Filename Format

As explained in the configuration section, a given subscriber may have more than one cfg file. Each cfg file may subscribe to different sets of data. The output filename helps determine the corresponding cfg file and the time it was published. It has the following format:

<cfg filename prefix>_<date>_<time>.dat

For example, a subscriber named spock with a cfg file called test.cfg would have an output file in $LIMHOME/pub2/spool/spock/full called test_20070901_113035.dat, where 20070901_113035 is the date and time at which the publisher wrote the file.
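A consumer can recover the cfg prefix and the publication time from the filename. The sketch below assumes that the date is YYYYMMDD and the time is HHMMSS, which is inferred from the example above rather than stated explicitly; parse_output_filename is a hypothetical helper name, and the time zone of the timestamp is not specified in this section.

```python
import re
from datetime import datetime

# <cfg filename prefix>_<date>_<time>.dat, with assumed digit widths.
FILENAME_RE = re.compile(r"^(?P<prefix>.+)_(?P<date>\d{8})_(?P<time>\d{6})\.dat$")


def parse_output_filename(name: str):
    """Return (cfg prefix, publication datetime) for an output filename."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected output filename: {name!r}")
    stamp = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return match["prefix"], stamp


prefix, published_at = parse_output_filename("test_20070901_113035.dat")
```

Because the prefix pattern is greedy, a cfg prefix that itself contains underscores is still matched correctly as long as the name ends with the date and time fields.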