how to implement a lire dlf and report archive

Joost van Baal joostvb at logreport.org
Sun May 27 18:22:28 CEST 2001


Hi,

I'm thinking about how to implement a lire dlf and report archive.  (One
might even call it a data warehouse, for extra buzzword bingo fun.)

The current ideas, after some discussion between me and Egon, are in
the TODO file.  (You can see the latest one on 
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/logreport/service/doc/TODO?rev=1.146&content-type=text/plain ).

The ideas are still somewhat unpolished.  If you have thoughts on them,
please share them.  I'm planning to start implementing them this coming
week.  The current blurb in the TODO file about the archive is:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

The archive should store files in .xml and .dlf format.  It should reside
somewhere under /var/lib/lire/data.  (The current lire .deb creates
/var/lib/lire .)

The variable KEEP is still used to decide whether tmpfiles are kept.  The
yet-to-be-introduced variable ARCHIVE indicates whether files should get
archived.  If set, files are moved from TMPDIR to the archive.  So, two kinds
of files are under consideration: those which are candidates for archiving
(depending on ARCHIVE or KEEP) and those which will never get stored in the
archive (kept in TMPDIR depending on KEEP).

candidate       KEEP    ARCHIVE   file is
for archive                       kept in

   yes           set     set      archive
   yes           set    unset     archive
   yes          unset    set      archive
   yes          unset   unset    /dev/null
   no            set     set       TMPDIR
   no            set    unset      TMPDIR
   no           unset    set      /dev/null
   no           unset   unset     /dev/null
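
A minimal sketch of the table above as a decision function, in Python purely
for illustration (nothing in the TODO prescribes a language).  The function
name and the boolean arguments are hypothetical; "set" is modelled as True
and "unset" as False.

```python
def destination(is_candidate, keep, archive):
    """Return where a file ends up: 'archive', 'TMPDIR' or '/dev/null'."""
    if is_candidate:
        # Candidates go to the archive when either KEEP or ARCHIVE is set.
        return "archive" if (keep or archive) else "/dev/null"
    # Non-candidates never reach the archive; only KEEP decides their fate.
    return "TMPDIR" if keep else "/dev/null"
```

Read this way, ARCHIVE only matters for candidate files, and a candidate is
dropped only when both variables are unset.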


Per kept file, we wanna be able to find out:

   - filename
   - service
   - superservice
   - timerange
   - subject/hostname/fromaddress (maybe even complete mailheaders
       of email message which contained the logfile)
   - some external id (e.g. hostname, to be able to merge different reports
       which report on the same thing)
   - format (xml, log, report, or maybe even something else)

We use an 'LR_ID' to identify a job for the lire system, i.e. a received
email message or local logfile.

We use a 'REPORT_ID' to identify a report.  One logfile could get split in
parts about e.g. different days.  For each day, a separate report could get
generated.  Other ways to split are possible (e.g. for logfiles which carry
lines about different hosts or even services.)

Perhaps it's wise to include an LR_ID in the generated report.

We could store meta information in an index file (e.g.
/var/lib/lire/data/meta/index), which could look like:

 LR_ID-9871614364-1456 subject gelfand test
 LR_ID-9871614364-1456 service email
 LR_ID-9871614364-1456 time 2001050427

 REPORT_ID-987161443426-234 time 20010527-20010528
 REPORT_ID-98716144999-234 time 200105270104-200105282359
 REPORT_ID-98716144988-261 time 200105

That is: idtag space key space value-with-possibly-embedded-spaces .
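
Such an index could be read with very little code.  The following is only a
sketch, assuming one record per line and blank lines as separators; the
function name parse_index and the nested-dict result are hypothetical
choices, not part of the proposal.

```python
from collections import defaultdict


def parse_index(lines):
    """Map each idtag to a dict of its key/value pairs."""
    index = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines only separate groups of records
        # Split at most twice so the value keeps its embedded spaces.
        parts = line.split(None, 2)
        if len(parts) != 3:
            raise ValueError("malformed index line: %r" % line)
        idtag, key, value = parts
        index[idtag][key] = value
    return dict(index)
```

For the sample lines above, parse_index would give
index["LR_ID-9871614364-1456"]["subject"] == "gelfand test".  Note that a
dict keeps only one value per key, so this sketch assumes a key occurs at
most once per idtag.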

Perhaps we should think of some relational database model, and implement it
accordingly.

Time ranges should be in UTC, in an "almost human-readable" format:
yyyymm[dd[hh[mm[ss]]]][-yyyymm[dd[hh[mm[ss]]]]]
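
To make the format concrete, here is a sketch of a parser for it.  The names
parse_timerange and split_stamp are hypothetical, and returning tuples of
integers (rather than date objects) is an assumption made so that partial
stamps like "200105" keep their precision.  It checks only the shape of the
string, not whether e.g. the hour is below 24.

```python
import re

# yyyymm followed by up to four optional two-digit fields (dd, hh, mm, ss).
_STAMP = r"\d{6}(?:\d{2}){0,4}"
_RANGE = re.compile(r"(%s)(?:-(%s))?" % (_STAMP, _STAMP))


def split_stamp(stamp):
    """Turn 'yyyymm[dd[hh[mm[ss]]]]' into a tuple of ints."""
    fields = [stamp[:4]] + [stamp[i:i + 2] for i in range(4, len(stamp), 2)]
    return tuple(int(f) for f in fields)


def parse_timerange(text):
    """Return (start, end); end is None when only one stamp is given."""
    match = _RANGE.fullmatch(text)
    if match is None:
        raise ValueError("not a valid time range: %r" % text)
    start, end = match.group(1), match.group(2)
    return split_stamp(start), (split_stamp(end) if end else None)
```

With this, "20010527-20010528" would parse to ((2001, 5, 27), (2001, 5, 28))
and "200105" to ((2001, 5), None).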

The directory layout could be:

                                          subservice (sub)reporttype
 /var/lib/lire/data/report/xml/email/postfix/all/complete/extid/20010527-20010528
 /var/lib/lire/data/report/html/
 /var/lib/lire/data/report/ascii/
 /var/lib/lire/data/email/raw/
 /var/lib/lire/data/email/plain/
 /var/lib/lire/data/log/dlf/www/apache/common/viewtype/extid/200105
                                              ^^^^^^^^

Where should different 'views' go?  And filtered logs?  E.g., currently we
have 'filter' and 'filter_messages' for email.  These are filters from dlf to
dlf.
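
As an illustration of the report part of the layout, a path could be
assembled as sketched below.  The function name and the argument names are
hypothetical; in particular, reading the components as
format/superservice/service/subservice/reporttype/extid/timerange is only
one interpretation of the example path, and the open question about views
is not addressed here.

```python
import posixpath

ARCHIVE_ROOT = "/var/lib/lire/data"


def report_path(fmt, superservice, service, subservice,
                reporttype, extid, timerange):
    """Build the archive path of a report from its components."""
    return posixpath.join(ARCHIVE_ROOT, "report", fmt, superservice,
                          service, subservice, reporttype, extid, timerange)
```

Called as report_path("xml", "email", "postfix", "all", "complete", "extid",
"20010527-20010528"), this reproduces the first example path above.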

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Bye,

Joost

-- 
Joost van Baal              . .           http://www.logreport.org/
                           .   .
/^LogReport$/               . .               joostvb at logreport.org

