Input log written to $TMPDIR

Raymond Page pagerc at ufl.edu
Wed Dec 29 08:48:12 CET 2004


Wytze,

That discussion is really helpful.  Thanks a lot for sending it to
me.  Perhaps this could be posted on the website as a FAQ entry or
something?

--
Raymond Page

On Wed Dec 29 02:39:32 EST 2004, Wytze van der Raay 
<wytze at NLnet.nl> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Joost van Baal wrote:
> | Hi Raymond,
> |
> | On Tue, Dec 28, 2004 at 12:08:25AM +0100, Raymond Page wrote:
> |
> |>I'm curious why the log that I provide as input to lr_log2report
> |>is written to $TMPDIR.  If the log file is being recorded to a
> |>dlf, can't it just process the log in place and write only to the
> |>temporary dlf store?
> |>
> |>I ask because I'm processing gigabyte sendmail logs.  Having a 1GB
> |>log file written decompressed to tempspace and then written into
> |>the SQLite database seems like somewhere there's extra work being
> |>done which is using a lot of disk space.
> |
> |
> | It seems copying the raw logfile to $TMPDIR/logfile is done in
> | &Lire::LrCommand::handle_logfile .  This was introduced 2004-08-30,
> | after Lire 1.5, with the reimplementation in Perl of lots of small
> | stand-alone utilities (lr_check_prereq, lr_dlf2xml, lr_inflate,
> | lr_log2xml, lr_store, lr_xml2ascii, lr_xml2chart, ...).
> |
> | &Lire::LrCommand::import_log calls Lire::ImportJob, which is passed a
> | file ( pattern => $self->{'_logfile'} ); see Lire::ImportJob(3pm).
> |
> | Hrm...
> |
> | Could you try to give your uncompressed logfile as an argument to
> | lr_log2report, and not pass it via STDIN?  I believe passing just the
> | filename makes Lire skip the copy-to-tmpfile step.
> 
> This reminds me of a discussion with Francis Lacoste, the main developer
> for this code, a while ago in some other context. Here is what Francis
> noted at that time (September 2, 2004):
> 
> |> The amount of /tmp space used by this job was about 2.2 GB (the sqlite
> |> stuff I presume).
> |
> | Yes, it could be related to SQLite, which will use temporary space to
> | hold the query results. But be aware also that in Lire 1.5, the log
> | file is replicated several times.
> |
> | 1- Since lr_log2report reads its log file from stdin, it saves it in a
> |    temporary file. That is 350M, which stays there until the end of the
> |    run.
> |
> | 2- While it is imported into the temporary DlfStore, the DLF data is
> |    written to a temporary file which is read back into the DlfStore.
> |    At this stage, we are at 3x350M. At the end of that phase, the
> |    temporary DLF file will be removed.
> |
> | 3- The email superservice has one analyser. In Lire 1.5, the extended
> |    schema table stores the extended field plus all the fields of the
> |    original table. So we are again doubling the required space.
> |
> | 4- It means that SQLite probably accounts for half or less of the
> |    temporary space.
> |
> | How Lire 2.0 improves things:
> |
> | 1) lr_log2report can now take the log filename as a parameter, so there
> |    is no need to copy it into a temporary file (unless it is compressed).
> |
> | 2) Converting the old 2dlf script to the new Perl module DlfConverter
> |    API would eliminate the temporary DLF file used by the adapter. This
> |    is not part of 2.0.
> |
> | 3) The extended data table now only contains the extended data, so
> |    there is no doubling of the data for each extended schema (this will
> |    mainly benefit the space requirements of generating www records). A
> |    SQL join is now used when necessary.
> 
> Using a non-compressed logfile on the command line (rather than through
> stdin) is apparently the way to minimize the amount of $TMPDIR space
> used by Lire 2.0.
> 
> Regards,
> - -- wytze
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.5 (MingW32)
> Comment: Using GnuPG with Thunderbird - 
> http://enigmail.mozdev.org
> 
> iD8DBQFB0l80qs+zhiEbbu8RApfeAKDqj2PIm0P9GR7NDs4aA6JCADjZ3ACcDGcA
> nawocDrDJ0iGpIDCvsl+kHs=
> =EjkF
> -----END PGP SIGNATURE-----
> 
> 
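For a rough feel of the numbers, the arithmetic in Francis's steps 1 and 2
can be sketched as below. This is not Lire code: the function names are
made up for illustration, and it assumes each temporary copy is about as
large as the uncompressed log (as in the 3x350M figure) and that the
converter in use still writes a temporary DLF file. It ignores the
extended schema tables and SQLite's own query scratch space.

```python
def import_phase_copies(reads_stdin: bool, compressed: bool) -> int:
    """Count log-sized files in $TMPDIR while the log is being imported.

    Assumption: every copy is roughly the size of the uncompressed log.
    """
    # A raw copy of the log is made for stdin input or a compressed file.
    raw_copy = 1 if (reads_stdin or compressed) else 0
    # Temporary DLF file written by the old 2dlf adapter, plus the
    # temporary DlfStore that the DLF data is read back into.
    return raw_copy + 2


def import_phase_mb(log_mb: float, reads_stdin: bool, compressed: bool) -> float:
    """Estimated $TMPDIR usage (in MB) during the import phase."""
    return log_mb * import_phase_copies(reads_stdin, compressed)


# The 350M log read from stdin, as in the quoted discussion.
print(import_phase_mb(350, reads_stdin=True, compressed=False))    # 1050
# A 1GB uncompressed log passed by filename instead.
print(import_phase_mb(1024, reads_stdin=False, compressed=False))  # 2048
```

Under these assumptions, passing the uncompressed file by name saves one
full copy of the log during the import.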



