Help understanding some Lire conceptual issues
Brad Knowles
blk at skynet.be
Thu Mar 4 15:41:36 CET 2004
At 7:35 AM -0600 2004/03/04, Jim Lancaster wrote:
>> E-mail works as a testing mechanism, but not much else. It
>> absolutely does not scale...
>
> Point taken, but in the documentation I see no other transmission
> mechanism. Did I overlook something?
You can transmit most logs directly via the network through
"syslog". The initial implementation was UDP-based, which can lose a
hell of a lot of data on a busy network. However, modern
implementations of syslog (e.g., syslog-ng, nsyslog, ssyslog, etc...)
can run over TCP, which retransmits dropped packets rather than
silently losing them.
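As a rough illustration of the TCP approach, here is a minimal Python
sketch that formats classic BSD-style syslog lines and writes them to
a connected stream socket. It is not taken from any of the syslog
implementations named above. The function names, the "fw1" tag, and
the assumption that the receiver accepts newline-delimited messages
on a plain TCP port are all illustrative; check what your collector
actually expects.

```python
import socket


def syslog_pri(facility: int, severity: int) -> int:
    """Classic syslog priority value: facility * 8 + severity."""
    return facility * 8 + severity


def format_line(facility: int, severity: int, tag: str, message: str) -> bytes:
    """Build one newline-terminated syslog-style line.

    Newline framing is an assumption about the receiver, not a given.
    """
    return f"<{syslog_pri(facility, severity)}>{tag}: {message}\n".encode("utf-8")


def send_lines(sock: socket.socket, lines) -> None:
    """Write every line to an already-connected stream socket."""
    for line in lines:
        # sendall() keeps writing until the whole buffer is handed to
        # the kernel, or raises an exception on failure.
        sock.sendall(line)


def send_to_collector(host: str, port: int, lines, timeout: float = 5.0) -> None:
    """Open a TCP connection to a (hypothetical) collector and send lines."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        send_lines(conn, lines)
```

A call might look like send_to_collector("loghost.example.org", 514,
[format_line(16, 6, "fw1", "accepted tcp")]), where the host name and
port are placeholders. Facility 16 is local0 and severity 6 is info.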
Failing that, you can use network-accessible filesystems such as
NFS, AFS/Coda, SMB/Samba, etc....
If that doesn't work, then you can move the files via the network
with tools like ftp, rcp, scp, etc....
Pretty much all of these methods have variants that will work
across a wide variety of OSes, including virtually all *nix
implementations but also including non Unix-like OSes such as those
from Microsoft, MacOS 9 and earlier, etc....
> My clients' firewall logs accumulate at a rate of about 10MB per day,
> per firewall. But they are, by far, the biggest logs we generally deal
> with. I would take it for granted that logs of the size you are talking
> about would need specialized software to process them. What do you use
> to process a log that accumulates at the rate of 1GB/hr? Surely that
> tool would be overkill for processing tape backup logs, for example.
Well, to get really serious you have to use data mining tools.
I've seen tools that could handle log data input on the order of tens
or hundreds of megabytes per second (that's the volume of the log
records themselves, not of the traffic being logged), but then they
spread that across multiple back-end database servers, etc....
In our case, we were looking at processing web proxy logs with
tools like "analog", but we had limited disk space available to us at
the time (even on our most powerful/largest disk space servers), and
it was taking longer than an hour to process 1GB worth of logs, so we
would not have been able to keep up with the data being generated by
our web proxy servers.
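The "can't keep up" arithmetic can be sketched in a few lines of
Python. The 1 GB/hour ingest rate is from the proxy described here;
the 0.8 GB/hour processing rate below is purely hypothetical, since
all that is known is that 1GB took longer than an hour to process.

```python
def backlog_after(hours: float, ingest_gb_per_hour: float,
                  process_gb_per_hour: float) -> float:
    """Unprocessed log volume in GB after `hours`, starting from empty.

    If processing is at least as fast as ingest, the backlog stays at zero.
    """
    deficit = ingest_gb_per_hour - process_gb_per_hour
    return max(0.0, deficit * hours)


# Hypothetical rates: logs arrive at 1.0 GB/hour, the analyzer manages 0.8.
one_day = backlog_after(24, 1.0, 0.8)
print(f"backlog after one day: about {one_day:.1f} GB")
```

With any processing rate below the ingest rate the backlog grows
without bound, so the disk fills up no matter how large it is.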
BTW, that 1GB/hour rate was from just one old and slow web proxy
server, and later we upgraded to a set of four much faster web proxy
servers. ;-)
Of course, at the time I was working at the largest ISP in
Belgium, so it's not really surprising that we'd have really massive
web proxy logs.
> Do you mean that you would not use 'Lire' in a production environment?
> Or that you would not use e-mail to transmit the logs in a production
> environment?
I wouldn't use e-mail as a transmission method in a production environment.
I would definitely use lire as a log processing system in a
production environment, and I look forward to the day when I can get
rid of all those damn shell scripts for which I have accumulated
maintenance responsibility, and point people at lire instead.
> 1. Limited scope - They can only manage a limited number of log types.
> (e.g., Syslog or Windows event logs only.)
I don't believe that lire does anything with Windows event logs,
at least not yet. The framework to do so is certainly there,
although you'd have to handle the matter of getting the logs in and
back out again.
> 2. Lack of extensibility - It is difficult (if not impossible) to add
> monitoring for new or different log types.
Yup. I've run into this more than a few times. My issues in this
area are what resulted in me hacking on various scripts I had run
across, and as the last person to hack on them, I became the default
maintainer.
Not a place I want to continue to be.
> 3. Closed solutions - Except for standard Syslog, they generally
> require the use of matching client agent and server. The client for one
> will not work with the server of another solution.
There are plenty of web log processing systems which work with
multiple different input sources, and I think the same is true for
some of the firewall log processing systems. However, beyond that,
you are absolutely right. And even those tools only get you so far
within their narrow scope.
> 4. Rigid storage structure - The database structure where the log data
> is stored forces logs of all different types to conform to a single
> table definition.
Well, lire has this same problem, to a degree.
> 5. Single-enterprise view of the network - There is no provision for
> multiple customers and/or multiple locations, greatly limiting their use
> to MSPs like me.
Lire doesn't really have any good consolidation tools in this
area, at least not that I know of.
> 6. They require local network connectivity or the use of VPNs. (i.e.,
> Client agents cannot reside on remote subnets or across the Internet
> from the server.)
Copying log files from one place to another can be done with a
wide variety of methods. I don't think that lire is the only tool
that can make use of multiple methods.
> 7. Lack of a web interface
Web & firewall log processing systems generally have web
interfaces, and some of the scripts I have run across for other
subsystems also have a web interface. But there are certainly other
scripts I know of that simply produce a plain text file as output,
and leave it up to you as to what you do with that.
> 8. Poor alerting/notification mechanisms - Most create a notification
> for every event that matches a trigger. This is not satisfactory in a
> world where a small failure may result in the same error being generated
> hundreds-- if not thousands (or in your case, millions)--of times within
> a short period. (e.g., tape drive hardware or SCSI-bus failure.)
That's a real-time network/system monitoring tool, which is
totally different. Lire doesn't have anything in this area -- it is
reporting only.
Try nagios, bigbrother, bigsister, pong, spong, or any of those
sorts of programs. If you want network/system trending, look at
tools like mrtg, rrdtool (plus optional front-ends such as cricket
and orca), etc.... Moreover, none of these systems really have
anything to do with log processing -- by the time the data has hit
the logs, it's not real-time any more.
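For the "one notification per event" complaint in point 8, the usual
idea is to fold repeats of the same message within a time window into
a single entry with a count. The Python sketch below shows only that
idea. It is not how nagios or any of the tools above work internally,
and the function name, the tuple layout, and the 300-second window
are assumptions made for illustration.

```python
def collapse_repeats(events, window_seconds):
    """Fold repeated messages into (first_timestamp, message, count).

    `events` is an iterable of (timestamp_seconds, message) pairs sorted
    by time. A repeat is folded into the current group for that message
    if it arrives within `window_seconds` of the group's first event;
    otherwise it starts a new group.
    """
    current_group = {}  # message -> index of its current group in `result`
    result = []
    for ts, msg in events:
        idx = current_group.get(msg)
        if idx is not None and ts - result[idx][0] <= window_seconds:
            first, _, count = result[idx]
            result[idx] = (first, msg, count + 1)
        else:
            current_group[msg] = len(result)
            result.append((ts, msg, 1))
    return result


# Made-up sample events: (seconds since start, message)
sample = [(0, "scsi error"), (1, "scsi error"), (2, "tape ok"),
          (50, "scsi error"), (400, "scsi error")]
for first_seen, message, count in collapse_repeats(sample, 300):
    print(first_seen, message, count)
```

Each group would then produce one notification (or one report line)
instead of one per matching log entry.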
--
Brad Knowles, <brad.knowles at skynet.be>
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-Benjamin Franklin, Historical Review of Pennsylvania.
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+
!w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++)