can't generate pdf from apache combined log (huge bugreport)

Joost van Baal joostvb at logreport.org
Fri Mar 23 13:39:59 CET 2001


Hi,

I can't generate a PDF from an Apache combined log.  I believe it should
be possible with current CVS, shouldn't it?  This is what happens:

lire version: current cvs, Fri Mar 23 13:18:43 CET 2001

xml tools versions:

$ dpkg -l xalan jade jadetex docbook-stylesheets docbook
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Installed/Config-files/Unpacked/Failed-config/Half-installed
|/ Err?=(none)/Hold/Reinst-required/X=both-problems (Status,Err: uppercase=bad)
||/ Name                     Version                  Description
+++-========================-========================-================================================================
ii  xalan                    1.0-3                    XSLT processor.
ii  jade                     1.2.1-18                 James Clark's DSSSL Engine
ii  jadetex                  3.5-1                    LaTeX macros for SGML to DVI/PS/PDF conversion with Jade
ii  docbook-stylesheets      1.62-3                   Modular DocBook stylesheets, for print and HTML
ii  docbook                  4.1-2                    SGML DTD for software documentation

hibou@gelfand ~/logreport-unstable/etc/lire$ cat defaults.local
KEEP=1
DEBUG=1
LOGGING=stderr

hibou@gelfand /tmp$ cat apache.log
gelfand.mdcc.cx - - [22/Mar/2001:08:41:23 +0100] "GET /debian/dists/potato/contrib/source/Release HTTP/1.1" 404 248 "-" "Debian APT-HTTP/1.3"
as-97-30.dial-up.siol.net - - [22/Mar/2001:10:06:39 +0100] "GET /doc/textutils/TODO HTTP/1.1" 200 2679 "http://www.google.com/search?q=crc32+algorithm&btnG=Google+Search&hl=en&lr=&safe=off" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 95)"
dyn126-bwk.nbw.tue.nl - - [22/Mar/2001:12:22:50 +0100] "POST /cgi-bin/ezmlm/subscribe HTTP/1.1" 200 562 "http://lisa.ooo.nl/nieuwsbrief.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
195.254.11.252 - - [22/Mar/2001:13:08:01 +0100] "GET /cgi-bin/manwhatis?1 HTTP/1.1" 200 0 "http://www.google.com/search?q=convert+eps+into+bmp+c%2B%2B+open+source&hl=de&lr=&safe=off&start=30&sa=N" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
chello213047116200.14.vie.surfer.at - - [22/Mar/2001:13:16:27 +0100] "GET /doc/esound-common/html/esdcat-esdmon-esdrec.html HTTP/1.1" 200 2946 "http://www.google.com/search?q=esdmon&sourceid=opera&num=50" "Opera/5.02 (Windows NT 5.0; U)  [en]"
192.51.5.5 - - [22/Mar/2001:13:23:45 +0100] "GET /doc/glibc-doc/libc.html HTTP/1.0" 404 217 "http://www.bos2.alltheweb.com/cgi-bin/advsearch?terms=3&type=all&query=posix+1003.1b+standard&exec=FAST+Search&lang=any&enco=iso-8859-1&A1=%2B&B1=&C1=&A2=%2B&B2=periodic+timer&C2=&A3=-&B3=fixes&C3=&dincl=&dexcl=&hits=100&nooc=on" "Mozilla/4.75 [en] (X11; U; Linux 2.2.17-21mdk i686)"
gelfand.mdcc.cx - - [22/Mar/2001:13:26:33 +0100] "GET /icons/text.gif HTTP/1.0" 200 229 "http://mdcc.cx/~vanbaal/plaatjes/" "Mozilla/4.76 [en] (X11; U; Linux 2.4.0 i686; Nav)"
gelfand.mdcc.cx - - [22/Mar/2001:13:26:37 +0100] "GET /~vanbaal/plaatjes/djjoost@huishoudencentrum-2.jpg HTTP/1.0" 200 27168 "http://mdcc.cx/~vanbaal/plaatjes/" "Mozilla/4.76 [en] (X11; U; Linux 2.4.0 i686; Nav)"
gelfand.mdcc.cx - - [22/Mar/2001:13:26:40 +0100] "GET /~vanbaal/plaatjes/djjoost@huishoudencentrum-3.jpg HTTP/1.0" 200 10624 "http://mdcc.cx/~vanbaal/plaatjes/" "Mozilla/4.76 [en] (X11; U; Linux 2.4.0 i686; Nav)"
txe.epfl.ch - - [22/Mar/2001:13:41:48 +0100] "GET /cgi-bin/man2html?multilog+8 HTTP/1.1" 302 256 "http://mdcc.cx/cgi-bin/manwhatis?8" "Mozilla/4.0 (compatible; MSIE 5.0; Linux) Opera 5.0  [en]"
txe.epfl.ch - - [22/Mar/2001:13:41:50 +0100] "GET /cgi-bin/man2html/usr/share/man/man8/multilog.8.gz HTTP/1.1" 200 8404 "http://mdcc.cx/cgi-bin/manwhatis?8" "Mozilla/4.0 (compatible; MSIE 5.0; Linux) Opera 5.0  [en]"
194.178.84.125 - - [22/Mar/2001:14:47:00 +0100] "POST /cgi-bin/ezmlm/subscribe HTTP/1.0" 200 529 "http://lisa.ooo.nl/nieuwsbrief.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)"

hibou@gelfand /tmp$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/home/hibou/logreport-unstable/bin

hibou@gelfand /tmp$ lr_run lr_log2report -x /tmp/err www apache combined < apache.log > apache.xml
unknown all none lr_log2report info started with -x /tmp/err www apache combined
www apache none lr_log2report info gonna run lr_log2raw www apache combined
unknown all none lr_log2raw info started with www apache combined
....
www all none lr_dlf2xml notice keeping /home/hibou/tmp/lr_dlf2xml.filter_pics.15540.dlf on your request. remove manually.
www all none lr_dlf2xml info stopped
www apache none lr_log2raw info lr_dlf2xml finished
www apache none lr_log2raw notice keeping /home/hibou/tmp/lr_log2raw.apache.15499.dlf on your request. remove manually.
www apache none lr_log2raw info stopped
www apache none lr_log2report info generating raw XML output, not doing lr_xml2ascii
www apache none lr_log2report notice keeping /home/hibou/tmp/lr_log2raw.apache.15499.report.raw on your request. remove manually.
www apache none lr_log2report info stopped
hibou@gelfand /tmp$

hibou@gelfand /tmp$ cat apache.xml

<?xml version="1.0"?>
<!DOCTYPE report SYSTEM "/home/hibou/logreport-unstable/lib/xml/dtd/logreport.dtd" []>
<report date="Fri Mar 23 13:19:54 CET 2001">
  <!-- generated by lr_dlf2xml(1) -->
  <subreport superservice="www">
    <title>requests per clienthost, top 10</title>
    <table>
      <entry>
        <value>4</value>
        <name>gelfand.mdcc.cx</name>
      </entry>
      <entry>
        <value>2</value>
        <name>txe.epfl.ch</name>
      </entry>
      <entry>
        <value>1</value>
        <name>dyn126-bwk.nbw.tue.nl</name>
......
      <group count="1">
        <title>194.178.84.125</title>
        <entry>
          <value>1</value>
          <name>/cgi-bin/ezmlm/subscribe</name>
        </entry>
      </group>
      <group count="1">
        <title>192.51.5.5</title>
        <entry>
          <value>1</value>
          <name>/doc/glibc-doc/libc.html</name>
        </entry>
      </group>
    </table>
  </subreport>
</report>


hibou@gelfand /tmp$ xalan -IN apache.xml -XSL ~/logreport-unstable/lib/xml/stylesheet/xsl/logreport/docbook.xsl > apache.dbx


========= Parsing /home/hibou/logreport-unstable/lib/xml/stylesheet/xsl/logreport/docbook.xsl ==========
Parse of /home/hibou/logreport-unstable/lib/xml/stylesheet/xsl/logreport/docbook.xsl took 40 milliseconds
========= Parsing apache.xml ==========
Parse of apache.xml took 200 milliseconds
=============================
Transforming...

transform took 90 milliseconds

Total time took 290 milliseconds

hibou@gelfand /tmp$ cat apache.dbx

<?xml version="1.0" encoding="UTF-8"?>
<article>
    <title>LogReport report</title>
    <section>
        <title>requests per clienthost, top 10</title>
        <table frame="all">
            <title> </title>
            <tgroup cols="2" align="left" colsep="1" rowsep="1">
                <thead>
                    <row>
                        <entry spanname="hspan">Name</entry>
                        <entry>Value</entry>
                    </row>
                </thead>
                <tbody>
                    <row>
                        <entry>gelfand.mdcc.cx</entry>
                        <entry>4</entry>
                    </row>
                    <row>
                        <entry>txe.epfl.ch</entry>
                        <entry>2</entry>
                    </row>
....
                    <row>
                        <entry>
                            <emphasis>192.51.5.5</emphasis>
                        </entry>
                        <entry>1</entry>
                    </row>
                    <row>
                        <entry>/doc/glibc-doc/libc.html</entry>
                        <entry>1</entry>
                    </row>
                </tbody>
            </tgroup>
        </table>
    </section>
</article>



I added the <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1//EN" []>
line, as stated in the README file:


hibou@gelfand /tmp$ head apache.dbx
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1//EN" []>
<article>
    <title>LogReport report</title>
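
For reference, the DOCTYPE line can also be added without hand-editing.
This is only a minimal sketch, assuming the file starts with an XML
declaration as shown above.  The function name add_doctype is made up
for illustration and is not part of Lire.

```shell
# Hypothetical helper: print the file with the DOCTYPE line inserted
# right after the first line that starts with "<?xml".
doctype='<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1//EN" []>'

add_doctype() {
    awk -v dt="$doctype" '
        { print }                                   # copy every input line
        /^<\?xml/ && !seen { print dt; seen = 1 }   # add DOCTYPE once
    ' "$1"
}
```

Usage would be something like `add_doctype apache.dbx > apache.dbx.new`,
followed by a look at the first few lines with head.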



hibou@gelfand /tmp$ jade -E 0 -t tex -d /usr/lib/sgml/stylesheet/dsssl/docbook/nwalsh/print/docbook.dsl apache.dbx


jade:/usr/lib/sgml/catalog:2:8:E: CATALOG entries cause loop
jade:apache.dbx:2:62:W: cannot generate system identifier for public text "-//OASIS//DTD DocBook XML V4.1//EN"
jade:apache.dbx:2:63:E: reference to entity "ARTICLE" for which no system identifier could be generated
jade:apache.dbx:2:0: entity was defined here
jade:apache.dbx:2:63:E: DTD did not contain element declaration for document type name
jade:apache.dbx:3:8:E: element "ARTICLE" undefined
jade:apache.dbx:4:10:E: element "TITLE" undefined
jade:apache.dbx:5:12:E: element "SECTION" undefined
.....

jade:apache.dbx:503:24:E: element "ROW" undefined
jade:apache.dbx:504:30:E: element "ENTRY" undefined
jade:apache.dbx:505:30:E: element "ENTRY" undefined
jade:/usr/lib/sgml/catalog:2:8:E: CATALOG entries cause loop
jade:/usr/lib/sgml/catalog:2:8:E: CATALOG entries cause loop
jade:/usr/lib/sgml/catalog:2:8:E: CATALOG entries cause loop
....

jade:/usr/lib/sgml/catalog:2:8:E: CATALOG entries cause loop
jade:/usr/lib/sgml/catalog:2:8:E: CATALOG entries cause loop
jade:/usr/lib/sgml/stylesheet/dsssl/docbook/nwalsh/print/../common/dbtable.dsl:224:13:E: 2nd argument for primitive "ancestor" of wrong type: "#<unknown object 136266304>" not a singleton node list
jade:/usr/lib/sgml/stylesheet/dsssl/docbook/nwalsh/print/../common/dbtable.dsl:224:13:E: 2nd argument for primitive "ancestor" of wrong type: "#<unknown object 136266304>" not a singleton node list
....
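
Since the jade output is long and repetitive, it can be condensed to one
line per distinct message with a count.  This is a sketch using only
standard tools.  summarize_jade is a hypothetical helper name, and the
prefix pattern (jade:FILE:LINE:COLUMN:) is inferred from the messages
shown above, not from jade's documentation.

```shell
# Hypothetical helper: read jade messages on stdin, strip the
# "jade:FILE:LINE:COLUMN:" prefix, and count each distinct message,
# most frequent first.
summarize_jade() {
    sed -E 's/^jade:[^:]+:[0-9]+:[0-9]+: ?//' |
        sort |
        uniq -c |
        sort -rn
}
```

Assuming jade writes its messages to stderr, it could be used as
`jade -E 0 -t tex -d .../docbook.dsl apache.dbx 2>&1 | summarize_jade`,
with the stylesheet path as in the command above.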

hibou@gelfand /tmp$ cat apache.tex

...
{3}\def\ProcessingMode%
{title-sosofo-mode}}requests per clienthost, top 10\endNode{}\endPar{}\endSeq{}        \Node%
{\def\Element%
{3}}\endNode{}
        \Node%
{\def\Element%
{4}}\DisplayGroup%
{\def\StartIndent%
...

Data such as `gelfand.mdcc.cx' does not appear in the TeX file.  I can run

hibou@gelfand /tmp$ pdfjadetex apache.tex

but the generated apache.pdf shows just the titles of the subreports,
not the data itself.

Is my XML toolset broken?  Egon, have you got any clue?  The warning
`jade:apache.dbx:2:62:W: cannot generate system identifier for public
text "-//OASIS//DTD DocBook XML V4.1//EN"' sounds scary.  Do I
need to install any additional tools?

Bye,

-- 
Joost


