Reusable Filters Proposition

Francis J. Lacoste francis.lacoste at Contre.COM
Tue Jan 15 18:21:18 CET 2002


On Tue, Jan 15, 2002 at 04:33:22PM +0100, Joost van Baal wrote:
[...]
> 
> OK, lets see if I fully understood you.  dns.cfg now is something like:
> 
>  top-requesting-hosts               hosts_to_show=10
>  top-requesting-hosts-by-method     hosts_to_show=10 method='recurs'
>  top-requesting-hosts-by-method     hosts_to_show=10 method='nonrec'
>  top-requested-names                names_to_show=10
>  top-requested-names-by-method      names_to_show=10 method='recurs'
> 
> Clearly, your proposal leads to a much more maintainable system, since
> we can get rid of report definitions like
> dns/reports/top-requesting-hosts-by-method.xml , and keep only
> dns/reports/top-requesting-hosts.xml.

That's right.

> 
> You store this information:
> 
>  <lire:filter-spec>
>   <lire:eq arg1="$resolver" arg2="$method"/>
>  </lire:filter-spec>
> 
> in dns.cfg now ( |filter_eq method="something" ).  

Not really. The idea is to put 

<lire:filter-spec>
  <lire:eq ...>
</lire:filter-spec> 

in its own XML file. (I would wrap it in a reusable-filter-spec
element, adding title, description, param-spec and display-spec
elements to its content.)

And we add a filters/<superservice> directory to the <datadir> namespace. 
(We already have reports/<superservice>, schemas/<superservice>.)

In the configuration file, '|' marks the ID that follows as a filter
to set up for the reports listed after it.

filter_eq was what I called a "magic" filter. (More on this later.)

> And, indeed, we need
> a place to store the description of the filter.  I guess the description
> can be constructed from the description of the variables in lire:field
> in lire:dlf-schema.  Hmm... Does lire:field allow a description?

lire:field allows a description. (We should use it to document the
content and purpose of the fields, instead of relying on comments.)

> 
> Description would be something like:
> 
>  Number of Lookups by Hosts
>  after filtering: Resolving method equals recursive

That's a nice idea (the 'after filtering:' expansion). I think I'll use
that. The content of the display-spec/title element in the filter's
definition (which allows expansions) would be the thing to use for this.

> 
> <snip>
> > Filters would be reset whenever we encounter a new filter
> > specification following a report-specification (like in the DNS
> > example above). To reset the filter explicitely, we add the
> > |filter_none "magic" filter.
> > 
> > Since there are many simple filters that are possible and would be
> > potentially useful I suggest to add other "magic" filters which
> > wouldn't need to be explicitely defined in external files. There would
> > be automatically a filter_<op> and filter_not_<op> defined for each
> > schemas.
> 
> I don't really get this.  Which filters do you call 'magic' and which
> are  not 'magic'?  Is filter_eq a magic filter?

Yes, filter_eq was a 'magic' filter. 

A 'magic' filter is a filter which isn't defined through an XML
definition. The idea was for the configuration parser to define the
filter on the fly based on the filter's pseudo-id (filter_<op>), where
each parameter's name would be a field and each parameter's value the
thing to match against.

For example,

|filter_eq method='recurs' type='AAAA'

would be equivalent to:

<lire:filter-spec>
 <lire:and>
   <lire:eq arg1="$method" arg2="recurs"/>
   <lire:eq arg1="$type"   arg2="AAAA"/>
 </lire:and>
</lire:filter-spec>

Apply mutatis mutandis to all other operations (filter_re, filter_ge,
etc.)
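
The expansion described above could be sketched as follows. This is
only an illustration in Python, not the actual configuration parser:
the function name expand_magic_filter is hypothetical, the "lire:"
prefix is written as a literal tag prefix without a namespace
declaration, and only the two-argument operators are handled.

```python
import shlex
import xml.etree.ElementTree as ET


def expand_magic_filter(line):
    """Turn "|filter_<op> field='value' ..." into a filter-spec element.

    Hypothetical helper: each parameter name is taken as a field and
    each parameter value as the thing to match against.
    """
    tokens = shlex.split(line.strip().lstrip("|"))
    pseudo_id, params = tokens[0], tokens[1:]
    if not pseudo_id.startswith("filter_"):
        raise ValueError("not a magic filter: %r" % pseudo_id)
    op = pseudo_id[len("filter_"):]

    spec = ET.Element("lire:filter-spec")
    # Several parameters are combined with an implicit "and".
    parent = ET.SubElement(spec, "lire:and") if len(params) > 1 else spec
    for param in params:
        field, _, value = param.partition("=")
        ET.SubElement(parent, "lire:" + op, arg1="$" + field, arg2=value)
    return spec


spec = expand_magic_filter("|filter_eq method='recurs' type='AAAA'")
print(ET.tostring(spec, encoding="unicode"))
```

Note that the sketch can only ever combine conditions with "and", and
it has nowhere to put a title or description for the filter.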

But I think we should drop the 'magic' filter concept altogether.
Although, like I said, it would be convenient (no need to write and
install an XML file to use a filter), there are too many problems,
which I think make it the wrong idea.

Problems:

What about or operations?
What about operations which take more than two parameters (e.g. match)?
How do you automatically generate sensible descriptions for the filters?

Like I said, practically, the number of useful filters is limited. I
think we should define those with appropriate descriptions through the
use of the XML definitions.

The only 'magic' filter I would keep is |filter_none, to reset the
filters (i.e. don't filter the next reports).
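
To make the intended configuration semantics concrete, here is a small
Python sketch of how a parser might track the active filters. It is a
sketch under assumptions, not the real parser: the function
parse_config, the "#" comment syntax and the filter ID
"recursive-only" are all hypothetical. It follows the rules discussed
in this thread: a '|' line sets filters for the reports after it, a
filter line that follows a report starts a new filter set, and
|filter_none clears the filters.

```python
import shlex


def parse_config(text):
    """Return a list of (report_id, report_params, filters) tuples."""
    reports = []
    filters = []            # filters currently in effect
    last_was_report = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):   # "#" comments are assumed
            continue
        tokens = shlex.split(line.lstrip("|").strip())
        name = tokens[0]
        params = dict(t.partition("=")[::2] for t in tokens[1:])
        if line.startswith("|"):
            # A filter line after a report, or filter_none, resets.
            if last_was_report or name == "filter_none":
                filters = []
            if name != "filter_none":
                filters.append((name, params))
            last_was_report = False
        else:
            reports.append((name, params, list(filters)))
            last_was_report = True
    return reports


config = """
top-requesting-hosts   hosts_to_show=10
|recursive-only
top-requesting-hosts   hosts_to_show=10
top-requested-names    names_to_show=10
|filter_none
top-requested-names    names_to_show=10
"""
reports = parse_config(config)
for report_id, params, active in reports:
    print(report_id, params, active)
```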

> 
> <snip>
> 
> Currently, the order of the subreports in the .cfg file decides for the
> order in the final report.  Is the actual computation done in the same
> order?  This might lead to problems after a while...  (I fear the work
> needed to make this more flexible, though...)

Computation is done in an entirely different order (for the parallel
algorithm). Reports which share the same filters (not as specified
in the configuration files, but once parameter expansion is
completed) are computed together. In the sequential algorithm, each
report is computed in a single pass (filtering, computation, write).
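
A rough Python sketch of the grouping step in the parallel case
follows. Everything in it is an assumption made for illustration: the
function name, the textual filter notation "eq(...)", the report IDs,
and the use of string.Template to stand in for parameter expansion.
The point is only that grouping happens on the expanded filter, so two
reports configured differently can still end up in the same group.

```python
from collections import defaultdict
from string import Template


def group_by_expanded_filter(report_specs):
    """Group report IDs by their filter after parameter expansion."""
    groups = defaultdict(list)
    for report_id, filter_template, params in report_specs:
        # safe_substitute leaves unknown $names (here: fields) untouched.
        expanded = Template(filter_template).safe_substitute(params)
        groups[expanded].append(report_id)
    return dict(groups)


specs = [
    ("hosts-recursive", "eq($resolver, $method)", {"method": "recurs"}),
    ("names-recursive", "eq($resolver, $wanted)", {"wanted": "recurs"}),
    ("hosts-nonrec",    "eq($resolver, $method)", {"method": "nonrec"}),
]
groups = group_by_expanded_filter(specs)
for expanded, report_ids in groups.items():
    print(expanded, report_ids)
```

Each group's records would then need to be filtered only once for all
the reports in that group.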

It is to allow this flexibility in output order vs. computation order
that the API for the operations was split into

init_report()
update_report()
end_report()
write_report()

To write the reports in the right order, you only have to call
write_report() on the report specifications in that order. This way
you are free to call the other operations in any order (well, sort of:
init_ has to be called before update_, which has to be called before
end_).
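
The split can be illustrated with a toy Python class. Only the four
method names come from the list above; the class TopCountReport, its
arguments and the sample records are made up for the example, and the
real signatures may well differ.

```python
import io


class TopCountReport:
    """Toy report: counts records by the value of one field."""

    def __init__(self, title, field):
        self.title = title
        self.field = field

    def init_report(self):
        self.counts = {}

    def update_report(self, record):
        key = record[self.field]
        self.counts[key] = self.counts.get(key, 0) + 1

    def end_report(self):
        # Highest count first, ties broken by key.
        self.rows = sorted(self.counts.items(),
                           key=lambda kv: (-kv[1], kv[0]))

    def write_report(self, out):
        out.write(self.title + "\n")
        for key, count in self.rows:
            out.write("  %s %d\n" % (key, count))


records = [
    {"host": "a.example", "name": "www.example"},
    {"host": "b.example", "name": "www.example"},
    {"host": "a.example", "name": "ftp.example"},
]
by_host = TopCountReport("Requests by host", "host")
by_name = TopCountReport("Requests by name", "name")

# Computation order: by_name first, then by_host.
for report in (by_name, by_host):
    report.init_report()
    for record in records:
        report.update_report(record)
    report.end_report()

# Output order is decided only by the order of the write_report() calls.
out = io.StringIO()
for report in (by_host, by_name):
    report.write_report(out)
print(out.getvalue())
```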

> 
> > 2) Including informations about applied filters in the generated
> >    report.
> > 
> > We have to find a way to include informations about the applied
> > filters in the generated reports. For example, in the above DNS
> > example, we have to include informations about the record subsets on
> > which the reports were computed since they will have all the same
> > title. Maybe we should generate a section header whenever a filter was
> > applied? Something like : "Recursive Requests Reports". Maybe we could
> > use the display-spec element in the filter specification for that
> > purpose.
> <snip>
> 
> See above.

Like I said, I think the best thing to use is the display-spec/title of
the filter XML definitions, and to drop the 'magic' filter concept.

> 
> On Sat, Jan 05, 2002 at 03:51:28PM -0500, Francis J. Lacoste wrote:
> > On Tue, Jan 01, 2002 at 12:28:54PM +0100, Egon Willighagen wrote:
> > > If we make the syntax of this configuration file more complex, isn't it
> > > time to move over to XML?
> >
> > I thought about this. I agree that the configuration file should
> > eventually be a XML file for extensibility and clarity.
> >
> > So until we have a good configuration infrastructure in place, I vote
> > that we keep the single/simple configuration file scheme.
> 
> I even doubt wether the configuration file should be XML ever.  A
> configuration file is there to offer easy hooks.  If people want
> unlimited flexibility, they should hack in Lire itself.  The current
> configuration file syntax allows for enough flexibility, I guess.
> Writing a userfriendly interface to the *current* configuration file is
> higher on my personal wishlist then improving flexibility.
> 
> This is related to the simplicity vs exhaustivity thing: once we offer a
> very exhaustive system, it will be next to impossible to write a
> userfriendly configuration interface for it.
> 

This is a really interesting idea! I think I somewhat agree with it.
(Although I don't think that the last sentence is correct. I would say:
"once we offer a very exhaustive system, it will be harder to
write a userfriendly configuration interface for it".)

Thanks for the comments! 

-- 
Francis J. Lacoste
francis at Contre.COM