BibTool Manual
Reference Manual
Semantic Checks

Regular Expression Checks

The regular expressions (see section Regular Expression Matching) which are used to rewrite fields (see section Field Rewriting) can also be used to perform semantic checks on fields. For this purpose the resource check.rule is provided. The syntax of check.rule is the same as for rewrite.rule.

check.rule { field # pattern # message}

Again, field and message are optional. The separator # can also be written as an equals sign (=) or omitted.

Each field is processed as follows. The check.rule entries are tried in turn until a rule is found whose field (if given) is identical to the field name and whose pattern matches a substring of the field value. If such a rule is found, its message is written to the error stream. If no message is given, nothing is printed and processing of the current field ends.
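The processing order described above can be modelled in a few lines of Python. This is only a sketch of the first-match behavior, not BibTool's implementation: the function names, the tuple layout of a rule, and the sample rules are all assumptions made for illustration, and Python's re module stands in for BibTool's regular expression matcher.

```python
import re

def first_matching_rule(rules, field, value):
    """Return the first rule that applies to (field, value), or None.

    Each rule is a hypothetical (field_name, pattern, message) tuple.
    field_name and message may be None, mirroring the optional parts
    of check.rule.
    """
    for rule in rules:
        rule_field, pattern, _message = rule
        if rule_field is not None and rule_field != field:
            continue  # rule is restricted to a different field
        if re.search(pattern, value):  # substring match, not a full match
            return rule
    return None

def check_field(rules, field, value):
    """Return the message to report for this field, or None if silent."""
    rule = first_matching_rule(rules, field, value)
    if rule is None:
        return None  # no rule matched at all
    return rule[2]   # may be None: a match without a message stays silent

# Illustrative rules only; they do not come from the manual.
rules = [
    ("title", r"[Tt]he the", "doubled article"),
    (None, r"\?\?", "unresolved placeholder"),
]
```

A rule without a field name, like the second one here, is tried for every field, while the first one is tried for title fields only.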

message is treated like the replacement text in rewrite.rule. Thus the special character combinations described in section Field Rewriting are expanded.

By default the matching is case sensitive. This behavior is controlled by the boolean resource check.case.sensitive, which is ON by default. When it is turned off, any upper-case letter matches its lower-case counterpart and vice versa. Changing this resource influences only rules specified afterwards, as described for rewrite rules in section Field Rewriting.

check.case.sensitive = off
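The difference between the two settings can be illustrated with Python's re module, where the re.IGNORECASE flag plays the role of check.case.sensitive = off. This is an analogy under that assumption, not BibTool code; compile_rule and the sample pattern are hypothetical.

```python
import re

def compile_rule(pattern, case_sensitive=True):
    """Compile a pattern, ignoring letter case when case_sensitive is False."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(pattern, flags)

sensitive = compile_rule("draft", case_sensitive=True)
insensitive = compile_rule("draft", case_sensitive=False)

value = "{DRAFT version}"
print(bool(sensitive.search(value)))    # False: "draft" differs from "DRAFT"
print(bool(insensitive.search(value)))  # True: letter case is ignored
```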

Consider the following example. We want to check that the year field contains only years from 1800 to 1999. Additionally, we want to allow two-digit abbreviations.

check.rule { year "^[\"{]1[89][0-9][0-9][\"}]$" }

check.rule { year "^[\"{][0-9][0-9][\"}]$" }

check.rule { year "" "\@\$: Year has to be a suitable number"}

The first rule matches any number starting with 1, followed by 8 or 9, and finally two digits. The whole number may be enclosed in double quotes or curly braces [1]. The caret (^) at the beginning and the dollar sign ($) at the end force the pattern to match the whole field value only.

The next rule covers years consisting of two digits. The first two rules produce no error message but end the search for further matches. Thus, if something suitable is found, one of the first two rules finds it.

Otherwise we have to produce an error message. This is done with the third rule. The empty pattern matches any value of the year field, but this rule is only applied if the preceding rules do not match. In this case we print an error message, in which \@ is replaced by the current type and \$ by the current key.
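The interplay of the three rules can be traced with a small Python model. It is a sketch under several assumptions: Python's re module stands in for BibTool's matcher, expand performs a plain textual substitution of \@ and \$, and the type and key strings passed in are placeholders, since the exact text BibTool inserts for them is not shown in this section.

```python
import re

# (pattern, message) pairs for the year field, in the order of the rules above.
YEAR_RULES = [
    (r'^["{]1[89][0-9][0-9]["}]$', None),   # four-digit years 1800 to 1999
    (r'^["{][0-9][0-9]["}]$', None),        # two-digit abbreviations
    (r'', r'\@\$: Year has to be a suitable number'),  # catch-all
]

def expand(message, entry_type, key):
    """Substitute the placeholders \\@ (type) and \\$ (key) textually."""
    return message.replace(r'\@', entry_type).replace(r'\$', key)

def check_year(value, entry_type, key):
    """Return the expanded message of the first matching rule, or None."""
    for pattern, message in YEAR_RULES:
        if re.search(pattern, value):
            if message is None:
                return None  # silent match ends the search
            return expand(message, entry_type, key)
    return None

for sample in ['{1984}', '"84"', '{2005}', '{198}']:
    print(sample, '->', check_year(sample, 'TYPE', 'KEY'))
```

In this model the first two samples are accepted silently, while the last two fall through to the catch-all rule and produce the message.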


[1] In fact, the regular expression also allows strings starting with a quote and ending in a curly brace. But this syntactical nonsense is already ruled out by the parser.
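The remark about mixed delimiters can be verified directly on the pattern. The following is a minimal check using Python's re module as a stand-in for BibTool's matcher; the sample values are made up for illustration.

```python
import re

# The pattern of the first year rule, written as a Python raw string.
pattern = re.compile(r'^["{]1[89][0-9][0-9]["}]$')

# The opening and closing delimiter classes are independent of each other,
# so the mismatched third value is accepted by the pattern as well.
for value in ['{1999}', '"1999"', '"1999}']:
    print(value, bool(pattern.search(value)))
```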



© 1999 Gerd Neugebauer