
Because the command string uses special characters to represent the bulk insert file (BIF), you
must run Verity Spider with a command file, which you specify with the
-cmdfile option.
For example, if you want to use a script called fix_bif to add customized information to BIF files,
use the following command:
vspider -cmdfile filename
where filename is the text-only command file that contains the following line, along with any
other options you need:
-processbif 'fix_bif !*'
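The steps above can be sketched in Python. This is a minimal illustration, not part of Verity Spider: the file name spider_cmd.txt and both helper functions are hypothetical, and the one-option-per-line layout of the command file is an assumption. The script only writes the command file and builds the argument list for vspider; it does not run the program. Keeping the !* characters in a file means that no command shell has a chance to interpret them.

```python
from pathlib import Path


def write_command_file(path, options):
    """Write the given options to a text-only command file.

    Assumption: one option per line; the source only says the file
    is text-only and holds -processbif plus any other options.
    """
    text = "\n".join(options) + "\n"
    Path(path).write_text(text, encoding="ascii")
    return text


def build_vspider_argv(cmdfile):
    """Return the argument list for: vspider -cmdfile <cmdfile>."""
    return ["vspider", "-cmdfile", str(cmdfile)]


if __name__ == "__main__":
    # fix_bif is the example script name used in the text above.
    options = ["-processbif 'fix_bif !*'"]
    write_command_file("spider_cmd.txt", options)  # hypothetical file name
    print(build_vspider_argv("spider_cmd.txt"))
```

The argument list could then be passed to subprocess.run on a system where vspider is installed.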
-regexp
Specifies the use of regular expressions rather than the default wildcard expressions for the
following options:
-exclude, -indexclude, -include, -indinclude, -skip, -indskip,
-preferred, and -nofollow.
Wildcard expressions allow the use of the asterisk (*) for text strings and the question mark (?) for
single characters. The table of wildcard expressions at the end of this section shows examples.
Regular expressions allow for more powerful and flexible matching of alphanumeric strings; for
example, to match "ab11" or "ab34" but not "abcd" or "ab11cd," you could use the following
regular expression:
^ab[0-9][0-9]$
The full extent to which regular expressions can be employed is beyond the scope of this
description. For more information on regular expressions, refer to a book devoted to the subject.
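The example expression can be checked with a short Python sketch. It uses Python's re module, whose dialect may differ in detail from the one Verity Spider accepts, so treat it as an illustration of the pattern rather than of the tool. The function name matches is hypothetical.

```python
import re

# The expression from the text: "ab" followed by exactly two digits,
# anchored at both the start (^) and the end ($) of the string.
PATTERN = re.compile(r"^ab[0-9][0-9]$")


def matches(text):
    """Return True if text matches the example expression."""
    return PATTERN.search(text) is not None


if __name__ == "__main__":
    for candidate in ["ab11", "ab34", "abcd", "ab11cd"]:
        print(candidate, matches(candidate))
    # ab11 True
    # ab34 True
    # abcd False
    # ab11cd False
```

The anchors do the work here: without the trailing $, "ab11cd" would also match, because the pattern would be satisfied by its first four characters.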
-submitsize
Syntax:
-submitsize num_documents
Specifies the number of documents submitted for indexing at one time. The default value is 128.
The upper limit is 64,000.
Note: Although larger values mean more efficient processing by the indexer, smaller values allow
more parallelism on multi-CPU systems. In the event of a halt during indexing, a smaller value means
fewer documents will be lost.
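The trade-off in the note can be pictured with a small Python sketch. It models only the idea of submitting documents in groups; it says nothing about how Verity Spider implements indexing. The function name batches and the lower bound of 1 are assumptions; the default of 128 and the upper limit of 64,000 come from the text above.

```python
def batches(documents, submit_size=128):
    """Yield consecutive groups of at most submit_size documents."""
    # Upper limit is from the text; the lower bound of 1 is an assumption.
    if not 1 <= submit_size <= 64000:
        raise ValueError("submit_size must be between 1 and 64000")
    for start in range(0, len(documents), submit_size):
        yield documents[start:start + submit_size]


if __name__ == "__main__":
    docs = [f"doc{i}.htm" for i in range(300)]
    print([len(b) for b in batches(docs)])      # [128, 128, 44]
    print([len(b) for b in batches(docs, 50)])  # [50, 50, 50, 50, 50, 50]
```

In this model, a halt while one group is in flight affects at most submit_size documents, which is why a smaller value limits the loss, at the cost of more submissions.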
Wildcard expression    Text string
a*t                    although, attitude, audit
a?t                    ant, art
file?.htm              files.htm, file1.htm, filer.htm
name?.*                names.txt, named.blank, names.ext