corresponds to the name defined in step 1
name---{CONST<:SUBFIELD:>[CONST]}}
- Enter only constants that appear systematically.
- A constant of non-zero length must be defined between two discrete
subfields.
- "---" is a mandatory separator between the name and the source
field definition.
Example of a definition of author (repetitive) and title (non-repetitive)
fields:
=== data source configuration template ===
TI---<:TI:>
AU---<:FIRSTNAME:>-<:SURNAME:>
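As a rough illustration of how such a source definition can be read, the sketch below turns the part after "---" into a regular expression with one named group per subfield. It is an assumption about the general idea, not BibConvert's actual parser; the helper names template_to_regex and parse_field and the sample value "John-Ellis" are hypothetical.

```python
import re

def template_to_regex(definition):
    """Build a regex from 'CONST<:SUBFIELD:>CONST...' (hypothetical helper)."""
    # re.split with a capturing group alternates: constant, name, constant, ...
    parts = re.split(r"<:(\w+):>", definition)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)          # constant separator text
        else:
            pattern += "(?P<%s>.*?)" % part     # subfield value
    return re.compile("^" + pattern + "$")

def parse_field(template_line, value):
    """Split 'name---definition' and extract the subfields from value."""
    name, definition = template_line.split("---", 1)
    match = template_to_regex(definition).match(value)
    return name, (match.groupdict() if match else None)

author = parse_field("AU---<:FIRSTNAME:>-<:SURNAME:>", "John-Ellis")
title = parse_field("TI---<:TI:>", "History of Physics")
```

The sketch also shows why a non-empty constant is needed between two subfields: without it the expression could not tell where one subfield ends and the next begins.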
4.3. Step 3 Definition of target record
This definition describes the layout of the target record that is created by the conversion,
together with the correspondence to the source fields defined in step 2.
- Create/edit "data target configuration template" section of the configuration file.
- Each line of this section stands for an output line created by the conversion.
- <name> corresponds to the name defined in the steps 1 and 2
CODE---CONST<:name::SUBFIELD::FUNCT():>CONST<:GENERATED_VALUE:>
- CODE stands for a tag for readability (optional)
- "::" is a mandatory separator between the name and the subfield
definition
- optionally, you can apply the appropriate formatting function(s)
and generated values
- "::" is a mandatory separator between the subfield definition and the function(s)
- "---" is a mandatory separator between the tag and the output code definition
- mark repetitive source fields with an asterisk (*)
Example of a definition of author (repetitive) and title (non-repetitive) codes:
AU::CONF(AU,,0)---<datafield id="700" ind1="" ind2=""><subfield code="a"><:AU*::AU:></subfield></datafield>
TI::CONF(TI,,0)---<datafield id="245" ind1="" ind2=""><subfield code="a"><:TI::TI::SUP(SPACE, ):></subfield></datafield>
4.4 Formatting in BibConvert
4.4.1 Definition of formatting functions
Every field can be processed with a variety of functions that
partially or entirely change the original value.
Three types of functions are available. They operate on single
characters, on words, or on the entire value of the processed field.
Every function takes a certain number of parameters, entered
in brackets. If too few parameters are given,
the function uses default values. The defaults are chosen so that the original value is kept.
The configuration of templates is case sensitive.
The following functions are available:
ADD(prefix,suffix) - add prefix/suffix
KB(kb_file,[0-9]) - look up in kb_file and replace value
ABR(x,suffix)/ABRW(x,suffix) - abbreviation with suffix addition
ABRX() - abbreviate exclusively words longer than the given limit
CUT(prefix,postfix) - remove substring from side
REP(x,y) - replacement of characters
SUP(type) - suppression of characters of specified type
LIM(n,L/R)/LIMW(str,L/R) - restriction to n letters
WORDS(n,side) - restriction to n words from L/R
MINL(n)/MAXL(n) - replacement of words shorter/longer than n
MINLW(n) - replacement of short values
EXP(str,1|0)/EXPW(type) - replacement of words from value if containing spec. type/string
IF(value,valueT,valueF) - replace T/F value
UP/DOWN/CAP/SHAPE/NUM - upper case, lower case, capitals, shape, number
SPLIT(n,h,str,from)/SPLITW(sep,h,str,from) - split into more lines
CONF(field,value,1/0)/CONFL(value,1/0) - confirm validity of a field
RANGE(from,to) - confirm only entries in the specified range
ADD(prefix,postfix)
default: ADD(,) no addition
Adds a prefix/postfix to the value. This function can be used to add the proper
field name as a prefix of the value itself:
ADD(WAU=,) prefix for the first author (which may
have been taken from the field AU2)
KB(kb_file) - kb_file search
default: KB(kb_file,1/0/R)
The input value is compared to the entries of a kb_file and may be replaced
by another value. If the input value is not recognized, it is by default kept
without any modification. This default can be overridden by a _DEFAULT_---default value entry in the kb_file.
The file specified in the parameter is a text file representing a table
of values that correspond to each other:
{input_value---output_value}
KB(file,1) searches the exact value passed.
KB(file,0) searches the KB code inside the value passed.
KB(file,2) as 0 but not case sensitive
KB(file,R) replacements are applied on substrings/characters only.
BibConvert looks up the value in the kb_file in one of the following modes:
===========================================================
1 - case sensitive / match (default)
2 - not case sensitive / search
3 - case sensitive / search
4 - not case sensitive / match
5 - case sensitive / search (in KB)
6 - not case sensitive / search (in KB)
7 - case sensitive / search (reciprocal)
8 - not case sensitive / search (reciprocal)
9 - replace by _DEFAULT_ only
R - not case sensitive / search (reciprocal) replace
Edge spaces are not considered.
The output value is not further formatted.
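The exact-match lookup with a _DEFAULT_ fallback described above can be sketched as follows. This is only an illustration of mode 1 under the stated file layout (one input_value---output_value pair per line); the function names load_kb and kb_lookup and the sample entries are hypothetical, and the other modes are not covered.

```python
def load_kb(lines):
    """Read 'input_value---output_value' lines into a dictionary."""
    kb = {}
    for line in lines:
        line = line.strip()
        if "---" in line:
            key, value = line.split("---", 1)
            kb[key.strip()] = value.strip()
    return kb

def kb_lookup(value, kb):
    """Mode 1 sketch: case-sensitive exact match, edge spaces ignored."""
    key = value.strip()
    if key in kb:
        return kb[key]
    if "_DEFAULT_" in kb:
        return kb["_DEFAULT_"]   # overrides the keep-as-is default
    return value                 # unrecognized values are kept unchanged

# Hypothetical knowledge base content, for illustration only.
kb = load_kb(["ENG---English", "FRE---French"])
kb_with_default = load_kb(["ENG---English", "_DEFAULT_---Unknown"])
```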
ABR(x,trm),ABRW(x,trm) - abbreviate term to x places with(out) postfix
default: ABR(1,.)
default: ABRW(1,.)
The words in the input value are shortened according to the parameters
specified. By default, only the initial character is kept and the output
value is terminated by a dot.
ABRW takes entire value as one word.
example | input | output
ABR() | firstname_surname | f._s.
ABR(1,) | firstname_surname | f_s
ABR(10,COMMA) | firstname_surname | firstname,_surname,
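A minimal sketch of the ABR behaviour shown in the table, assuming that the underscore in the examples stands for a space and that the COMMA keyword stands for a literal comma. The function name abr is hypothetical and this is not BibConvert's own code.

```python
def abr(value, x=1, suffix="."):
    """Keep the first x characters of every word and append the suffix."""
    return " ".join(word[:x] + suffix for word in value.split())

default_form = abr("firstname surname")           # ABR()
no_suffix = abr("firstname surname", 1, "")       # ABR(1,)
comma_form = abr("firstname surname", 10, ",")    # ABR(10,COMMA)
```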
ABRX() - abbreviate exclusively words longer than given limit
default: ABRX(1,.)
Only words in the input value that reach the specified length limit are abbreviated.
No suffix is appended to words shorter than the specified limit.
CUT(prefix,postfix) - remove substring from side
default: CUT(,)
Removes the given prefix/postfix string from the value (the reverse of the "ADD" function).
REP(x,y) - replace x with y
default: REP(,) no replacement
The input value is searched for the string specified in the first parameter.
All such strings are replaced with the string specified in the second parameter.
SUP(type,string) - suppress chars of certain type
default: SUP(,) type not recognized
All groups of characters belonging to the type specified in the first
parameter are suppressed or replaced with a string specified in the second
parameter.
Recognized types:
SPACE .. invisible chars incl. NEWLINE
ALPHA .. alphabetic
NALPHA .. not alphabetic
NUM .. numeric
NNUM .. not numeric
ALNUM .. alphanumeric
NALNUM .. non alphanumeric
LOWER .. lower case
UPPER .. upper case
PUNCT .. punctuation
NPUNCT .. not punctuation
example | input | output
SUP(SPACE,-) | sep_1999 | sep-1999
SUP(NNUM) | sep_1999 | 1999
SUP(NUM) | sep_1999 | sep_
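The suppression of character groups can be sketched with regular expressions. The mapping below covers only some of the recognized types and assumes plain ASCII character classes; the names SUP_TYPES and sup are hypothetical, and the underscore in the table is assumed to stand for a space.

```python
import re

# Assumed ASCII character classes for a subset of the recognized types.
SUP_TYPES = {
    "SPACE": r"\s",
    "ALPHA": r"[A-Za-z]",
    "NALPHA": r"[^A-Za-z]",
    "NUM": r"[0-9]",
    "NNUM": r"[^0-9]",
    "ALNUM": r"[A-Za-z0-9]",
    "NALNUM": r"[^A-Za-z0-9]",
    "LOWER": r"[a-z]",
    "UPPER": r"[A-Z]",
}

def sup(value, char_type="", replacement=""):
    """Replace every group of characters of the given type."""
    pattern = SUP_TYPES.get(char_type)
    if pattern is None:
        return value                      # type not recognized: no change
    return re.sub("(?:%s)+" % pattern, replacement, value)

dashed = sup("sep 1999", "SPACE", "-")
digits_only = sup("sep 1999", "NNUM")
no_digits = sup("sep 1999", "NUM")
```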
LIM(n,side)/LIMW(str,side) - limit to n letters from L/R
default: LIM(0,) no change
default: LIMW(,R) no change
Limits the value in order to get the required number of characters by
cutting excess characters from either side.
LIMW removes the left or right side of the value, up to the string (str).
example | input | output
LIM(4,L) | sep_1999 | 1999
LIM(4,R) | sep_1999 | sep_
LIMW(_,R) | sep_1999 | sep_
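The LIM rows of the table can be reproduced with simple slicing. The sketch below assumes that the side parameter names the side from which excess characters are cut and that the underscore stands for a space; the function name lim is hypothetical and LIMW is not covered.

```python
def lim(value, n=0, side=""):
    """Limit value to n characters by cutting the excess from one side."""
    if n <= 0 or len(value) <= n:
        return value          # default LIM(0,) leaves the value unchanged
    if side == "L":
        return value[-n:]     # cut from the left, keep the last n characters
    if side == "R":
        return value[:n]      # cut from the right, keep the first n characters
    return value

cut_left = lim("sep 1999", 4, "L")
cut_right = lim("sep 1999", 4, "R")
```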
WORDS(n,side) - limit to n words from L/R
default: WORDS(0,R)
Keeps the number of words specified in the first parameter from either
side.
example | input | output
WORDS(1) | sep_1999 | 1999
WORDS(1,L) | sep_1999 | sep_
MINL(n) - exp. words shorter than n
default: MINL(1)
All words shorter than the limit specified in the parameter are removed
from the sentence.
Words with length exactly n are kept.
example | input | output
MINL(2) | History of Physics | History of Physics
MINL(3) | History of Physics | History Physics
MAXL(n) - exp. words longer than n
default: MAXL(0)
All words with more characters than the limit specified in
the parameter are removed. Words with length exactly n are kept.
example | input | output
MAXL(2) | History of Physics | of
MAXL(3) | History of Physics | of
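Both word-length filters can be sketched in a few lines. The functions minl and maxl are hypothetical names, and treating MAXL(0) as "no limit" is an assumption made here to match the stated default.

```python
def minl(value, n=1):
    """Drop words shorter than n; words of length exactly n are kept."""
    return " ".join(word for word in value.split() if len(word) >= n)

def maxl(value, n=0):
    """Drop words longer than n; words of length exactly n are kept."""
    if n <= 0:
        return value          # assumption: the default MAXL(0) changes nothing
    return " ".join(word for word in value.split() if len(word) <= n)

kept_all = minl("History of Physics", 2)
long_words = minl("History of Physics", 3)
short_words = maxl("History of Physics", 3)
```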
MINLW(n) - replacement of short values
default: MINLW(1) (no change)
The entire value is deleted if it is shorter than the specified limit.
This is used for the validation of created records, where we have 20
characters in the header.
The default validation is MINLW(21), i.e. the record entry will not
be considered valid unless it contains at least 21 characters including
the header. This default setting can be overridden by the -l command line option.
To increase the required length of the output line in the configuration
itself, apply the function to the total value:
AU::MINLW(25)---CER <:SYSNO:> AU L <:SURNAME:>, <:NAME:>
EXP(str,1|0) - exp./approve word containing specified string
default: EXP(,0) leave all value
The record is shortened by removing words depending on whether they contain
the specified string.
The second parameter states whether the string approves the word (0)
or disables it (1).
For example, to get the email address from the value, use the following:
example | input | output
EXP(@,0) | mail to: libdesk@cern.ch | libdesk@cern.ch
EXP(:,1) | mail to: libdesk@cern.ch | mail libdesk@cern.ch
EXP(@) | mail to: libdesk@cern.ch | libdesk@cern.ch
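The approve/disable logic of EXP can be sketched as a word filter. The name exp_words is hypothetical, and the handling of an empty string (value left unchanged) is an assumption based on the stated default.

```python
def exp_words(value, string="", mode=0):
    """Keep (mode 0) or drop (mode 1) the words containing the string."""
    if not string:
        return value                      # default EXP(,0): leave the value
    words = value.split()
    if mode == 0:
        kept = [word for word in words if string in word]
    else:
        kept = [word for word in words if string not in word]
    return " ".join(kept)

email = exp_words("mail to: libdesk@cern.ch", "@", 0)
without_colon = exp_words("mail to: libdesk@cern.ch", ":", 1)
```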
EXPW(type) - exp. word from value if containing spec. type
default: EXPW type not recognized
The sentence is shortened by replacing words containing specified type
of character.
Types supported in EXPW function:
ALPHA .. alphabetic
NALPHA .. not alphabetic
NUM .. numeric
NNUM .. not numeric
ALNUM .. alphanumeric
NALNUM .. non alphanumeric
LOWER .. lower case
UPPER .. upper case
PUNCT .. punctuation
NPUNCT .. non punctuation
Note: SPACE is not handled as a keyword, since all space characters
are considered as word separators.
example | input | output
EXPW(NNUM) | sep_1999 | 1999
EXPW(NUM) | sep_1999 | sep
IF(value,valueT,valueF) - replace T/F value
default: IF(,,)
Compares the value with the first parameter. In case the result is TRUE,
the input value is replaced with the second parameter, otherwise the input
value is replaced with the third parameter.
In case the input value has to be kept, whatever it is, the keyword
ORIG can be used (usually in the place of the third parameter)
example | input | output
IF(sep_1999,sep) | sep_1999 | sep
IF(oct_1999,oct) | sep_1999 |
IF(oct_1999,oct,ORIG) | sep_1999 | sep_1999
UP - upper case
Convert all characters to upper case
DOWN - lower case
Convert all characters to lower case
CAP - make capitals
Convert the initial character of each word to upper case
and the rest of characters to lower case
SHAPE - format string
Suppresses all invalid spaces
NUM - number
If it contains at least one digit, convert it into a
number by suppressing other characters. Leading zeroes are deleted.
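The NUM rule can be sketched as below. The function name num is hypothetical, and returning "0" for a value whose digits are all zeroes is an assumption, since the text does not cover that case.

```python
def num(value):
    """Reduce a value containing digits to a number without leading zeroes."""
    digits = "".join(ch for ch in value if ch.isdigit())
    if not digits:
        return value                      # no digit at all: value is kept
    return digits.lstrip("0") or "0"      # assumption: all zeroes give "0"

volume = num("vol. 007")
untouched = num("abc")
```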
SPLIT(n,h,str,from)
Splits the input value into more lines, where each line contains
at most (n+h+length of str) characters, (n) being the number of characters
following the number of characters in the header, specified in (h). The
header repeats at the beginning of each line. An additional string can
be inserted as a separator between the header and the following value.
This string is specified by the third parameter (str). It is possible to
restrict the application of (str) so it does not appear on the first line
by entering "2" for (from)
SPLITW(sep,h,str,from)
Splits the input value into more lines by replacing the line
separator stated in (sep) with CR/LFs. Also, as in the case of the SPLIT
function, the first (h) characters are taken as a header and repeat at
the beginning of each line. An additional string can be inserted
as a separator between the header and the following value. This string
is specified by the third parameter (str). It is possible to restrict the
application of (str) so it does not appear on the first line by entering
"2" for (from)
CONF(field,value,1/0) - confirm validity of a
field
The input value is taken as it is, or refused depending on
the value of some other field. In case the other (field) contains
the string specified in (value), then the input value is confirmed (1)
or refused (0).
CONFL(str,1|0) - confirm validity of a field
The input value is confirmed if it contains (1)/misses(0)
the specified string (str)
RANGE(from,to) - confirm only entries in the specified range
This function is used on the left side of a line in the target template
configuration section to select the desired entries from a repetitive field.
The range can only be continuous.
An entry is confirmed if its position falls into the range from-to
specified in the parameters, border values included. The keyword MAX
can be used as the upper limit.
This is useful in the case of the AU code, where the first entry has a different
definition from other entries:
AU::RANGE(1,1)---CER <:SYSNO:> AU2 L <:AU::SURNAME:>, <:AU::NAME:> ... takes the first name from the defined AU field
AU::RANGE(2,MAX)---CER <:SYSNO:> AU L <:AU::SURNAME:>, <:AU::NAME:> ... takes the rest of the names from the AU field
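The selection of entries by RANGE can be pictured as a filter on the 1-based position of each entry in the repetitive field. The helper in_range and the author names are hypothetical; the sketch only illustrates the inclusive borders and the MAX keyword.

```python
def in_range(position, start, end):
    """True if the 1-based position lies in [start, end]; end may be 'MAX'."""
    upper = float("inf") if end == "MAX" else int(end)
    return int(start) <= position <= upper

# Hypothetical repetitive AU field with three entries.
authors = ["Ellis, J", "Smith, A", "Jones, B"]
first = [a for i, a in enumerate(authors, 1) if in_range(i, 1, 1)]
rest = [a for i, a in enumerate(authors, 1) if in_range(i, 2, "MAX")]
```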
DEFP() - default print
The value is printed by default even if it does not contain any variable input from the source file.
4.4.2 Generated values
In the template configurations, values can be either taken from the source
or generated in the process itself. This is mainly useful for evaluating constant values.
Currently, the following date values are generated:
DATE(format,n)
default: DATE(,10)
where n is the number of digits required.
Generates the current date in the form given as a parameter. The format
has to be given according to the ANSI C notation, i.e. the string is composed
of the following components:
%a abbreviated weekday name
%A full weekday name
%b abbreviated month name
%B full month name
%c date and time representation
%d decimal day of month number (01-31)
%H hour (00-23)(24 hour format)
%I hour (01-12)(12 hour format)
%j day of year(001-366)
%m month (01-12)
%M minute (00-59)
%p local equivalent of a.m. or p.m.
%S second (00-59)
%U week number in year (00-53)(starting with Sunday)
%V week number in year
%w weekday (0-6)(starting with Sunday)
%W week number in year (00-53)(starting with Monday)
%x local date representation
%X local time representation
%y year (no century prefix)
%Y year (with century prefix)
%Z time zone name
%% %
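Python's time.strftime accepts the same ANSI C components, so the effect of a format string can be tried out as sketched below. The function name date is hypothetical, and reading n as "keep the first n characters" is an assumption about the second parameter.

```python
import time

def date(fmt, n=10, when=None):
    """Format a time with ANSI C components and keep the first n characters."""
    when = time.localtime() if when is None else when
    return time.strftime(fmt, when)[:n]

# A fixed moment (1970-01-01 00:00:00 UTC) keeps the example reproducible.
epoch = time.gmtime(0)
full_date = date("%Y-%m-%d", 10, epoch)
year_only = date("%Y-%m-%d", 4, epoch)
```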
WEEK(diff)
Enters the two-digit number of the current week (%V) increased
by specified difference.
If the resulting number is negative, the returned value is zero (00).
Values are kept up to 99, three digit values are shortened from the
left.
WEEK(-4) returns 48, if current week is 52
WEEK returns the current week
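The arithmetic described for WEEK can be sketched as follows. The function name week is hypothetical and the current week is passed in explicitly so that the example does not depend on the day it is run.

```python
def week(diff=0, current_week=1):
    """Two-digit week number shifted by diff, following the rules above."""
    result = current_week + diff
    if result < 0:
        return "00"                       # negative results give zero
    return ("%02d" % result)[-2:]         # three-digit values lose the left digit

four_weeks_ago = week(-4, 52)
wrapped = week(50, 52)
```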
SYSNO
Works the same as DATE, however the format of the resulting value is
fixed so it complies with the requirements of further record handling.
The format is 'whhmmss', where:
w current weekday
hh current hour
mm current minute
ss current second
A system number generated like this contains a variable value that
changes every second. Because the system number is an identifier of the record,
it must stay the same for the entire record being processed.
Unlike the function DATE, which simply generates a value in the given format,
SYSNO keeps the value persistent throughout the entire record and excludes collisions
with other records generated within a period of one week, with one-second granularity.
For this reason the DATE function cannot be used to generate a system number instead.
According to the current definition, the system number is unique within
a range of one week only.
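The 'whhmmss' layout can be sketched with time.strftime. The function name sysno is hypothetical, and using %w (weekday 0-6, starting with Sunday) for the leading w is an assumption. To keep the value persistent throughout a record, it would be computed once per record and reused.

```python
import time

def sysno(when=None):
    """Return a 'whhmmss' string: weekday, hour, minute, second."""
    when = time.localtime() if when is None else when
    return time.strftime("%w%H%M%S", when)

# 1970-01-01 00:00:00 UTC was a Thursday, i.e. weekday 4 when Sunday is 0.
fixed = sysno(time.gmtime(0))
```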
OAI
Inserts an OAI identifier, incremented by one for each record.
The starting value used in the first record of the batch job can be specified on the command line using the -o<starting_value> option.
diff --git a/modules/bibedit/doc/admin/guide.html.wml b/modules/bibedit/doc/admin/guide.html.wml
index d70e8a41e..4c2a4f89e 100644
--- a/modules/bibedit/doc/admin/guide.html.wml
+++ b/modules/bibedit/doc/admin/guide.html.wml
@@ -1,185 +1,187 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "cdspage.wml" \
title="BibEdit Admin Guide" \
navtrail_previous_links="/admin/> > /admin/bibedit/>BibEdit Admin" \
navbar_name="admin" \
navbar_select="bibedit-admin-guide"
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
+
Contents
1. Overview
2. Edit records via Web interface
3. Edit records via command line
4. Delete records via command line
1. Overview
BibEdit enables you to directly manipulate bibliographic data, edit a
single record, do global replacements, and other cataloguing tasks.
2. Edit records via Web interface
The Bibliographic Metadata Editor on the Web is not implemented yet.
Please use the command-line technique described below.
3. Edit records via command line
The idea is to download the record in XML MARC format, edit it by using
any editor, and upload the changes back. Note that you can edit any
number of records at the same time: for example, you can download all
records written by Qllis, J, open the file in your
favourite text editor, and change globally the author name to the
proper form Ellis, J.
Proceed as follows:
- Download the record in XML MARC. For example, download record ID 1234:
$ wget -O z.xml 'http://your.site/search.py?recid=1234&of=xm'
or download the latest 5,000 public documents written by Qllis, J:
$ wget -O z.xml 'http://your.site/search.py?p=Qllis%2C+J&f=author&of=xm&rg=5000'
- Edit the metadata as necessary:
$ emacs z.xml
- Upload changes back:
$ bibupload -r z.xml
- See the progress of the treatment of the file via BibSched:
$ bibsched
If you do not want to wait for the next wake-up time of indexing
and formatting daemons, launch them manually now:
$ bibindex
$ bibreformat
$ webcoll
and watch the progress via bibsched.
After which the record(s) should be fully modified and formatted and
all indexes and collections updated, as necessary.
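For larger batches, the edit step can be scripted instead of done by hand in an editor. The sketch below is one possible way to rename an author in a downloaded file; it assumes XML MARC without namespaces (as in the examples of this guide) and that author names sit in subfield "a" of tags 100 and 700, which may differ on your site. The function name rename_author and the sample record are hypothetical.

```python
import xml.etree.ElementTree as ET

def rename_author(xml_text, old, new, tags=("100", "700")):
    """Replace an author name in subfield 'a' of the given datafield tags."""
    root = ET.fromstring(xml_text)
    changed = 0
    for field in root.iter("datafield"):
        if field.get("tag") not in tags:
            continue
        for sub in field.findall("subfield"):
            if sub.get("code") == "a" and sub.text == old:
                sub.text = new
                changed += 1
    return ET.tostring(root, encoding="unicode"), changed

# Hypothetical sample; in practice this would be the content of z.xml.
sample = (
    '<collection><record>'
    '<controlfield tag="001">1234</controlfield>'
    '<datafield tag="100" ind1="" ind2="">'
    '<subfield code="a">Qllis, J</subfield>'
    '</datafield>'
    '</record></collection>'
)
fixed_xml, count = rename_author(sample, "Qllis, J", "Ellis, J")
```

The result would then be written back to the file and uploaded with bibupload -r as described above.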
4. Delete records via command line
Once a record has been uploaded, we prefer not to *destroy* it fully
anymore (i.e. to wipe it out and to reuse its record ID for another
record) for a variety of reasons. For example, some users may have
put this record already into their baskets in the meantime, or the
record might have been already announced by alert emails to the
external world, or the OAI harvesters might have harvested it already,
etc. We usually prefer only to *mark* records as deleted, so that our
record IDs are ensured to stay permanent.
That said, the canonical way to delete a record ID 1234 in the CDSware
v0.1.x development branch is to download its XML MARC:
$ wget -O z.xml 'http://your.site/search.py?recid=1234&of=xm'
and to mark it as deleted by adding the indicator ``DELETED'' into the
MARC 980 $$c tag:
$ vi z.xml
[...]
<datafield tag="980" ind1="" ind2="">
<subfield code="a">PREPRINT</subfield>
<subfield code="c">DELETED</subfield>
</datafield>
[...]
and upload the modified record in the `replace' mode:
$ bibupload -r z.xml
and watch the progress via bibsched, as mentioned in
section 3.
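The marking step can also be scripted. The sketch below adds the DELETED indicator to the 980 $$c subfield of a single record, creating the 980 datafield if it is missing. It assumes XML MARC without namespaces, as in the example above; the function name mark_deleted is hypothetical.

```python
import xml.etree.ElementTree as ET

def mark_deleted(record_xml):
    """Add <subfield code="c">DELETED</subfield> to the 980 datafield."""
    record = ET.fromstring(record_xml)
    field = None
    for candidate in record.findall("datafield"):
        if candidate.get("tag") == "980":
            field = candidate
            break
    if field is None:
        field = ET.SubElement(
            record, "datafield", {"tag": "980", "ind1": "", "ind2": ""})
    already = any(sub.get("code") == "c" and sub.text == "DELETED"
                  for sub in field.findall("subfield"))
    if not already:
        sub = ET.SubElement(field, "subfield", {"code": "c"})
        sub.text = "DELETED"
    return ET.tostring(record, encoding="unicode")

original = (
    '<record><controlfield tag="001">1234</controlfield>'
    '<datafield tag="980" ind1="" ind2="">'
    '<subfield code="a">PREPRINT</subfield>'
    '</datafield></record>'
)
marked = mark_deleted(original)
```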
This procedure will remove all necessary entries from the words index
space, the collection cache space, etc, so that the record will not be
findable anymore from the search interface by usual means. But, the
record HTML brief and detailed displays will remain untouched, so that
the record will still be shown to the end users as it used to be when
they access their baskets, or when they access it via the direct URL
distributed by the alert engine (search.py?recid=1234).
In some cases this may not be what is wanted. For example you may
want to warn the users that the record has been deleted and hide its
old contents. To do this, just modify the contents of the other MARC
tags as appropriate, for example you can remove everything and leave
only a title warning:
$ cat z.xml
<record>
<controlfield tag="001">1234</controlfield>
<datafield tag="245" ind1="" ind2="">
<subfield code="a">The record has been deleted</subfield>
</datafield>
<datafield tag="980" ind1="" ind2="">
<subfield code="c">DELETED</subfield>
</datafield>
</record>
so that the end users would see a message ``The record has been
deleted'' instead of the usual title, authors, and stuff in their
baskets.
P.S. Note that the ``bibXXx'' tables will keep having entries for the
deleted records. These entries are to be cleaned from time to
time by the BibEdit garbage collector. This GC isn't part of
CDSware yet; moreover in the future we plan to abolish all the
bibXXx tables, so that this won't be necessary anymore.
P.S. If you want to wipe out all the existing bibliographic content of
your site, for example to start uploading the documents from
scratch again, you can launch:
$ /path/to/your/cdsware/bin/dbexec < /path/to/your/cdsware-source/modules/miscutil/sql/tabbibclean.sql
$ /path/to/your/cdsware/bin/webcoll
diff --git a/modules/bibformat/doc/admin/guide.html.wml b/modules/bibformat/doc/admin/guide.html.wml
index c119c7598..a4c9808ff 100644
--- a/modules/bibformat/doc/admin/guide.html.wml
+++ b/modules/bibformat/doc/admin/guide.html.wml
@@ -1,2488 +1,2490 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "cdspage.wml" \
title="BibFormat Admin Guide" \
navtrail_previous_links="/admin/> > /admin/bibformat/>BibFormat Admin" \
navbar_name="admin" \
navbar_select="bibformat-guide"
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
+
Contents
1. Overview
2. Configuring BibFormat
3. Running BibFormat
3.1 From Web interface
3.2 From the command-line interface
4. Detailed Configuration Manual
1. Overview
The BibFormat admin interface enables you to specify how the
bibliographic data is presented to the end user in the search
interface and search results pages. For example, you may specify that
titles should be printed in bold font, the abstract in small italic,
etc. Moreover, the BibFormat is not only a simple bibliographic data
output formatter, but also an automated link
constructor. For example, from the information on journal name
and pages, it may automatically create links to publisher's site based
on some configuration rules.
2. Configuring BibFormat
By default, a simple HTML format based on the most common fields
(title, author, abstract, keywords, fulltext link, etc) is defined.
You will certainly want to define your own output formats in case you have a
specific metadata structure.
Here is a short guide of what you can configure:
- Behaviours
- Define one or more output BibFormat behaviours. These are then
passed as parameters to the BibFormat modules while executing
formatting.
Example: You can tell BibFormat that it has to enrich the
incoming metadata file by the created format, or that it only has to
print the format out.
- Extraction Rules
- Define how the metadata tags from input are mapped into internal
BibFormat variable names. The variable names can afterwards be used
in formatting and linking rules.
Example: You can tell that the 100 $a field
should be mapped into the $100.a internal variable that you
could use later.
- Link Rules
- Define rules for automated creation of URI links from mapped
internal variables.
Example: You can define a rule for creating a link to the
People database out of the $100.a internal variable
representing the author's name. (The $100.a variable was mapped
in the previous step, see the Extraction Rules.)
- File Formats
- Define file format types based on file extensions. This will be
used when proposing various fulltext services.
Example: You can tell that *.pdf files will
be treated as PDF files.
- User Defined Functions (UDFs)
- Define your own functions that you can reuse when creating your
own output formats. This enables you to do complex formatting without
ever touching the BibFormat core code.
Example: You can define a function that matches and
extracts email addresses out of a text file.
- Formats
- Define the output formats, i.e. how to create the output out of
internal BibFormat variables that were extracted in a previous step.
This is the functionality you would want to configure most of the
time. It may reuse formats, user defined functions, knowledge bases,
etc.
Example: You can tell that authors should be printed in
italic, that if there are more than 10 authors only the first three
should be printed, etc.
- Knowledge Bases (KBs)
- Define one or more knowledge bases that enable you to transform
various forms of input data values into the unique standard form on
the output.
Example: You can tell that Phys Rev D and
Physical Review D are both the same journal and that these
names should be standardized to Phys Rev : D.
- Execution Test
- Enables you to test your formats on your sample data file. Useful
when debugging newly created formats.
To learn more on BibFormat configuration, you can consult the BibFormat Admin Guide.
3. Running BibFormat
3.1. From the Web interface
Run the Reformat Records tool.
This tool permits you to update stored formats for bibliographic records.
It should normally be used after configuring BibFormat's
Behaviours and Formats.
When these are ready, you can choose to rebuild formats for selected
collections or you can manually enter a search query and the web interface
will accomplish all necessary formatting steps.
Example: You can request Photo collections to have their HTML
brief formats rebuilt, or you can reformat all the records written by Ellis.
3.2. From the command-line interface
Consider having an XML MARC data file that is to be uploaded into
the CDSware. (For example, it might have been harvested from other
sources and processed via BibConvert.)
Having configured BibFormat and its default output type behaviour, you
would then run this file through BibFormat as follows:
$ bibformat < /tmp/sample.xml > /tmp/sample_with_fmt.xml
that would create default HTML formats and would "enrich" the input
XML data file by this format. (You would then continue the upload
procedure by calling successively BibUpload and BibWords.)
Now consider a different situation. You would like to add new
formats, say "HTML portfolio" and "HTML captions", in order to
nicely format multiple photographs in one page. Let us suppose that
these two formats are called hp and hc and
are already loaded in the collection_format table.
(TODO: describe how this is done via WebAdmin.) You would then
proceed as follows: firstly, you would prepare the corresponding output
behaviours called HP and HC (TODO: note the uppercase!) that would not enrich
the input file but that would produce an XML file with only
001 and FMT tags. (This is in order not to
update the bibliographic information but the formats only.) You would
also prepare corresponding formats
at the same time. Secondly, you would launch the formatting as
follows:
$ bibformat otype=HP,HC < /tmp/sample.xml > /tmp/sample_fmts_only.xml
that should give you an XML file containing only 001 and FMT tags.
Finally, you would upload the formats:
$ bibupload < /tmp/sample_fmts_only.xml
and that's it. The new formats should now appear in WebSearch.
4. Detailed Configuration Manual
What follows is a transcription of an old
FlexElink Configuration Manual v0.3 (2002-07-31). The text suffers
from low HTML quality and missing screen snapshots. The terminology may not be fully up-to-date
in places.
1. - About BibFormat
2. - How it works?
3. - A first look at the web configuration interface
4. - Mapping the input (OAI Extraction Rules)
5. - Defining output types: Behaviors
6. - Formats
7. - Knowledge bases (KBs)
8. - User Defined Functions (UDFs)
9. - Defining links
9.1. - EXTERNAL link conditions
9.2. - INTERNAL link conditions
9.3. - Example
10. - User management
11. - Evaluation Language Reference
1. - About BibFormat
BibFormat is a piece of software that is part of the CERN Document Server
(CDS, http://www.cds.ch) and, more specifically, of the CDS Search module
(http://weblib.cern.ch).
Its mission, in a few words, is to provide a flexible mechanism to format the
bibliographic records that are shown as a result of CDS Search user queries,
allowing administrators or users to customize how they are displayed. It also
offers a linking system that can automatically generate all the
links included in the displayed records (fulltext access, electronic journals
reference, etc), considerably reducing maintenance.
To clarify this rather formal definition, we'll try to illustrate the role of
BibFormat inside the CDS Search module with the following figure. Please note
that this drawing shows only the main role that BibFormat plays in the CDS
structure and is quite simplified; the underlying logic is of course a bit more
complex.
As you can see, when a user query is received, Weblib determines which records
from the database match it; then it asks BibFormat to format the obtained
records. BibFormat looks at its rule repository and for each record determines
which format has to be taken, applies the format specification and resolves the
possible links. It gives all this (in a formatted way) back to Weblib, which
makes a nice HTML page including the formatted results given by BibFormat among
other info.
The good point in all this is that anyone who has access to the BibFormat rule
repository is able to modify the final appearance of a query result in the CDS
Search module without altering the logic of the search engine.
In order to be able to modify this BibFormat rule repository, a web
configuration interface is provided. Throughout this paper, we'll try to
explain (in a friendly way and from the user's point of view) how to access
this interface, how it's structured and how to configure BibFormat through it
to achieve the desired results.
2. - How it works?
We've outlined the role of BibFormat inside the CDS, so it's time now to have
an overview of how it works and how it's organized. We'll try not to be very
technical; however, a little explanation of the BibFormat repository and
architecture is needed to understand how it works.
BibFormat, basically, takes some bibliographic records as input and produces a
formatted & linked version of them as output. By "formatted" we mean that
BibFormat can produce an output containing a transformed version of the input
data (normally an HTML view); the good part is that you can entirely specify
the transformation to apply. At the same time, by "linked" we mean that you can
ask BibFormat to include (if necessary) inside this formatted version
references to Internet resources related to the data, derived from some
pre-configured rules.
As an example, imagine that you want the records resulting from CDS Search
queries to show their title in bold followed by their authors separated by
commas. To achieve this you'll have to go to the BibFormat configuration
interface and define a behavior for BibFormat in which you describe how to
format incoming records:
"<b>"
$title "</b>"
forall($author){
$author separator(", ")
}
Figure 1.- A very first Evaluation Language example
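For readers more familiar with general-purpose languages, the same formatting intent can be pictured in Python as below. This is only an analogy and not how BibFormat evaluates its language; the function name format_record and the sample values are hypothetical, and the exact whitespace BibFormat emits between the title and the authors is not specified here.

```python
def format_record(title, authors):
    """Title in bold, followed by the authors separated by ', '."""
    return "<b>" + title + "</b>" + ", ".join(authors)

line = format_record("History of Physics", ["Ellis, J", "Smith, A"])
```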
Don't be scared! This is a
first look at the way BibFormat allows you to describe formats. As you can
see, BibFormat uses a special language that you'll have to learn if you want to
specify formats or links; it may look as difficult as a
programming language, but you'll see that it's much easier than it seems at
first sight.
The next figure shows
how BibFormat works internally. When BibFormat is called, it receives a
set of bibliographic records to format. It separates each record and translates
it into a set of what we call "internal variables"; these are simply an
internal representation of the bibliographic record, and the important thing
about them is that they are available when you describe the formats. Once it
has these internal variables, the processor module looks in the behavior
repository for the behavior (let's say format) you've asked BibFormat to apply
(when BibFormat is called, you can indicate which of the pre-configured
behaviors to apply, so more than one behavior can be defined). Inside this
behavior you can specify which data you want to appear, how it has to appear,
some links if they exist… in other words, the format (actually, it's something
more than a format: it describes how BibFormat has to behave for a given input,
which is why we refer to it as a behavior). As we've already said, you can
include links in a behavior specification; links are a special BibFormat
feature that helps you reduce the maintenance of your formats, because the same
link can be included in several formats or behaviors.
The picture below illustrates this explanation.
To summarize, BibFormat can
transform an input made up of bibliographic records into an HTML output (or
any other text-based output) according to certain pre-configured
specifications (behaviors) that you can define entirely using a dedicated
language.
Note that BibFormat currently
takes OAI MARC XML as the format for input records, but it
can be adapted to other kinds of input (reading a database, a function call, etc.)
with a little development.
3. - A first look at the web configuration interface
BibFormat is configured
through a configuration interface that is accessible via the web. It's
made up of a set of web pages that present the main configuration aspects
of BibFormat and allow you to change them. In this section we are going to have
a first look at this web interface, how it's structured and how it corresponds
to BibFormat features.
Before entering
these web pages you'll be asked for your username & password.
Only certain users are allowed to access the BibFormat WI: first you need a CDS
account, which you can create easily by using the standard CDS account manager;
then you have to ask the BibFormat administrator to give you privileges to
access the WI.
Once your password is accepted you'll access
the configuration interface. You'll see that it is quite simple: it's structured
in different sections, each of which corresponds to a BibFormat feature, and you
can navigate through them by using a navigation bar that is always present on
the left.
Here is a list
of the different sections the interface offers and their correspondence
with BibFormat features:
· Behaviors: This is the main section, the one you enter by default when you
access the web interface. It contains the definitions of the different
pre-configured output types or behaviors, which let you define how you want
BibFormat to behave when each output type is selected. More information in
chapter Defining output types: Behaviors of this manual.
· OAI Extraction Rules: The input types and mapping rules for OAI MARC XML
inputs are defined here. You'll find here the information about all the
internal variables and their correspondence with the input XML tags. See
chapter Mapping the input of this manual for more information.
· Link Rules: Gives access to the link rules repository, where you define
the way links are generated. See chapter Defining links for a more detailed
description of the BibFormat linking system.
· UDFs: Presents a list of all the User Defined Functions (UDFs) that you can
use inside the Evaluation Language (EL) statements used for specifying
different configuration aspects. You can also modify or extend this list
within this section. Everything about using UDFs and defining new ones is in
chapter User Defined Functions (UDFs).
· Formats: Another EL feature: you can define a piece of EL code under a name
and re-use it whenever you want. See chapter Formats.
· KBs: A complete management interface for Knowledge Bases (KBs); these KBs
are also available inside EL statements. See chapter Knowledge bases (KBs)
for more specific information.
· Execution Test: Lets you execute BibFormat from this section and view the
results and some debug info in a web page. You have to specify an input data
file (through a URL).
· User management: Lets you define which CDS users can access the BibFormat
web interface.
Each section has
its own particularities, but the way of working with them follows a common
pattern throughout the interface. Each section, with its common and
particular characteristics, is covered in the following chapters of this
manual.
4. - Mapping the input (OAI Extraction Rules)
We have already spoken a bit about BibFormat internal variables. These are
key to understanding the way BibFormat works. As you know, BibFormat takes
some bibliographic records as input and, according to some pre-configured
behavior, formats them into HTML, for example. The problem is that these input
records can come in several formats: different XML conventions, database
records, etc. For now, at CDS we only consider input in OAI MARC XML, but in
the near future we may have to extend BibFormat to accept other input formats.
That's the reason why internal variables exist: they provide a common way to
refer to input data without relying on any concrete format. In other words, we
define BibFormat links and behaviors by referring to these internal variables,
and we have some rules that define how to map an input format to them, so any
behavior defined in BibFormat can be used with any input that can be mapped to
internal variables.
You shouldn't worry too much about this because it belongs more to the
development/administration side, but it's important to know where internal
variables come from and what they refer to. Besides, for CDS we only consider
incoming data in OAI MARC XML format, so we'll talk only about this case.
An internal variable is quite a simple concept: it's just a label that
represents some values from the input. A variable can also have fields,
which are labels that represent values from the input too, but which are
related to each other under the variable (e.g. you can have one variable that
maps authors and another that maps authors' home institutes independently; but
if you want to represent an author together with his home institute you need
to relate these two values in some way, and fields of a single variable do
that). Variables and their fields also support multiple values.
Focusing on OAI MARC XML, the concept of variable and field is already present
in the input structure:
· Each occurrence of an OAI MARC XML varfield element corresponds to a
different variable value.
· Each occurrence of an OAI MARC XML subfield inside a certain varfield
element corresponds to a different field value of the variable that maps the
varfield.
So what we have in BibFormat is a set of rules that tell which varfield
element each variable name corresponds to, and which subfield element each
variable field name maps. Through the web interface you can add or delete
variables and their fields, and you can even modify the mapping tags of
variables (this way you can keep your formats independent of changes in the
meaning of MARC tags).
In the web interface, all this is located in the OAI Ext. Rules section, as
you can see in the following figure:
Let's illustrate how BibFormat maps a certain input to variables
and fields with an example:
We have this variable & field definition on BibFormat:
Var. label | Mapping tag | Mult. V. | Fields (field label: mapping tag)
100 | <varfield id="100" i1="" i2=""> | Yes | a: <subfield label="a">; e: <subfield label="e">
909C0 | <varfield id="909" i1="C" i2="0"> | No | b: <subfield label="b">
And then a record like the following arrives as input:
<oai_marc>
  <varfield id="037" i1="" i2="">
    <subfield label="a">SCAN-0009119</subfield>
  </varfield>
  <varfield id="100" i1="" i2="">
    <subfield label="a">Racah, Giulio</subfield>
  </varfield>
  <varfield id="100" i1="" i2="">
    <subfield label="a">Guignard, G</subfield>
    <subfield label="e">editor</subfield>
  </varfield>
  <varfield id="909" i1="C" i2="0">
    <subfield label="b">11</subfield>
  </varfield>
  <varfield id="909" i1="C" i2="0">
    <subfield label="b">12</subfield>
  </varfield>
</oai_marc>
The result of the mapping would be like this:
Variable | Value# | Field | Value
"100" | 0 | a | Racah, Giulio
"100" | 1 | a | Guignard, G
"100" | 1 | e | editor
"909C0" | 0 | b | 12
Notice how varfield 037 is not considered because there is no entry for it in
the BibFormat configuration. Also notice how the values are created: if "allow
multiple values" is set to "Yes", each occurrence of a varfield element
produces a new value (variable "100"); otherwise, the last occurrence is taken
as the single value of the variable (variable "909C0").
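The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BibFormat's actual implementation: the rule table `RULES` and the function `map_record` are hypothetical names, and the record is parsed with the standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table mirroring the example definition above:
# variable label -> (varfield id, i1, i2, multiple values allowed, field labels)
RULES = {
    "100": ("100", "", "", True, ["a", "e"]),
    "909C0": ("909", "C", "0", False, ["b"]),
}

RECORD = """<oai_marc>
  <varfield id="037" i1="" i2=""><subfield label="a">SCAN-0009119</subfield></varfield>
  <varfield id="100" i1="" i2=""><subfield label="a">Racah, Giulio</subfield></varfield>
  <varfield id="100" i1="" i2="">
    <subfield label="a">Guignard, G</subfield>
    <subfield label="e">editor</subfield>
  </varfield>
  <varfield id="909" i1="C" i2="0"><subfield label="b">11</subfield></varfield>
  <varfield id="909" i1="C" i2="0"><subfield label="b">12</subfield></varfield>
</oai_marc>"""


def map_record(xml_text, rules):
    """Return {variable label: [{field label: value}, ...]} for one record."""
    variables = {}
    for varfield in ET.fromstring(xml_text).iter("varfield"):
        key = (varfield.get("id"), varfield.get("i1", ""), varfield.get("i2", ""))
        for label, (vid, i1, i2, multiple, fields) in rules.items():
            if key != (vid, i1, i2):
                continue  # no rule for this varfield: it is ignored
            value = {
                sub.get("label"): (sub.text or "")
                for sub in varfield.iter("subfield")
                if sub.get("label") in fields
            }
            if multiple:
                variables.setdefault(label, []).append(value)
            else:
                variables[label] = [value]  # the last occurrence wins
    return variables


mapped = map_record(RECORD, RULES)
print(mapped)
```

Here variable "100" ends up with two values and variable "909C0" keeps only the value from the last matching varfield, as in the table above.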
5. - Defining output types: Behaviors
Now that we know how internal variables are structured and what they
represent in the input, it's time to have a look at how to configure BibFormat
to transform the input data mapped into variables into HTML results (although
any text-based output could be generated).
When BibFormat is asked to format a set of bibliographic records, it is also
necessary to specify which output type it has to use. This output type is a
string that identifies a pre-configured set of conditions and actions that
tell BibFormat how to behave with the given input data (that's why the terms
output type and behavior are used interchangeably throughout this document).
BibFormat can have several pre-configured behaviors, each one identified by a
different label. There are two different types of behaviors (you choose the
behavior type when you define it):
1. Normal → A behavior that outputs exactly the result of its evaluation.
2. Input Enrich (only for XML inputs) → It echoes each XML record from the
input, inserting the behavior result just before the closing XML element of
the record.
Each behavior contains an ordered list of conditions; a condition can have
zero or more associated actions (actions are ordered inside a condition). A
condition is a behavior item described by an Evaluation Language (EL)
expression that results in "TRUE" or "FALSE". An action is an EL statement
that produces some output.
When BibFormat is called to format a set of input records with a given
behavior label, it looks up the conditions of that behavior. It evaluates
their EL in order and, when one of them produces "TRUE" as its result, it
looks up the actions associated with that condition. Then BibFormat evaluates
those actions in the specified order and concatenates their results.
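The evaluation order described above can be sketched as follows. This is a minimal illustration, not BibFormat code: plain Python callables stand in for EL conditions and actions, the names `evaluate_behavior` and `example_behavior` are hypothetical, and returning an empty string when no condition holds is an assumption of the sketch.

```python
def evaluate_behavior(conditions, record):
    """Apply the first condition that holds and concatenate its actions' output.

    `conditions` is a list of (order, predicate, actions) tuples; predicates and
    actions are Python callables standing in for EL expressions.
    """
    for _order, predicate, actions in sorted(conditions, key=lambda c: c[0]):
        if predicate(record):
            return "".join(action(record) for action in actions)
    return ""  # assumption: no output when no condition holds


# A hypothetical behavior: a special case (order 10) and a catch-all (order 50).
example_behavior = [
    (50, lambda rec: True, [lambda rec: "<b>" + rec["title"] + "</b>"]),
    (10, lambda rec: rec.get("base") == "27",
     [lambda rec: "<b>" + rec["title"] + "</b>", lambda rec: " - " + rec["std"]]),
]

out_special = evaluate_behavior(example_behavior, {"title": "T", "base": "27", "std": "ISO 1"})
out_default = evaluate_behavior(example_behavior, {"title": "T", "base": "11"})
print(out_special)
print(out_default)
```

The conditions are sorted by their order number before evaluation, so the position in which they were entered does not matter.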
By using different conditions you can specify alternative formats inside a
behavior (imagine that you want to format a record differently depending on
its base number). It's true that you could also reach this result by using EL
IF statements, but separate conditions are clearer, more efficient and more
re-usable (you can change one condition without touching the rest, or you can
give it more priority than others, that is, the chance to be evaluated before
others, by changing its apply order).
Actions are used for specifying the format itself, or whatever you want to be
carried out when the condition is fulfilled.
Through the web interface you can define new output types or modify the ones
that already exist. It's quite easy to use: you just have to select, on the
desired item, the link for the operation you want to perform on it.
Let's have a look at a simple example to illustrate how to define a behavior
that fits our needs:
Imagine a typical case where you want to format bibliographic records, but
apply different formats depending on their base number. Whenever a record from
base 27 (standards) arrives we want to show only its title and the standard
numbers; otherwise a default format is applied in which the title and authors
are shown. We'll assume CDS variable notation and that the input rules are
defined properly.
We are going to define a new NORMAL behavior for this situation; let's call it
SIMPLE. In it we'll need two conditions: one for applying the default format
and another one for the special base 27 format. The base number comes in
variable 909C0.b, so the conditions are based on the content of this variable.
The resulting behavior should be defined like this:
SIMPLE (NORMAL)
Eval. order | Condition | Action
10 | $909C0.b="27" | "<b>"$245.a"</b>" forall($0248.a){ rep_prefix(" - ") $0248.a separator("; ") }
50 | ""="" | "<b>"$245.a"</b>" forall($100.a){ rep_prefix(" - Authors: ") $100.a separator("; ") }
Some explanations of this example are needed:
· As you can see, we have defined two conditions: one for the base 27 format
and another for the default format. The important point is the order in which
we put the conditions: for each record in the input the special one is
evaluated first (because it has a lower evaluation number, 10) and if the
condition is true its format is applied; if the base is not 27 the default
condition is evaluated and, because its condition EL code is always true, the
default is used to format the record.
· Don't worry too much about the action code because it's quite simple. There
are some "strange" things like the use of the functions rep_prefix and
separator. These are special UDFs that have a special behavior inside a FORALL
statement:
o rep_prefix → Prints the string argument only in the first iteration of a
FORALL. In other words, it puts a prefix before the string generated by the
FORALL statement.
o separator → Prints the string argument in every FORALL iteration except the
last one.
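The effect of rep_prefix and separator inside a FORALL can be imitated in Python. This is only a sketch of the semantics described above, not of how BibFormat implements them: the function names mirror the EL ones, but passing the first/last flags explicitly is an assumption made to keep the example self-contained.

```python
def forall(values, body):
    """Evaluate `body` once per value and concatenate the results."""
    last = len(values) - 1
    return "".join(body(v, i == 0, i == last) for i, v in enumerate(values))


def rep_prefix(text, first):
    """Print `text` only in the first iteration."""
    return text if first else ""


def separator(text, last):
    """Print `text` in every iteration except the last one."""
    return "" if last else text


authors = ["Racah, Giulio", "Guignard, G"]
line = forall(
    authors,
    lambda a, first, last: rep_prefix(" - Authors: ", first) + a + separator("; ", last),
)
print(line)
```

With an empty list of values the body is never evaluated, so neither the prefix nor any separator is printed.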
6. - Formats
Formats are a special construction that the BibFormat Evaluation Language (EL)
offers. It allows you to group some EL code under an identifier and then call
it from any EL statement.
You can manage these formats using the web interface. It is quite easy to do
so: when you access the Formats section it presents a list of all the format
identifiers that are already defined, with a short description of what each
format is for. From there you can see the whole EL code by using the link
[Code]. You can add a new format by using the set of input boxes that you'll
find at the end of the page. Delete and modify operations are also possible
for already defined formats.
Note: When defining formats, one has to pay attention not to use "recursive"
format calls (either direct or indirect); this can lead to execution problems.
For example, imagine that we have a format called "ex_1" that calls itself:
Format "ex_1" | "hello world" format("ex_1")
…this is a "direct" recursive call; you should never end up with this kind of
call, as the web interface should warn you if it finds this kind of problem.
However, "indirect" calls are not detected by the web interface, so you have
to take care of them yourself. One example of "indirect" recursion:
Format "ex_1" | "hello world" format("ex_2")
Here the recursion is indirect if format "ex_2" in turn calls format("ex_1").
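Since indirect recursion is not detected by the web interface, it can help to check the format definitions yourself. The sketch below assumes you have already extracted, for each format, the names it calls through format("..."); that call table and the function `find_format_cycle` are hypothetical and not part of BibFormat.

```python
def find_format_cycle(calls, start):
    """Return the chain of format names forming a cycle reachable from `start`,
    or None if there is none.

    `calls` maps a format name to the names it invokes through format("...").
    """
    path = []

    def visit(name):
        if name in path:
            return path[path.index(name):] + [name]
        path.append(name)
        for callee in calls.get(name, []):
            cycle = visit(callee)
            if cycle:
                return cycle
        path.pop()
        return None

    return visit(start)


# Hypothetical call tables: an indirect recursion, a direct one, and a safe one.
indirect = find_format_cycle({"ex_1": ["ex_2"], "ex_2": ["ex_1"]}, "ex_1")
direct = find_format_cycle({"ex_1": ["ex_1"]}, "ex_1")
safe = find_format_cycle({"ex_1": ["ex_2"], "ex_2": []}, "ex_1")
print(indirect, direct, safe)
```

This is a plain depth-first search, which is enough for the small number of formats a typical configuration holds.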
7. - Knowledge bases (KBs)
This is yet another special feature provided by the BibFormat Evaluation
Language. In a few words, it allows you to map one string value to another
according to a pre-stored set of key values that map to other values (the
knowledge bases). Every knowledge base is identified by a label that has to be
unique among KB identifiers; remember that identifiers are not case-sensitive.
These sets of values used to live in a file, but with this new development
there was a need for easy KB management integrated in BibFormat. For this
reason, you can manage KBs from the BibFormat configuration interface, in the
KBs section.
When you access the KBs section, the list of all defined KB identifiers is
displayed. Below it you'll find a set of controls for adding new KBs. These
controls work as elsewhere in the interface, with one special point: normally
you shouldn't fill in the input box that asks for the Knowledge base table
name. All the knowledge base data is handled by a database in which each KB
corresponds to a DB table, and this input box holds the internal table name for
that KB; normally the KB manager generates it for you, so you shouldn't need
to use it.
Each KB has a link for accessing the list of values that it contains. If you
click on it, a new window shows the list of current values (keys and mapped
values) and a very easy interface for adding new values or deleting existing
ones (KB values are case-sensitive).
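The two case rules mentioned above (KB identifiers ignore case, KB values do not) can be illustrated with a small in-memory model. This is a sketch only: the class `KnowledgeBases`, the sample data and the empty-string result for a missing key are assumptions, and the real KBs live in database tables.

```python
class KnowledgeBases:
    """In-memory stand-in for the KB tables (illustration only)."""

    def __init__(self):
        self._tables = {}

    def add(self, kb_id, mapping):
        # Identifiers are not case-sensitive, so store them lower-cased.
        self._tables[kb_id.lower()] = dict(mapping)

    def lookup(self, kb_id, key, default=""):
        # Keys are matched exactly (case-sensitive). Returning `default` for a
        # missing key is an assumption of this sketch.
        return self._tables[kb_id.lower()].get(key, default)


kbs = KnowledgeBases()
kbs.add("Languages", {"fr": "French", "en": "English"})
found = kbs.lookup("LANGUAGES", "fr")
missed = kbs.lookup("languages", "FR")
print(repr(found), repr(missed))
```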
8. - User Defined Functions (UDFs)
User Defined Functions (UDFs) are one of the most powerful features of the
BibFormat Evaluation Language (EL). The idea is that inside EL you can use
operations or functions over strings. Formatting normally requires a large
number of different string transformations, but we cannot expect to implement
all these operations inside EL, because the set keeps growing and new needs
appear all the time. To deal with this problem, BibFormat defines a mechanism
that allows you to define as many functions (UDFs) as you want and use them
inside any EL statement.
These functions are identified by a unique name and receive the data they
operate on as parameters. They are written in a programming language (PHP), so
a good knowledge of this language is needed.
BibFormat offers complete UDF management through the UDFs section of the web
interface. There you'll see a list of all defined UDFs with their identifier,
parameters and a short description of what the UDF does. You can also add,
delete or modify UDFs, or have a look at the PHP code of an already defined
function (there you can also launch small tests on the defined functions).
The definition of these functions should be reserved to administrators, and
some particularities have to be taken into account when defining UDFs:
· When you want to add or modify a UDF you are asked for the parameter list;
you have to enter the parameter names separated by commas. Example: you want
to define a new function for prefixing a given string with another, so you
need two parameters (one for the string which is going to be prefixed, let's
name it str, and another one for the prefix itself, let's name it prefix); you
should enter them in the parameter input box like this: prefix, str
· The order in which you specify the parameters when defining a function is
the order in which they have to be passed to the UDF from an EL statement.
· When defining the PHP code of a function, there are some important things to
consider:
o The result of a function has to be a string.
o The parameters are available inside the PHP code as variables with the
parameter name.
o The result of the function has to be defined by a PHP result clause giving
the resulting string.
o Make sure the PHP code is correct (BibFormat has no way to check the code
and won't tell you whether it is correct).
o There are some special variables available inside the PHP definition:
§ $FIRST_ITERATION → Is equal to "1" when we are in the first iteration of an
EL FORALL statement, "0" otherwise. If the call is made outside a FORALL it is
set to "1".
§ $LAST_ITERATION → The counterpart for the last iteration of an EL FORALL
statement.
With these two variables you can define special FORALL functions, such as a
function that prints a separator.
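The rules on parameter lists and string results can be modelled with a small registry. The sketch is written in Python rather than PHP and is not BibFormat's mechanism: `UDFRegistry` and the UDF name `add_prefix` are hypothetical; only the parameter list "prefix, str" comes from the example above.

```python
class UDFRegistry:
    """Toy model of UDF definition and invocation (illustration only)."""

    def __init__(self):
        self._udfs = {}

    def define(self, name, params, body):
        """`params` is the comma-separated list typed into the parameter box."""
        names = [p.strip() for p in params.split(",") if p.strip()]
        self._udfs[name] = (names, body)

    def call(self, name, *args):
        names, body = self._udfs[name]
        if len(args) != len(names):
            raise TypeError(f"{name} expects {len(names)} argument(s)")
        # Arguments are bound to parameter names in definition order.
        result = body(**dict(zip(names, args)))
        if not isinstance(result, str):
            raise TypeError("a UDF must return a string")
        return result


udfs = UDFRegistry()
# The parameter is called `str` as in the manual's example; it shadows the
# built-in name only inside the lambda.
udfs.define("add_prefix", "prefix, str", lambda prefix, str: prefix + str)
prefixed = udfs.call("add_prefix", "Dr. ", "Racah")
print(prefixed)
```

Because arguments are bound by position, swapping them in the call would swap the prefix and the string, which is why the definition order matters.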
9. - Defining links
As we've already said, BibFormat is not only a formatter: it also provides a
link manager. But what do we mean by "link manager"? The idea is to have a set
of rules that describe how to generate a link using certain data; if the link
can be generated from those rules, then the link manager can check different
things (e.g. whether the link is valid or, if it's a link to a file, whether
the file exists and in which formats) and finally return the solved link. In
other words, if you have a set of bibliographic records that can contain a
certain link and that link can be coded in the link manager rules, you don't
need to store each link in each bibliographic record; you just use the link
manager to generate them dynamically. This way you only have to maintain a
small set of rules and not thousands of static links in records.
BibFormat allows you to configure different link definitions, each of them
identified by a unique name; each link definition has some associated
parameters, which are the information passed to the rules defined for it.
Then, when you call the link manager to solve a link (from an EL statement,
for example) you have to specify the identifier of the link definition you
want to use and the value of each parameter used by that link definition
(always string values). The link manager retrieves the rules associated with
the specified link definition and interprets them using the given parameter
values, telling you whether the link was generated correctly and, if so, the
result (the solved link).
Through the web interface you can access the rule repository to see which link
definitions are available, define new link rules or maintain already defined
ones. When adding or modifying a link definition you have to specify the
parameters; please remember to separate them with commas.
Link definitions are structurally quite similar to behaviors: although there
can be different types of them (as we'll see later), a link definition is made
up of one or more conditions, and each of these conditions can have one or
more actions that tell how the link has to be built when its condition is
fulfilled. In general, link rules (conditions and actions) have a particular
structure and are described in Evaluation Language (EL) with one restriction:
the EL LINK statement cannot be used. Each condition-actions group of a link
definition can be of a different solving type (when you create a new link
definition you are asked for its solving type, but only because all conditions
created for that link definition get the selected solving type as their
default; you can change it afterwards and so obtain a "mixed" link
definition). The structure of the conditions and the way the link manager
interprets them depend on their solving type. Currently you can define link
conditions of two different solving types: EXTERNAL or INTERNAL. A more
detailed explanation of each type is given later.
As we've said, a link definition is made up of various link conditions. When
the solving of a concrete link definition is requested, the link manager
retrieves all link conditions associated with it. Then it takes the first of
them (following the evaluation order: the lower the evaluation order number,
the earlier the condition is considered) and evaluates its EL code with the
parameter values passed; if the result is "TRUE", the associated actions are
executed, the link is returned and the solving process finishes. If a
condition fails, the link manager moves on to the next one. If all the
conditions fail, the link manager reports that the link couldn't be solved.
This is the general behavior of the link manager, but the way of determining
whether a link has been solved and the way the link is built depend on the
condition's solving type.
9.1. - EXTERNAL link conditions
This is the simplest way of solving links. It's intended to be used when you
want to generate a link that points to an external resource (normally a web
page). In this case the link condition has only one action, which is evaluated
if the associated condition is "TRUE". When a condition of this type evaluates
to "TRUE" and the action is executed, the result of the action is returned as
the solved link and the link manager finishes.
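The solving process for EXTERNAL conditions can be sketched as below. This is an illustration under assumptions, not the link manager's code: Python callables stand in for the EL condition and action, and the function name, the parameter name `id` and the example.org URL are all hypothetical.

```python
def solve_external_link(conditions, params):
    """Return the solved link, or None when every condition fails.

    `conditions` is a list of (order, predicate, action) tuples; `params` maps
    the link definition's parameter names to string values.
    """
    for _order, predicate, action in sorted(conditions, key=lambda c: c[0]):
        if predicate(params):
            return action(params)  # the first true condition ends the process
    return None


# Hypothetical link definition with a single EXTERNAL condition.
abstract_link = [
    (10, lambda p: p["id"].startswith("hep-"),
     lambda p: "http://example.org/abs/" + p["id"]),
]

solved = solve_external_link(abstract_link, {"id": "hep-th/0001001"})
unsolved = solve_external_link(abstract_link, {"id": "other/1"})
print(solved, unsolved)
```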
9.2. - INTERNAL link conditions
This condition solving type is intended to be used when you want
to link to a document which is a file (inside or outside your file system) and
that can be in different file formats.
This case is a bit more complex than the previous one, so we'll
go step-by-step explaining differences and special features:
· An INTERNAL condition has a base file path and a base URL associated. The
base file path is the string used as a prefix when looking for a file
generated by the actions associated with that condition. The base URL, on the
other hand, is a string to which the link string (resulting from the actions)
is appended (e.g. if the base file path of a condition is /tmp/docs and the
base URL is http://doc.cern.ch/, and the condition is true and the result of
the actions is test.pdf, the file path the link manager has to check is
/tmp/docs/test.pdf and, if the file exists, the generated link is
http://doc.cern.ch/test.pdf).
· Any condition of this type can have several associated file formats. This is
a new concept that is only used for INTERNAL condition solving. A file format
is simply a set of file extensions grouped under an identifier. You can
associate file format identifiers with a link condition. When the condition is
true the link manager combines each result from the condition's actions with
the associated file formats to check the existence of a file in any format;
this means that when an action is evaluated, the link manager takes the file
extensions of each associated file format identifier and checks whether the
file base path + resulting action string + file extension exists in the file
system.
· A condition of this type can have more than one associated action. Each of
its actions describes an alternative way of building the file path. When a
condition of this type evaluates to "TRUE", the link manager retrieves its
actions (following the actions' apply order) and evaluates the first one; with
the action result it builds the file path in this way: file base path +
resulting action string, and then combines this string with each of the file
extensions. If any of the combinations exists in the file system, the link is
generated (if more than one file format combination exists, the link variable
has multiple values containing the different links); if not, the same process
starts again with the next action. If none of the actions leads to an existing
file, the link is not generated.
· When calling the link manager from an EL statement (see chapter Evaluation
Language Reference), if the link is solved we can access a special internal
variable that contains the resulting link as its value. For INTERNAL condition
links, we have said that this variable can contain multiple values when the
link manager finds different file formats. In this case there is a further
extension: some special variable fields hold additional values for each value
of the LINK variable, and you can access them when the link is solved. Here's
a table detailing the LINK variable fields that are defined when an INTERNAL
condition link is solved:
Field name | Value that contains
url | The same value as the LINK variable: the solved URL.
file | Contains the local full path to the file the solved URL points to.
format_id | Contains the file format id string.
format_desc | Contains the file format description string (this is defined for each file format).
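The INTERNAL solving steps and the resulting LINK fields can be sketched as follows. This is a simplified illustration, not the link manager's implementation: the function name, the format table and the example.org URL are hypothetical, the action results are passed in as ready-made strings, and the demonstration creates its file in a temporary directory.

```python
import os
import tempfile


def solve_internal_link(base_path, base_url, formats, action_results):
    """Return one {url, file, format_id, format_desc} dict per existing file.

    `formats` maps a format id to (description, [extensions]); `action_results`
    holds the strings produced by the condition's actions, in apply order.
    """
    for result in action_results:
        links = []
        for format_id, (desc, extensions) in formats.items():
            for ext in extensions:
                path = os.path.join(base_path, result + ext)
                if os.path.isfile(path):
                    links.append({
                        "url": base_url.rstrip("/") + "/" + result + ext,
                        "file": path,
                        "format_id": format_id,
                        "format_desc": desc,
                    })
        if links:
            return links  # the first action that hits a file wins
    return []  # no action led to an existing file: the link is not solved


# Demonstration in a temporary directory (all names below are made up).
with tempfile.TemporaryDirectory() as docs:
    open(os.path.join(docs, "test.pdf"), "w").close()
    formats = {"PDF": ("Portable Document Format", [".pdf"]),
               "PS": ("PostScript", [".ps", ".ps.gz"])}
    links = solve_internal_link(docs, "http://example.org/", formats,
                                ["missing", "test"])
    urls = [link["url"] for link in links]
print(urls)
```

The first action result ("missing") matches no file, so the second one is tried; only the PDF file exists, so the link variable gets a single value.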
9.3. - Example
As link generation is quite a complex topic (especially when talking about
INTERNAL linking) we'll try to illustrate it with a simple example.
Let's imagine we want to create a new link definition for generating full-text
access to the documents that are archived on a document server (a file system
which contains electronic versions of documents). These documents are
organized systematically depending on three characteristics that are included
in the bibliographic records: BASE, CATEGORY and ID. When the base is
"CERNREP", the files are archived below directory /pub/www/home/cernrep/ and
can be stored following two different criteria that depend on the CATEGORY and
ID values; the documents are all HTML. However, if the base is "PREPRINT" and
the CATEGORY is either "HEP-TH" or "HEP-PH", they are stored under directory
/archive/electronic|/pub/www/home/ following certain criteria; in this case
the documents can be in several file formats: PDF, Postscript, MS Word.
Of course, we want the link to be created only if the files corresponding to
the bibliographic records exist.
So we start by creating a new link definition that we'll call FULLTEXT. It
receives three parameters, which are the information we need for generating
this kind of link: BASE, CATEGORY and ID. We select INTERNAL as the default
solving type and then fill in the base file path and URL with some default
values (these values are not important; they are copied by default to the
conditions we are going to create afterwards).
Then we create a condition for the first possibility: when BASE
is "CERNREP". We select INTERNAL as link condition because we want to link to a
file and we want to check its existence and we fill in the base file path and
URL with the corresponding values. Then we assign the file format types and we
enter the file archiving criteria as different actions.
For the other possibility we proceed in the same way by adapting
the definition to the requirements; we'll have something like this as result:
Once we have finished the link definition, we can insert links
of this type from a BibFormat behavior, for example. Let's imagine we have
included a piece of EL code like this in a behavior because we want to
insert a link to the full-text documents of any record:
link("FULLTEXT", $base, $category, $id)
{
"Fulltext: "
forall($link){
"<a href=\"" $link.url "\">"
$link.format_desc "</a>"
separator(" - ")
}
}
This EL statement outputs the string "Fulltext: " followed by links to all the
documents found for the values of the internal variables $base, $category and
$id, separated by " - ".
10. - User management
The BibFormat web interface (WI) comes with a security mechanism
which allows you to define which users can access the WI. BibFormat doesn't
have its own user management; instead it uses the CDS user schema (as it is a
part of CDS). So if you are not registered as a CDS user and you want to have
access to the BibFormat WI, the first thing to do is to register in CDS through the
standard procedure (for example, via the CDS Search interface you can access the
CDS account management system).
The BibFormat WI access policy is rather simple: it keeps a list of
CDS users that can access the WI. When someone tries to access any part of
the WI, the system asks the user to log in as a CDS user. If the CDS
login is successful and the user is in BibFormat's access list, then the user
gains access to the WI.
There's a section in the WI which allows you to define which CDS
users will have access to the WI. Its use is rather simple: you can add CDS
users to the access list by specifying either their CDS user id or their CDS
login, and you can delete a CDS user from the access list by simply selecting
the link "delete" for the corresponding user.
When you install BibFormat for the first time and you access
the WI, you'll see that no login or password is asked for. The security mechanism
isn't activated until at least one user is added to BibFormat's access
list. So if you don't want to limit access to the BibFormat WI, keep the access
list empty.
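The access rule described above can be summarised in a few lines. The following Python sketch is only an illustration: the function name may_access_wi and the representation of the access list as a set of CDS logins are assumptions, not BibFormat code.

```python
def may_access_wi(cds_login, access_list):
    """Return True if an authenticated CDS user may enter the WI.

    Assumption: access_list is a set of CDS logins; an empty set means
    the security mechanism has not been activated yet.
    """
    if not access_list:
        # No user in the access list: the WI is open to everyone.
        return True
    return cds_login in access_list
```

With an empty list every caller is admitted; as soon as one login is added, only listed users get in.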
11. - Evaluation Language Reference
In this section we present a more or less formal definition
of the Evaluation Language (EL). Although we use some formal
methods to describe it, we also give a quick explanation of the elements
that make up the language and of how to combine them to obtain the desired results.
Just below you can find the EL definition, expressed in
EBNF (Extended Backus-Naur Form) notation. We have used capital letters
to express non-terminal elements and non-capital/bold characters for the
terminal ones. One remark: whenever you find the mark [REX] after
a definition, it means that the definition just before it is a regular
expression describing a set of terminal strings.
SENTENCE ::= TERM {&& TERM | || TERM}
TERM ::= FACTOR {= FACTOR | != FACTOR | FACTOR}
FACTOR ::= [!] BASIC
BASIC ::= VARIABLE | LITERAL | FUNCTION | ( SENTENCE ) | FORALL | IF | FORMAT | LINK | COUNT | KB
VARIABLE ::= $ STRING [. STRING]
LITERAL ::= "([^"] | \")*" [REX]
FUNCTION ::= STRING ( [ SENTENCE {, SENTENCE} ] )
FORALL ::= forall ( VARIABLE [, LITERAL] ) { SENTENCE }
IF ::= if( SENTENCE ) { SENTENCE } [else { SENTENCE }]
FORMAT ::= format( SENTENCE )
LINK ::= link( SENTENCE , [SENTENCE {, SENTENCE}] ) { SENTENCE } [else { SENTENCE }]
COUNT ::= count( VARIABLE )
KB ::= kb( SENTENCE )
STRING ::= [a-zA-Z0-9_] [REX]
This is just a formal way of describing the language, but don't
worry if you don't understand it very well because just below these lines we'll
try to describe it in a more informal way.
To begin with, you should know that EL is a language designed to
work with strings (a string is a sequence of characters), but it also has some
logic and comparison operations. One important thing to be aware of is
that in EL blank spaces, tabs and carriage returns have no meaning other
than separating elements of the language; that means that between two basic
elements you can have as many spaces or carriage returns as you want.
One of the basic elements of the language is what we call LITERALS.
They represent constant string values and are delimited by a pair of
double quote (") symbols surrounding the string you want to express. Everything
you put inside the double quotes is taken as it is, so inside a
literal spaces and carriage returns do have meaning (it's the only case). If you
want to express a double quote symbol inside a literal, you have to escape
it using \.
Some examples of literals:
· If you want to represent the string hello,
inside the EL you'll have to use "hello".
· For the string hello "big" man, the representation in EL is "hello \"big\" man"
(notice the escape characters and that spaces have meaning).
· The string Let's see \"" has to be expressed in this
way: "Let's see \\\"\"".
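When EL code is generated by a program, the escaping shown in these examples can be automated. The following Python sketch is a hypothetical helper (el_literal is not part of BibFormat); it assumes, based on the third example, that a backslash is written as \\ and a double quote as \".

```python
def el_literal(text):
    """Wrap text in double quotes, escaping backslashes and double quotes.

    Assumption: backslashes are doubled and double quotes are prefixed
    with a backslash, as the examples in the text suggest.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'
```

Backslashes are escaped first so that the backslashes added for the double quotes are not doubled again.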
Another important basic element of the language is VARIABLES.
They represent string data from the input to which you can refer
inside the language (a variable is also considered a string). Variables are defined
in advance by the administrator (or even by users), so you have to know which of
them you have access to. Additionally, variables can contain FIELDS, which
are simply other input values that are grouped under a variable because they
are related to each other (for example, you could have a
variable for the information about the author, with fields like name, birthplace,
etc.). If you want to know more about variables and their correspondence
with the input, you can look at the Mapping the Input section. A variable
is expressed in EL by a dollar symbol followed by a name made up of letters,
digits or underscores; variables are case-insensitive. To refer to a field of
a variable, you simply add a dot followed by the field name (which is also made
up of letters, digits or underscores).
Some examples of variables and fields:
· Imagine you have a variable called author which contains the author
information. To represent it in EL you would
have to write $author. In every place
where $author appears, BibFormat will use the value defined for it from the
current input record.
· If you know that the field name of the variable author
contains the author's full name and you want to refer to it inside an EL
statement, you'd write $author.name.
· In the CDS configuration, variable and field
names correspond to MARC 21 tag & indicator names; so to refer to the main
title of a bibliographic record we should use variable 245 field b,
in EL terms: $245.b.
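The variable syntax can be checked mechanically. The Python sketch below is a hypothetical illustration (parse_variable is not a BibFormat function) that follows the grammar rule VARIABLE ::= $ STRING [. STRING]; lower-casing the field name as well as the variable name is an assumption.

```python
import re

# "$" followed by a name and an optional ".field"; names are made up of
# letters, digits and underscores.
VARIABLE_RE = re.compile(r"^\$([A-Za-z0-9_]+)(?:\.([A-Za-z0-9_]+))?$")


def parse_variable(reference):
    """Split an EL variable reference into (variable, field or None)."""
    match = VARIABLE_RE.match(reference)
    if match is None:
        raise ValueError("not an EL variable reference: %r" % reference)
    name, field = match.groups()
    # Variables are case-insensitive, so the name is normalised.
    # Assumption: field names are treated the same way.
    return (name.lower(), field.lower() if field else None)
```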
Now that we know the basic elements of the language, we can start
thinking about how to combine them. The most important (and only) string
operation is concatenation: joining strings. This operation is implicit in the
language, so we just write language elements one after another, and the
result is the values of those elements one after another.
Some samples:
· To represent the constant string Author:
followed by the name of the author of the input record, you should write "Author: " $100.a (this assumes the CDS
configuration, in which MARC 21 notation is used and authors correspond to variable
100 field a).
· You want to output the title in bold (in HTML
terms), followed by the author in normal characters, separated from the title by the character
/: "<b>" $245.b "</b>/" $100.a
These two, literals and variables, are only the basic elements of
the EL. You can combine them using concatenation to get new strings. But, of
course, there are more operations you can apply to strings: UDFs (User
Defined Functions). We'll also refer to these elements simply as functions, because
that is what they are: functions or operations to be applied to strings; by
strings we mean both basic elements and the strings resulting from other
operations. A UDF has a name that identifies it uniquely and takes some information
that we call parameters. A UDF returns another string as its result, depending
on the parameter values (always strings). So to represent a function in EL you
need its name followed by an opening parenthesis, the parameter values separated
by commas and a closing parenthesis. There's a list of UDFs you can look at
through the interface, and this list can be extended to fit your needs (see the UDFs
section of this manual).
Some examples:
· You want to ensure that the title of a bibliographic
record is always in capital letters. There's a function
called upper that takes one parameter and returns that parameter
converted to capital letters. You have to write the call like this: upper($245.b).
· You want only the first 3 characters of an author name to
appear, in capital letters. We've seen there's a function for uppercasing a
string, but there's another one, called copy, that extracts a substring from
the string passed as the first parameter, starting at the character position indicated by the 2nd
parameter and with the length given by the 3rd one: copy(upper($100.a), "0", "3").
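To make the behaviour of these two calls concrete, here is a minimal Python emulation. It is a sketch under stated assumptions, not the BibFormat implementation: positions are taken to be zero-based (the example passes "0" for the first character), and all parameters arrive as strings, as the text says.

```python
def upper(value):
    """Emulates the upper UDF: the parameter converted to capital letters."""
    return value.upper()


def copy(value, start, length):
    """Emulates the copy UDF: a substring of value.

    Assumption: start is a zero-based position. EL parameters are always
    strings, so the numbers are converted here.
    """
    begin = int(start)
    return value[begin:begin + int(length)]


# Equivalent of copy(upper($100.a), "0", "3") for a sample author value.
sample_author = "Ellis, J"
first_three = copy(upper(sample_author), "0", "3")
```

Under these assumptions the nested call returns only the three-character substring; the rest of the name is not part of the result.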
As you can see, these UDFs are very powerful because you can
concatenate their result with another element (literal, variable or even
function) and the parameters can be basic elements or expressions. More
generally, any element or expression of the EL that yields
a string value can be combined with other EL expressions or elements.
Another very useful feature of EL is the possibility to use KNOWLEDGE
BASES (KBs). A KB is just a set of key values that map (one-to-one) to another
set of values; maybe knowledge base isn't a very appropriate name, because
they are more like translation tables. BibFormat offers tools to create and
maintain KBs that can be used in the EL afterwards (see chapter KBs
management in this manual). You can see a KB invocation as a special function
(the syntax for calling it is the same) with the name kb that takes two
parameters: one indicating the KB name (BibFormat can handle several KBs)
and another one for the key value to translate. The result is the mapped KB
value, or an empty string if the key doesn't exist in the specified
KB. A typical example is when you have months as numbers and you want to
translate them into month names; you could have a KB that maps all the month
numbers to month names and then call it like this: kb("MONTH", $m).
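Since a KB behaves like a translation table, its lookup rule can be sketched with plain dictionaries. The Python code below is illustrative only: the KBS registry and the sample MONTH entries are hypothetical, and returning an empty string for an unknown KB name is an assumption (the text only specifies the result for a missing key).

```python
# Hypothetical sample data: a truncated MONTH knowledge base.
KBS = {
    "MONTH": {"1": "January", "2": "February", "3": "March"},
}


def kb(kb_name, key):
    """Return the value mapped to key in the named KB, or "" if absent."""
    return KBS.get(kb_name, {}).get(key, "")
```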
Now let's move to FORMATS. A format is a piece of EL
code which is grouped under a label (a name) and that can be used in any other
EL statement. BibFormat allows the user to define as many formats as needed
and to identify each of them with a simple name. In a few words, formats allow you
to reuse EL code; within a format you can put any EL code (even other format
calls) and all the variable values are completely available. Again, a format call in EL follows the same
convention as functions: the word format followed by the format name (a
string) between parentheses. Calling a format is as if the EL code
defined inside that format were pasted, as it is, in the place where you make the call.
Example: Imagine you have to write the title of bibliographic
records with a certain format, let's say in bold and red, and you are going to
use this formatted title in several places. You can take advantage of EL
formats and define a format called TITLE that contains the code "<font color=\"red\"><b>" $245.b
"</b></font>". Once this is done, you could use it to format
records by printing their title in that way and their author after it: format("TITLE") "/" $100.a. The good thing
is that if some day you decide to change the title formatting, you only need
to modify the TITLE format definition and not all the places where you
show the title.
At this point, you have seen the basic elements and operations of
EL. You may think that this is powerful enough to express your formatting work, but
there are more complex situations that you'll have to face. We have tried to
design the EL to be easy enough, but with the advanced structures described next it
can sometimes become a bit complex.
The basic elements and operations are fine as far as they go, but
sometimes you want to compare expressions and decide what to do
depending on the result of the comparison. For this purpose, EL has an IF
statement and a few comparison and logic operators built in (don't forget that
any functionality needed can be achieved by defining new UDFs; EL only provides the basic
operations that make this possible). Let's go step by step. First let's
talk about the set of operators that can be used in a comparison:
1. Comparison operators: equal and not equal (=, !=). They take
two operands that have to be strings and produce a logic (true or false)
value.
2. Logical operators: AND, OR and NOT (&&, ||, !). All of
them have to be applied to logical values; AND and OR take two operands, and
NOT takes one.
All of them are right-associative (except NOT, which is unary
left-associative) and their precedence goes like this (highest to lowest): NOT,
(EQUAL, NON-EQUAL), (AND, OR). These operators cannot be used just anywhere, only
inside statements that expect a logic value as result, in other words, inside
condition statements.
The IF structure is quite easy to learn: first we write the
word IF followed by a condition statement surrounded by parentheses;
then an EL statement in braces is specified, which will be
executed only if the condition is true; optionally, we can add the word ELSE
followed by another EL statement in braces, which will only be executed
if the IF condition is not true.
Let's have a look at an example:
· I want the title of a record to appear, followed by the
constant Author: and then its author. But it would be nice if the
constant string appeared only if the record has an author:
format("TITLE") if($100.a!="") { "Author: " $100.a }
BibFormat is not only an EL processor. Among other things, it contains
a link solver that has its own rule repository in order to be able to
solve links automatically (see chapter Link solver of this manual). EL has one
special structure for asking the link solver for some links and including them
in the formatted version of the bibliographic record. This way links are easy
to maintain (you modify the rules independently from where the link is being
used) and as reusable as formats or UDFs. Links are identified by a label and
need some information to be passed as parameters; then an EL statement has to
be specified, which takes effect only if the link is solved and inside
which you have access to a special variable, named LINK, which
contains the solved link among other information (see chapter Link solver for
more information about which values are accessible); additionally, an else
statement can be added (following the same syntax as in the IF construction)
that takes effect only if the link can't be solved by the Link solver.
Example:
· We continue with our typical example of the simple format
that contains the title and the author, but now we want the author to be linked
to the search. Supposing that this kind of link is already defined under the
label "AUTHOR_SEARCH", we should proceed like this:
format("TITLE") "/"
link("AUTHOR_SEARCH", $100.a)
{ "<a href=\""$link "\">"$100.a"</a>"}
The next step when talking about EL components is to deal with
multiple values. Life is not so easy and, of course, a bibliographic record
can have more than one author or can have a related document which is in more
than one format and that has to be linked. In other words, BibFormat supports
variables and fields with multiple values (see chapter Mapping input);
consequently, a way of applying an EL statement to all the values of a
variable or a field is quite useful. FORALL is the construction for this!
It allows you to specify a variable or a field followed by an EL statement
(between braces) that will be applied to every value of the variable or the
field; any reference to the iteration variable inside the FORALL EL
statement refers to the value of the current iteration (if you refer
to a variable that has multiple values outside a FORALL, the first value
is used). One limitation is that you shouldn't nest FORALL
statements; in other words, never put a FORALL inside another one. This
construction also lets you limit the number of times you want to iterate over
a variable or field by adding a literal with the number of iterations.
Some examples:
· Let's continue refining our simple format; now we have
to consider that there can be more than one author for one bibliographic
record, so we want to show all of them, with the link included, of course.
format("TITLE") "/"
forall($100.a)
{
link("AUTHOR_SEARCH", $100.a)
{ "<a href=\""$link "\">"$100.a"</a>"}
}
· Although this FORALL construction might not seem
very useful, it's used a lot when defining formats or behaviors. Quite often
you will want some piece of EL code to take
effect only if a certain variable or field exists; FORALL can also be used
in that situation, and it is the most convenient way of
doing it. Imagine you want the title and then the constant string "Author: "
followed by the authors of a bibliographic record, but you don't want the
constant "Author: " to appear if there's no author at all. You could use
something like this:
format("TITLE") " - "
forall($100.a)
{
rep_prefix("Author: ") $100.a " "
}
As you can see, we are
using a new function: rep_prefix. This is a UDF which prints
the string passed as parameter only once, at the beginning, inside a FORALL
statement. But the interesting thing here is the use of FORALL.
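The iteration semantics described above (apply a statement to every value, optionally limit the number of iterations, and print a prefix only on the first pass) can be emulated in a few lines of Python. This is a sketch under stated assumptions, not BibFormat code: passing an explicit is_first flag to the body is a simplification introduced here, and the author names in the usage note are made up.

```python
def forall(values, body, limit=None):
    """Apply body(value, is_first) to each value and concatenate the results.

    limit mirrors the optional literal with the number of iterations;
    it arrives as a string because EL literals are strings.
    """
    selected = values if limit is None else values[:int(limit)]
    return "".join(body(value, index == 0)
                   for index, value in enumerate(selected))


def rep_prefix(prefix, is_first):
    """Emulates rep_prefix: the prefix appears only in the first iteration."""
    return prefix if is_first else ""


def authors_line(authors):
    """Equivalent of forall($100.a) { rep_prefix("Author: ") $100.a " " }."""
    return forall(authors,
                  lambda author, first: rep_prefix("Author: ", first) + author + " ")
```

For a record without authors, authors_line([]) is an empty string, so the constant "Author: " never appears; for ["Ellis, J", "Smith, A"] the prefix is printed once, before the first name.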
Finally, there's still one special EL function: COUNT.
Because of certain special situations or strange input data in the variables,
it is sometimes useful to know how many values a variable or a field contains.
This function simply takes a variable or field as argument and returns a
string with the number of values it contains; if the value returned is 0,
no value is in the variable, which means that the variable doesn't
exist or that no values were mapped from the input.
Examples:
· As this is the last example, let's make it a bit more
complicated. Continuing with our well-known simple format, we want all the
authors of the record to appear if there are no more than 10; otherwise we
want only the first one to appear, followed by the string "et al.". We'll also
use a function called GT which returns a non-empty string if the first
parameter is greater than the second one.
format("TITLE") "/"
if(gt(count($100.a),
"10")!="")
{ $100.a " et al." }
else
{
forall($100.a)
{
link("AUTHOR_SEARCH", $100.a)
{ "<a href=\""$link "\">"$100.a"</a>"}
}
}
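The decision logic of this last example can be traced with a small Python emulation. It is only a sketch: the value "1" returned by gt is an assumption (the text only says "a non-empty string"), the link markup is left out, and a space is placed before "et al." to keep it apart from the author name.

```python
def count(values):
    """Emulates COUNT: the number of values, returned as a string."""
    return str(len(values))


def gt(first, second):
    """Emulates GT: a non-empty string if first > second, else ""."""
    return "1" if int(first) > int(second) else ""


def format_authors(authors):
    """All authors if there are at most 10, else the first one and et al."""
    if gt(count(authors), "10") != "":
        return authors[0] + " et al."
    return "".join(authors)
```

With exactly 10 authors gt returns an empty string, so all of them are printed; the "et al." branch is taken from 11 authors onwards.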
diff --git a/modules/bibharvest/doc/admin/guide.html.wml b/modules/bibharvest/doc/admin/guide.html.wml
index 2ee12d859..3fcbb17f5 100644
--- a/modules/bibharvest/doc/admin/guide.html.wml
+++ b/modules/bibharvest/doc/admin/guide.html.wml
@@ -1,84 +1,86 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "cdspage.wml" \
title="BibHarvest Admin Guide" \
navtrail_previous_links="/admin/> > /admin/bibharvest/>BibHarvest Admin" \
navbar_name="admin" \
navbar_select="bibharvest-admin-guide"
This Admin Guide is not yet completed. Moreover, some
admin-level functionality for this module exists only in the form of
manual recipes. We are in the process of developing both the
guide as well as the web admin interface. If you are interested
in seeing some specific things implemented with high priority,
please contact us at . Thanks for your interest!
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
+
Contents
1. Overview
2. OAI Data Harvesting
2.1 BibHarvest command-line tool
2.2 Periodical harvesting
3. OAI Data Providing
1. Overview
FIXME.
2. OAI Data Harvesting
2.1. BibHarvest command-line tool
To harvest records from an OAI compliant repository, run the bibharvest command-line tool. For example:
$ bibharvest -vListRecords -f2004-04-01 -u2004-04-02 -pmarcxml -o/tmp/z.xml \\
http://cdsweb.cern.ch/oai2d.py
For further help with the command-line harvesting tool, run bibharvest --help.
2.2. Periodical harvesting
It is not currently possible to set up periodical execution of bibharvest. You would have to set up an external cron job script to do that.
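As an illustration of what such an external script might do, the following Python sketch builds the bibharvest command line for one day's worth of records. It is an assumption-laden example rather than part of CDSware: the helper name build_harvest_command is hypothetical, and only the options shown in the example of section 2.1 are used.

```python
import datetime


def build_harvest_command(day, output_path, base_url):
    """Build the bibharvest argument list for records of a single day."""
    next_day = day + datetime.timedelta(days=1)
    return [
        "bibharvest",
        "-vListRecords",
        "-f" + day.isoformat(),       # harvest from this date ...
        "-u" + next_day.isoformat(),  # ... until the following day
        "-pmarcxml",
        "-o" + output_path,
        base_url,
    ]
```

A script run daily by cron could pass the resulting list to subprocess.call(), for example with datetime.date.today() minus one day as the day argument.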
3. OAI Data Providing
FIXME. (See config.wml for OAI tags.)
diff --git a/modules/bibindex/doc/admin/guide.html.wml b/modules/bibindex/doc/admin/guide.html.wml
index 372a07a6d..7ca5bd249 100644
--- a/modules/bibindex/doc/admin/guide.html.wml
+++ b/modules/bibindex/doc/admin/guide.html.wml
@@ -1,71 +1,73 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "cdspage.wml" \
title="BibIndex Admin Guide" \
navtrail_previous_links="/admin/> > /admin/bibindex/>BibIndex Admin" \
navbar_name="admin" \
navbar_select="bibindex-admin-guide"
BibIndex Admin Guide is not yet completed. Most of the admin-level
functionality for BibIndex exists only in command-line mode. We
are in the process of developing both the guide as well as the
web admin interface. If you are interested in seeing some
specific things implemented with high priority, please contact us
at . Thanks for your interest!
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
+
Contents
1. Overview
2. Configure Metadata Tags and Fields
2.1 Configure Physical MARC Tags
2.2 Configure Logical Fields
3. Configure Word/Phrase Indexes
3.1 Define New Index
3.2 Configure Word-Breaking Procedure
3.3 Configure Stopwords List
3.4 Configure Accent Stripping
4. Run BibIndex Daemon
1. Overview
2. Configure Metadata Tags and Fields
2.1 Configure Physical MARC Tags
2.2 Configure Logical Fields
3. Configure Word/Phrase Indexes
3.1 Define New Index
3.2 Configure Word-Breaking Procedure
3.3 Configure Stopwords List
3.4 Configure Accent Stripping
4. Run BibIndex Daemon
diff --git a/modules/bibrank/doc/admin/guide.html.wml b/modules/bibrank/doc/admin/guide.html.wml
index e5930a588..e7ff277aa 100644
--- a/modules/bibrank/doc/admin/guide.html.wml
+++ b/modules/bibrank/doc/admin/guide.html.wml
@@ -1,377 +1,379 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "cdspage.wml" \
title="BibRank Admin Guide" \
navtrail_previous_links="/admin/> > /admin/bibrank/>BibRank Admin" \
navbar_name="admin" \
navbar_select="bibrank-admin-guide"
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
+
Contents
1. Overview
2. Configuration Conventions
3. BibRank Admin Interface
3.1 Main interface
3.2 Add rank method
3.3 Show details of rank method
3.4 Modify rank method
3.5 Delete rank method
3.6 Modify translations
3.7 Modify visibility toward collections
4. BibRank Daemon
4.1 Command Line Interface
4.2 Using BibRank
5. bibrankgkb Tool
5.1 Command Line Interface
5.2 Using bibrankgkb
6. Additional Information
1. Overview
The bibrank module currently consists of two tools:
bibrank - Generates star categories for ranking search results based on methods like:
Journal Impact Factor
##Number of downloads
##Author Impact
##Citation Impact
bibrankgkb - Generates knowledge base files for use with bibrank
You may not need bibrankgkb; it depends on which ranking methods you are planning
to use and what data you already have. This guide takes you through the necessary steps in detail in order to create different kinds of ranking methods for the search engine to use.
2. Configuration Conventions
- comment line starts with '#' sign in the first column
- each section in a configuration file is declared inside '[' ']' signs
- values in knowledge base files are separated by '---'
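These conventions are simple enough to parse by hand. The Python sketch below reads knowledge base lines of the form key---value; it is illustrative only. The function name parse_kb_lines is hypothetical, and applying the '#' comment rule to knowledge base files (and not only to configuration files) is an assumption.

```python
def parse_kb_lines(lines):
    """Parse 'key---value' lines into a dictionary.

    Assumption: blank lines and lines starting with '#' in the first
    column are skipped, following the comment convention stated above.
    """
    entries = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        key, separator, value = line.partition("---")
        if separator:
            # Only lines that actually contain the separator are kept.
            entries[key.strip()] = value.strip()
    return entries
```

For example, a line such as "Some Journal---1.5" (a made-up entry) would yield the key "Some Journal" with the value "1.5".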
3. BibRank Admin Interface
The bibrank web interface enables you to modify the configuration of most aspects of BibRank. For full functionality, it is advised to
let the http-daemon have read/write access to your cdsware/etc/bibrank directory. If this is not wanted, you have to edit the configuration files from the console using your favourite text editor.
3.1 Main interface
In the main interface screen, you see a list of all rank methods currently added. If you have added the 'long name' translation in the currently chosen language for a rank method, you will see this name; if not, and a translation in the default cdsware language exists, that one will be used instead. If no translation exists, the bibrank code will be used. To find out about the functionality available, check out the topics below.
Explanation of concepts
Rank method:
A method responsible for creating the necessary data to rank a result.
Translations:
Each rank method may have many names in many languages.
Collections:
Which collections the rank method should be visible in.
3.2 Add rank method
When pressing the link in the upper right corner of the main interface, you will see the interface for adding a new rank method. The two options that need to be decided upon are the bibrank code and the template to use; both values can be changed later. The bibrank code is used by the bibrank daemon to run the method, and should be fairly short, without spaces. The template you choose decides how the ranking will be done, and must be changed to suit your cdsware configuration before it is used. When you confirm the addition of a new rank method, it will be added to the list of possible rank methods, and a configuration file will be created if the httpd user has proper rights to the 'cdsware/etc/bibrank' directory. If not, the file has to be created manually with the name 'bibrankcode.cfg', where bibrankcode is the same as given in the interface.
3.3 Show details of rank method
This interface gives you an overview of the current status of the rank method, and gives direct access to the various interfaces for changing the configuration.
In the overview section, you see the bibrank code, for use with the bibrank daemon, and the date for the last run of the rank method.
In the rank set section you see how many records there are in each star category, and the threshold value deciding the range of each category. In the collection part, the collections in which the rank method is visible are shown. The translations part shows the various translations in the languages available in cdsware. At the bottom, the configuration file is shown, if accessible.
3.4 Modify rank method
This interface lets you modify the bibrank code given when creating the rank method and the configuration file of the rank method, if the file can be accessed. If it cannot, it may not exist, or the httpd user may not have enough rights to read the file. At the bottom of the interface, it is possible to choose a template, view it, and copy it over the old rank method configuration if wanted. Remember that the values present in the template are an example, and must be changed where necessary. See this documentation for information about this, and the 'BibRank Internals' link below for additional information.
3.5 Delete rank method
If it is necessary to delete a rank method, some precautions must be taken since the configuration of the method will be lost. When deleting a rank method, the configuration file will also be deleted ('cdsware/etc/bibrank/bibrankcode.cfg' where bibrankcode is the code of the rank method) if accessible to the httpd user. If not, the file can be deleted manually from console. Any bibrank tasks scheduled to run the deleted rank method must be modified or deleted manually.
3.6 Modify translations
If you want to use internationalization of the rank method names, you have to add them using the 'Modify translations' interface. The interface shows a list of the various name types like 'long name' and 'short name', with 'long name' initially selected. Below it, a list of all the languages used in the cdsware installation is shown, with the possibility to add the translation for each language.
3.7 Modify visibility toward collections
If a rank method should be visible to the users of the cdsware search interface, it must be enabled for one or several collections. A rank method can be visible in the search interface of the whole site, or of just one collection. The collections in the upper listbox do not show the rank method in the search interface to the user. To change this, select the wanted collection and press 'Enable' to enable the rank method for this collection. The collections that the method has been activated for are shown in the lower listbox. To remove a collection, select it and press the 'Disable' button to remove it from the list of collections for which the rank method is enabled.
4. BibRank Daemon
The bibrank daemon reads the necessary metadata from the cdsware database and combines it
in different ways to output the records ranked into the given number of categories (stars).
4.1 Command Line Interface
Usage: %s [options]
Examples:
%s --id=0-30000,30001-860000 --run=jif --verbose=9
%s --modified='2002-10-27 13:57:26' --run=jif
%s --rebalance --collection=Articles --run=jif
Ranking options:
 -c, --collection=c1,c2  Collections to include in this rank method
                         if not given, the collections the method is
                         enabled for will be used.
 -i, --id=idr1,idr2      Record ranges to include in this rank method
 -m, --modified=[from]   Update records modified after date
 -k, --check=value       Check if the rank method needs rebalancing, (if the top
                         star is higher than given percentage 0-1.0)
 -S, --stat              Show statistics
 -w, --run=rm1,rm2       Runs each rank method in the order given
 -r, --rebalance         Rebalance, do full update
Scheduling options:
 -u, --user=USER         user name to store task, password needed
 -s, --sleeptime=SLEEP   time after which to repeat tasks (no)
                         e.g.: 1s, 30m, 24h, 7d
 -t, --time=TIME         moment for the task to be active (now)
                         e.g.: +15s, 5m, 3h , 2002-10-27 13:57:26
General options:
 -h, --help              print this help and exit
 -V, --version           print version and exit
 -v, --verbose=LEVEL     verbose level (from 0 to 9, default 1)
4.2 Using BibRank
Step 1 - Adding the rank option to the search interface
To be able to add the needed ranking data to the database, you first have to add the rank method to the database, and
add the abbreviation you want to use together with it. The configuration file described in the next section needs to
have the same name as the abbreviation stored in the database.
Step 2 - Get necessary external data (ex. jif values)
Check out bibrankgkb documentation below.
Example
jif.kb -- sample data with the name of the journals and jif values.
Step 3 - Create the configuration file
The configuration files for the different rank methods have different options, so verify that you are using the correct
configuration file and rank method.
Example
jif.cfg -- sample configuration file, for creating the ranking stars based on journal impact factor
Single_tag_rank_method:
[rank_method]
##The function which is responsible for doing the work, must be one of the listed ones above.
function = single_tag_rank_method
##How big the top star category should be of all available records. Remember that if a lot of records
##have the same rank value, the size may go above this limit
top_star_percentage = 0.10
##The importance of this rank method if several methods are merged into one rank method.
overall_importance = 1.0
##This section must be available if the single_tag_rank_method is going to be used
[single_tag_kb]
##The tag whose value is looked up on the left side in the kb file (like the journal name)
tag = 909C4p
##The path to the kb file which has the content of the tag above on the left side, and the value on the right side
kb_src = /log/cdsware-DEMODEV/etc/bibrank/jif.kb
##Tags that must be included for a record to be added to a star category, to disable remove tags
check_mandatory_tags = 909C4c,909C4v,909C4y
##For single_tag_rank_method, this needs to be 'yes', depends on the rank method, what it needs of data
enable_modified = yes
For other functions than the single_tag_rank_method, you may need different configuration files, which will be added here when
supported by CDSware.
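As a rough illustration of how the pieces of this configuration fit together, the following Python sketch reads a jif.cfg-style file with the standard configparser module and loads a knowledge base whose lines have the form name---value. The helper names (read_rank_config, load_kb) and the inline sample data are assumptions made for this example only; this is not the BibRank implementation.

```python
import configparser

# Sample configuration mirroring the jif.cfg example above (kb_src shortened).
SAMPLE_CFG = """
[rank_method]
function = single_tag_rank_method
top_star_percentage = 0.10
overall_importance = 1.0

[single_tag_kb]
tag = 909C4p
kb_src = jif.kb
check_mandatory_tags = 909C4c,909C4v,909C4y
enable_modified = yes
"""


def read_rank_config(text):
    """Parse the configuration text into a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    mandatory = parser.get("single_tag_kb", "check_mandatory_tags")
    return {
        "function": parser.get("rank_method", "function"),
        "top_star_percentage": parser.getfloat("rank_method", "top_star_percentage"),
        "overall_importance": parser.getfloat("rank_method", "overall_importance"),
        "tag": parser.get("single_tag_kb", "tag"),
        "kb_src": parser.get("single_tag_kb", "kb_src"),
        "mandatory_tags": [t for t in mandatory.split(",") if t],
        "enable_modified": parser.getboolean("single_tag_kb", "enable_modified"),
    }


def load_kb(lines):
    """Map the left side of each 'name---value' line to a float value."""
    kb = {}
    for line in lines:
        line = line.strip()
        if "---" not in line:
            continue  # skip blank or malformed lines
        name, _, value = line.rpartition("---")
        kb[name] = float(value)
    return kb
```

With the sample above, read_rank_config(SAMPLE_CFG)["tag"] gives the tag whose value is looked up on the left side of the kb file.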
Step 4 - Add the ranking method as a scheduled task
When the configuration is okay, you can add the bibrank daemon to the task scheduler using the scheduling options. The daemon can then update the rank method automatically, for example once each day.
Example
$ bibrank -wjif -r
Task #53 was successfully scheduled for execution.
Step 5 - Full update, rebalancing
For the first run of a new ranking method, a full update is needed (not default) to establish the ranges to be used for the categories.
A full update/rebalance can be run by using the --rebalance/-r option. Later on you may occasionally want to run the program with the rebalance option again,
to rebalance the categories. To check if it is necessary, run the bibrank daemon using the --check/-k option together with the maximum size allowed for the top star category; a message will then be given on screen if a rebalance is needed.
Example
$ bibrank 53
2004-03-09 14:28:47 --> Task #53 started.
2004-03-09 14:28:47 --> Running: Journal Impact Factor.
2004-03-09 14:28:47 --> Statistics: Journal Impact Factor , Top Star size: 10.0% , Overall Importance: 100.0%,
2004-03-09 14:28:47 --> 0 star(s): Range>= -9.9 7990
2004-03-09 14:28:47 --> 1 star(s): Range>= -1.0 1
2004-03-09 14:28:47 --> 2 star(s): Range>= 0.964 2
2004-03-09 14:28:47 --> 3 star(s): Range>= 2.047 0
2004-03-09 14:28:47 --> 4 star(s): Range>= 3.13 2
2004-03-09 14:28:47 --> 5 star(s): Range>= 4.213 6
2004-03-09 14:28:47 --> Total: 8001
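The log above can be read as a list of lower bounds, one per star category. The following Python sketch shows one way such ranges could be applied to a value, and how the size of the top category could be compared against a limit. The bounds and counts are copied from the example log; the function names and the exact rebalance criterion are assumptions for illustration, not the BibRank implementation.

```python
from bisect import bisect_right

# Lower bounds for 0..5 stars, copied from the example log above.
LOWER_BOUNDS = [-9.9, -1.0, 0.964, 2.047, 3.13, 4.213]

# Number of records per star category, copied from the example log above.
COUNTS = [7990, 1, 2, 0, 2, 6]


def star_for(value, lower_bounds=LOWER_BOUNDS):
    """Return the highest star category whose lower bound is <= value."""
    return max(bisect_right(lower_bounds, value) - 1, 0)


def needs_rebalance(counts, max_top_share):
    """Assumed criterion: the top category holds more than the allowed share."""
    total = sum(counts)
    return total > 0 and counts[-1] / total > max_top_share
```

For instance, a journal with the value 3.838 would fall into the 4 star category under these bounds, since 3.13 <= 3.838 < 4.213.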
Step 6 - Fast update of modified records
If you just want to update the latest additions or modified records, you may want to do a faster update by running the daemon without the rebalance option.
If you don't mention anything, the daemon will try to update the records modified after the last run. If you want to update records modified after a certain
time, you can do this with the '--modified=date' option.
5. bibrankgkb Tool
Before the bibrank daemon can be used, a knowledgebase file (kb) with the needed data in the correct format
needs to be created. This file can be created using the bibrankgkb tool which can read the data either from
the cdsware database, from several webpages using regular expressions, or from another file. In case one source
has another naming convention, bibrankgkb can convert between them using a conversion file.
5.1 Command Line Interface
Usage: bibrankgkb [options]
Examples:
bibrankgkb --input=bibrankgkb.cfg --output=test.kb
bibrankgkb -otest.cfg -v9
bibrankgkb
Generate options:
-i, --input=file input file, default from /etc/bibrank/bibrankgkb.cfg
-o, --output=file output file, will be placed in current folder
General options:
-h, --help print this help and exit
-V, --version print version and exit
-v, --verbose=LEVEL verbose level (from 0 to 9, default 1)
5.2 Using bibrankgkb
Step 1 - Find sources
Since some of the data used for ranking purposes is not freely available, it cannot be bundled with CDSware. To get hold of the necessary data,
you may find it useful to ask your library if they have a copy of the data that can be used (like the Journal Impact Factors from the Science Citation Index), or use google to search the web for any public source.
Step 2 - Create configuration file
The default configuration file is shown below.
##The main section
[bibrankgkb]
##The url to a webpage with the data to be read, does not need to have the same name as this one, but if there
##are several links, the url should end with _0->
url_0 = http://www.taelinke.land.ru/impact_A.html
url_1 = http://www.taelinke.land.ru/impact_B.html
url_2 = http://www.taelinke.land.ru/impact_C.html
url_3 = http://www.taelinke.land.ru/impact_DE.html
url_4 = http://www.taelinke.land.ru/impact_FH.html
url_5 = http://www.taelinke.land.ru/impact_I.html
url_6 = http://www.taelinke.land.ru/impact_J.html
url_7 = http://www.taelinke.land.ru/impact_KN.html
url_8 = http://www.taelinke.land.ru/impact_QQ.html
url_9 = http://www.taelinke.land.ru/impact_RZ.html
##The regular expression for the url mentioned should be given here
url_regexp =
##The various sources that can be read in, can either be a file, webpage or from the database
kb_1 = /home/trondaks/w/cdsware/modules/bibrank/etc/cern_jif.kb
kb_2 = /home/trondaks/w/cdsware/modules/bibrank/etc/cdsware_jif.kb
kb_2_filter = /home/trondaks/w/cdsware/modules/bibrank/etc/convert.kb
kb_3 = SELECT id_bibrec,value FROM bib93x,bibrec_bib93x WHERE tag='938__f' AND id_bibxxx=id
kb_4 = SELECT id_bibrec,value FROM bib21x,bibrec_bib21x WHERE tag='210__a' AND id_bibxxx=id
##This points to the url above (the common part of the url is 'url_' followed by a number
kb_5 = url_%s
##This is the part that will be read by the bibrankgkb tool to determine what to read.
##The first two parts (separated by ,,) give where to look for the conversion file (which converts
##the names between two formats), and the second part is the datasource. A conversion file is not
##needed, as shown in create_0. If the source is from a file, url or the database, it must be
##given with file,www or db. If several create lines exist, each will be read in turn, and added
##to a common kb file.
##So this means that:
##create_0: Load from file in variable kb_1 without conversion
##create_1: Load from file in variable kb_2 using conversion from file kb_2_filter
##create_2: Load from www using url in variable kb_5 and regular expression in url_regexp
##create_3: Load from database using sql statements in kb_4 and kb_5
create_0 = ,, ,,file,,%(kb_1)s
create_1 = file,,%(kb_2_filter)s,,file,,%(kb_2)s
#create_2 = ,, ,,www,,%(kb_5)s,,%(url_regexp)s
#create_3 = ,, ,,db,,%(kb_4)s,,%(kb_4)s
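To make the layout of the create lines more concrete, here is a minimal Python sketch that splits such a value on the ",," separator into the conversion part and the data source part. The function name parse_create_line and the returned dictionary keys are hypothetical, and the sketch assumes that an empty first field means "no conversion file"; the real tool may parse these lines differently.

```python
def parse_create_line(value):
    """Split a create_N value into its ',,'-separated parts.

    Assumed layout: conversion type, conversion source, data source type,
    data source, followed by optional extra fields (e.g. a regular expression).
    """
    parts = [part.strip() for part in value.split(",,")]
    if len(parts) < 4:
        raise ValueError("expected at least four ',,'-separated fields")
    conv_type, conv_src, source_type, source = parts[:4]
    return {
        # None when no conversion file is given, as in create_0
        "convert": (conv_type, conv_src) if conv_type else None,
        "source_type": source_type,  # file, www or db
        "source": source,
        "extra": parts[4:],
    }
```

Applied to a value such as `file,,conv.kb,,file,,data.kb`, the sketch reports a conversion file conv.kb and a file source data.kb.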
When you have found a source for the data and created the configuration file, it may be necessary to
create a conversion file, but this depends on the naming conventions used in the available data versus
the conventions used in your cdsware installation.
The available data may look like this:
COLLOID SURFACE A---1.98
But in cdsware you are using:
Colloids Surf., A---1.98
By using a conversion file like:
COLLOID SURFACE A---Colloids Surf., A
You can convert the source to the correct naming convention.
Colloids Surf., A---1.98
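The renaming shown above can be sketched in a few lines of Python. The helper names are hypothetical, and passing unmapped names through unchanged is an assumption of this sketch rather than documented behaviour of the tool.

```python
def split_kb_line(line):
    """Split a 'left---right' line at the last '---' separator."""
    left, sep, right = line.strip().rpartition("---")
    if not sep:
        raise ValueError("missing '---' separator: %r" % line)
    return left, right


def convert_names(data_lines, conversion_lines):
    """Rename the left side of each data line using the conversion lines."""
    mapping = dict(split_kb_line(l) for l in conversion_lines if l.strip())
    converted = []
    for line in data_lines:
        if not line.strip():
            continue
        name, value = split_kb_line(line)
        # Names without an entry in the conversion file are kept as they are.
        converted.append("%s---%s" % (mapping.get(name, name), value))
    return converted
```

Calling convert_names with the data line and the conversion line from the example above yields the line in the cdsware naming convention.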
Step 3 - Run tool
When ready to run the tool, you may either use the default file (/etc/bibrank/bibrankgkb.cfg), or use another one by passing it with the '--input' option.
If you want to test the configuration, you can use '--verbose=9' to output on screen, or if you want to save it to a file, use
'--output=filename', but remember that the file will be saved in the program directory.
The output may look like this:
$ ./bibrankgkb -v9
2004-03-11 17:30:17 --> Running: Generate Knowledgebase.
2004-03-11 17:30:17 --> Reading data from file: /log/cdsware-DEMODEV/etc/bibrank/jif.kb
2004-03-11 17:30:17 --> Reading data from file: /log/cdsware-DEMODEV/etc/bibrank/conv.kb
2004-03-11 17:30:17 --> Using last resource for converting values.
2004-03-11 17:30:17 --> Reading data from file: /log/cdsware-DEMODEV/etc/bibrank/jif2.kb
2004-03-11 17:30:17 --> Converting between naming conventions given.
2004-03-11 17:30:17 --> Colloids Surf., A---1.98
2004-03-11 17:30:17 --> Phys. Rev. Lett.---6.462
2004-03-11 17:30:17 --> J. High Energy Phys.---8.664
2004-03-11 17:30:17 --> Nucl. Instrum. Methods Phys. Res., A---0.964
2004-03-11 17:30:17 --> Phys. Lett., B---4.213
2004-03-11 17:30:17 --> Phys. Rev., D---3.838
2004-03-11 17:30:17 --> Total nr of lines: 6
2004-03-11 17:30:17 --> Time used: 0 second(s).
6. Additional Information
BibRank Internals
diff --git a/modules/bibsched/doc/admin/guide.html.wml b/modules/bibsched/doc/admin/guide.html.wml
index c8d1fd319..df810b2a5 100644
--- a/modules/bibsched/doc/admin/guide.html.wml
+++ b/modules/bibsched/doc/admin/guide.html.wml
@@ -1,65 +1,69 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "cdspage.wml" \
title="BibSched Admin Guide" \
navtrail_previous_links="/admin/> > /admin/bibsched/>BibSched Admin" \
navbar_name="admin" \
navbar_select="bibsched-admin-guide"
This Admin Guide is not yet completed. If you are interested
in seeing some specific things implemented with high priority,
please contact us at . Thanks for your interest!
|
-BibSched -- the bibliographic task scheduler -- is central unit of the
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
+
+
Overview
+
+BibSched -- the bibliographic task scheduler -- is central unit of the
system that allows all other modules to access the bibliographic
database in a controlled manner, preventing sharing violation threats
and assuring the coherent execution of the database update tasks. The
module comes with an administrative interface that allows to monitor
the task queue including various possibilities of a manual
intervention, for example to re-schedule queued tasks, change the task
order, etc.
You can run the administrative interface by doing:
$ bibsched
The bibsched
can run in two modes: auto and manual. In
the auto mode, it will execute tasks automatically as they arrive in
the waiting queue. In the manual mode, the administrator has to
launch the tasks manually.
diff --git a/modules/bibupload/doc/admin/guide.html.wml b/modules/bibupload/doc/admin/guide.html.wml
index 78abe2a59..7df6798da 100644
--- a/modules/bibupload/doc/admin/guide.html.wml
+++ b/modules/bibupload/doc/admin/guide.html.wml
@@ -1,116 +1,120 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "cdspage.wml" \
title="BibUpload Admin Guide" \
navtrail_previous_links="/admin/> > /admin/bibupload/>BibUpload Admin" \
navbar_name="admin" \
navbar_select="bibupload-admin-guide"
This Admin Guide is not yet completed. Moreover, some
admin-level functionality for this module exists only in the form of
manual recipes. We are in the process of developing both the
guide as well as the web admin interface. If you are interested
in seeing some specific things implemented with high priority,
please contact us at . Thanks for your interest!
|
-BibUpload enables you to upload bibliographic data in XML MARC
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
+
+
Overview
+
+BibUpload enables you to upload bibliographic data in XML MARC
format into CDSware bibliographic database.
Configuring BibUpload
There is nothing to be configured at the moment. All the data
upload configuration is usually done when transforming the data via BibConvert.
NOTE: Please note that BibUpload currently assumes
037 $a
tag to be a "primary report number" that is unique
throughout the system. Therefore, if you upload two records with the
same 037 $a
tag value, it will override the existing
record with the new one. See the beginning of the BibUpload file to
know more.
More advanced BibUpload configuration functionality will be
included later.
Running BibUpload
Consider that you have an XML MARC file that is to be uploaded into
the CDSware. (For example, it might have been produced by BibConvert.) To finish the upload, you
would call the BibUpload script as follows:
$ bibupload -i file.xml
For available command-line options, see bibupload
--help
.
BibUpload Modes
FIXME
-i, --insertrecord Insert records from XML MARC file as new into the system.
Signals error if record already exists (see the -m matching
option below on how this is decided).
-r, --replacerecord Replace existing records by those from the XML MARC file.
The original content is wiped out and fully replaced.
Signals error if record is not found via -m matching criteria.
Note also that `-r' can be combined with `-i' into an `-ir' option
that would automatically either insert records as new if they are
not found in the system, or correct existing records if they
are found to exist.
-a, --appendfield Append fields from XML MARC file at the end of existing records.
The original content is enriched only.
Signals error if record is not found via -m matching criteria.
-c, --correctfield Correct fields of existing records by those from XML MARC file.
The original record content is modified only in the fields
from the XML MARC file: the original fields are removed and replaced
by those from the XML MARC file. Fields not present in XML MARC file
are not changed (unlike the -r option).
Signals error if record is not found via -m matching criteria.
-f, --format Upload only the format (FMT) fields.
The original content is not changed, and neither is its modification date.
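The difference between the insert, replace, append and correct modes described above can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: records are plain dictionaries mapping a tag to a list of values, db is a dictionary keyed by whatever value the matching criteria select, and the function name upload is hypothetical. It does not reflect how BibUpload stores or matches records internally.

```python
def upload(db, key, fields, mode):
    """Apply one uploaded record to the in-memory db according to mode."""
    exists = key in db
    incoming = {tag: list(values) for tag, values in fields.items()}
    if mode == "insert":
        if exists:
            raise KeyError("record already exists: %r" % key)
        db[key] = incoming
    elif mode in ("replace", "append", "correct"):
        if not exists:
            raise KeyError("record not found: %r" % key)
        if mode == "replace":
            # Original content is wiped out and fully replaced.
            db[key] = incoming
        elif mode == "append":
            # Original content is only enriched.
            for tag, values in incoming.items():
                db[key].setdefault(tag, []).extend(values)
        else:
            # Only the uploaded tags are replaced; other tags are untouched.
            db[key].update(incoming)
    else:
        raise ValueError("unknown mode: %r" % mode)
    return db[key]
```

In this model, correcting a record with only a 700 field leaves its 245 field in place, whereas replacing it with the same input would drop the 245 field.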
diff --git a/modules/webaccess/doc/admin/guide.html.wml b/modules/webaccess/doc/admin/guide.html.wml
index e4a9faa70..a3ef5ef81 100644
--- a/modules/webaccess/doc/admin/guide.html.wml
+++ b/modules/webaccess/doc/admin/guide.html.wml
@@ -1,804 +1,803 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "cdspage.wml" \
title="WebAccess Admin Guide" \
navtrail_previous_links="/admin/>Admin Area > /admin/webaccess/>WebAccess Admin " \
navbar_name="admin" \
navbar_select="webaccess-admin-guide"
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
-WEBACCESS ADMIN GUIDE / $Date$
-
1. Introduction, using roles
2. WebAccess admin interface
3. Example pages, illustrating snapshots
1. INTRODUCTION, USING ROLES
WebAccess is a common RBAC (role-based access control) system for all of
CDSware. This means that users are connected to roles that cover
different areas of access, for example administrator of the photo
collection or system librarian. Users can be active in
different areas and of course connected to as many roles as needed.
The roles are connected to actions. An action identifies a task you
can perform in CDSware. It can be defined to take any number of
arguments in order to more clearly describe what you are allowing
connected users to do.
For example the system librarian can be allowed to run bibwords on
the different indexes. To allow system librarians to run the
bibwords indexing on the field author we connect role system
librarian with action runbibwords using the argument
index='author'.
WebAccess is based on allowing users to perform actions. This means
that only allowed actions are stored in the access control engine's
database.
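The relation between users, roles, actions and arguments can be sketched in Python as follows. The class and method names are hypothetical, and matching arguments by exact equality is an assumption of this sketch; it only illustrates the idea that nothing is permitted unless an allowing authorization is stored.

```python
class AccessControl:
    """Toy allow-list model of users, roles and authorizations."""

    def __init__(self):
        self.user_roles = {}          # user -> set of role names
        self.authorizations = set()   # (role, action, frozenset of arguments)

    def add_user_to_role(self, user, role):
        self.user_roles.setdefault(user, set()).add(role)

    def authorize(self, role, action, **arguments):
        """Connect a role with an action and its arguments."""
        self.authorizations.add((role, action, frozenset(arguments.items())))

    def is_allowed(self, user, action, **arguments):
        """Allowed only if one of the user's roles has a matching authorization."""
        wanted = frozenset(arguments.items())
        return any(
            (role, action, wanted) in self.authorizations
            for role in self.user_roles.get(user, ())
        )
```

Following the example in the text, connecting the role "system librarian" with the action runbibwords and the argument index='author' allows its users to index the author field, but no other index.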
2. WEBACCESS ADMIN INTERFACE
All the WebAccess Administration web pages have certain
features/design choices in common
- Divided into steps
The process of adding new authorizations/information is
stepwise. The subtitle contains information about which step you are
on and what you are supposed to do.
- Restart from any wanted step
You can always start from an earlier step by simply clicking the
wanted button. This is not a way to undo changes! No information
about previous database is kept, so all changes are definite.
- Change or new entry must be confirmed
On all the pages you will be asked to confirm the change, with
information about what kind of change you are about to perform.
- Links to other relevant admin areas on the right side
To make it easier to perform your administration tasks, we have
added a menu area on the right hand side of these pages. The menu
contains links to other relevant admin pages and changes according to
the page you are on and the information you have selected.
3. EXAMPLE PAGES
I. Role area
II. Example - connecting role and user
I. Role area
Administration tasks start in one of the administration areas. The
role area is the main area from where you can perform all your
managing tasks. The other admin areas are just other ways of
entering.
Role Administration
- Users:
- add or remove users from the access to a role and its privileges.
- Authorizations/Actions:
- these terms mean almost the same, but an authorization is a
connection between a role and an action (possibly) containing arguments.
- Roles:
- see all the information attached to a role and decide if you want to
delete it.
|
- Create new role
- go here to add a new role.
- Create new action
- go here to add a new action.
|
|
II. Example - connecting role and user
One of the important tasks that can be handled via the WebAccess Admin Web Interface
is the delegation of access rights to users. This is done by connecting them to the
different roles offered.
The task is divided into 5 simple and comprehensive steps. Below follows the pages from
the different steps with comments on the ongoing procedure.
- step 1 - select a role
You must first select the role you want to connect users to. All the available roles are
listed alphabetically in a select box. Just find the wanted role and select it. Then click on
the button saying "select role".
If you start from the Role Area, this step is already done, and you start directly on step 2.
- step 2 - search for users
As you can see, the subtitle of the page has now changed. The subtitle always tells you
which step you are on and what your current task is.
There can be possibly thousands of users using your online library, therefore it is important
to make it easier to identify the user you are looking for. Give part of, or the entire search
string and all users with partly matching e-mails will be listed on the next step.
You can also see that the right hand menu has changed. This area is always updated with links
to related admin areas.
- step 3 - select a user.
The select box contains all users with partly matching e-mail addresses. Select the one
you want to connect to the role and continue.
Notice the navigation trail that tells you where on the Administrator pages you are currently
working.
- step 4 - confirm to add user
All WebAccess Administrator web pages display the action you are about to perform, this
means explaining what kind of addition, change or update will be done to your access control
data.
If you are happy with your decision, simply confirm it.
- step 5 - confirm user added.
The user has now been added to this role. You can easily continue adding more users to this
role by restarting from step 2 or 3. You can also go directly to another area and keep working
on the same role.
- we are done
This example is very similar to all the other pages where you administrate WebAccess. The pages
are an easy gateway to maintaining access control rights and share a lot of features.
- divided into steps
- restart from any wanted step (not undo)
- changes must be confirmed
- link to other relevant areas
- prevent unwanted input
As an administrator with access to these pages you are free to manage the rights any way you want.
- end of file -
diff --git a/modules/webalert/doc/admin/guide.html.wml b/modules/webalert/doc/admin/guide.html.wml
index f7d7ef41b..4a520a8a8 100644
--- a/modules/webalert/doc/admin/guide.html.wml
+++ b/modules/webalert/doc/admin/guide.html.wml
@@ -1,77 +1,79 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "cdspage.wml" \
title="WebAlert Admin Guide" \
navtrail_previous_links="/admin/> > /admin/webalert/>WebAlert Admin" \
navbar_name="admin" \
navbar_select="webalert-admin-guide"
This Admin Guide is not yet completed. Moreover, some
admin-level functionality for this module exists only in the form of
manual recipes. We are in the process of developing both the
guide as well as the web admin interface. If you are interested
in seeing some specific things implemented with high priority,
please contact us at . Thanks for your interest!
|
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
+
Overview
Users may set up automatic notification alerts
that send them documents corresponding to their user profile by
email either daily, weekly, or monthly. It is the job of the WebAlert
module to permit this functionality.
Configuring Alert Queries
Users may set up alert queries for example from their search history pages.
Administrators may edit existing users' alerts by modifying the
user_query_basket
table. (There is no web interface yet
for this task.)
Running Alert Engine
The alert engine has to be run each day in order to send users
email notifications for the alerts they have set up:
$ alertengine
HINT: You may want to set up an external cron job
to call alertengine
each day.
diff --git a/modules/webbasket/doc/admin/guide.html.wml b/modules/webbasket/doc/admin/guide.html.wml
index 141c8806f..8cc5d7226 100644
--- a/modules/webbasket/doc/admin/guide.html.wml
+++ b/modules/webbasket/doc/admin/guide.html.wml
@@ -1,27 +1,29 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "cdspage.wml" \
title="WebBasket Admin Guide" \
navtrail_previous_links="/admin/> > /admin/webbasket/>WebBasket Admin" \
navbar_name="admin" \
navbar_select="webbasket-admin-guide"
-Not implemented yet. If you want to manipulate user baskets, see
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
+
+
Not implemented yet. If you want to manipulate user baskets, see
tables user_basket, basket, basket_record
.
diff --git a/modules/websession/doc/admin/guide.html.wml b/modules/websession/doc/admin/guide.html.wml
index 5f5642cc3..37cf7bfaa 100644
--- a/modules/websession/doc/admin/guide.html.wml
+++ b/modules/websession/doc/admin/guide.html.wml
@@ -1,63 +1,65 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "cdspage.wml" \
title="WebSession Admin Guide" \
navtrail_previous_links="/admin/> > /admin/websession/>WebSession Admin" \
navbar_name="admin" \
navbar_select="websession-admin-guide"
This Admin Guide is not yet completed. Moreover, some
admin-level functionality for this module exists only in the form of
manual recipes. We are in the process of developing both the
guide as well as the web admin interface. If you are interested
in seeing some specific things implemented with high priority,
please contact us at . Thanks for your interest!
|
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
+
Guest User Sessions
Guest users create a lot of entries in tables that are
related to their web sessions, their search history, personal baskets,
etc. This data has to be garbage-collected periodically. At the
moment this is done via a command line program:
$ sessiongc
HINT: You may want to launch this command every day.
In the future the garbage collection task may be done via BibSched task
queue.
diff --git a/modules/webstyle/doc/admin/guide.html.wml b/modules/webstyle/doc/admin/guide.html.wml
index a11a75bee..a9abb1c82 100644
--- a/modules/webstyle/doc/admin/guide.html.wml
+++ b/modules/webstyle/doc/admin/guide.html.wml
@@ -1,54 +1,56 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "cdspage.wml" \
title="WebStyle Admin Guide" \
navtrail_previous_links="/admin/> > /admin/webstyle/>WebStyle Admin" \
navbar_name="admin" \
navbar_select="webstyle-admin-guide"
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
+
Compile-time Configuration of Page Layout
The style of the CDSware installation is also largely defined
at configuration time, as explained in the INSTALL guide. You
can modify page header, footer, general portalboxes, etc. See the
installation guide for more details.
Messages emitted by the web interface are to be edited during
configuration time in the messages.wml
file.
Run-time Configuration of Page Layout
During runtime you will most probably want to modify the CDS style sheet and images.
The look of the search interface pages may be modified to a very
large extent in the WebSearch Admin
Interface by adding portalboxes in various places on the page.
Advanced Page Layout Changes
More advanced changes to the web page layout have to be carried out
on the programming level. For example, most mod_python dynamic pages
are using the page()
function defined in the
webpage.py
file.
diff --git a/modules/websubmit/doc/admin/index.html.wml b/modules/websubmit/doc/admin/index.html.wml
index 4f6a84372..82e76fc6f 100644
--- a/modules/websubmit/doc/admin/index.html.wml
+++ b/modules/websubmit/doc/admin/index.html.wml
@@ -1,82 +1,84 @@
## $Id$
## This file is part of the CERN Document Server Software (CDSware).
## Copyright (C) 2002 CERN.
##
## The CDSware is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## The CDSware is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDSware; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#include "configbis.wml"
#include "cdspage.wml" \
title="" \
navtrail_previous_links="/admin/> > /admin/websubmit/>" \
navbar_name="admin" \
navbar_select="websubmit-admin-guide"
+Version <: print generate_pretty_revision_date_string('$Id$'); :>
+
Table of Contents