diff --git a/modules/bibformat/doc/admin/Makefile.am b/modules/bibformat/doc/admin/Makefile.am index 054b03438..a34d62367 100644 --- a/modules/bibformat/doc/admin/Makefile.am +++ b/modules/bibformat/doc/admin/Makefile.am @@ -1,35 +1,37 @@ ## $Id$ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -## General Public License for more details. +## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
docdir = $(localstatedir)/www/admin/bibformat -gifsdir=$(localstatedir)/www/admin/bibformat - doc_DATA = index.html guide.html -IMGS = $(wildcard $(srcdir)/*.gif $(srcdir)/*.jpg $(srcdir)/*.png) +imgdir = $(localstatedir)/www/img/admin/ +img_DATA = $(wildcard $(srcdir)/*.gif $(srcdir)/*.jpg $(srcdir)/*.png) -gifs_DATA = $(IMGS:$(srcdir)/%=%) +webdoclibdir = $(libdir)/webdoc/admin +webdoclib_DATA = \ + bibformat-admin.webdoc \ + bibformat-admin-guide.webdoc FILESWML = $(wildcard $(srcdir)/*.wml) -EXTRA_DIST = $(FILESWML:$(srcdir)/%=%) $(gifs_DATA) +EXTRA_DIST = $(img_DATA) $(webdoclib_DATA) -CLEANFILES = $(doc_DATA) *~ *.tmp +CLEANFILES = *~ *.tmp %.html: %.html.wml $(top_srcdir)/config/config.wml $(top_builddir)/config/configbis.wml $(top_srcdir)/config/cdsnavbar.wml $(WML) -o\(ALL-LANG_*\)+LANG_EN:$@ $< $(PYTHON) $(top_srcdir)/po/i18n_update_wml_target.py en $@ diff --git a/modules/bibformat/doc/admin/bibformat-admin-guide.webdoc b/modules/bibformat/doc/admin/bibformat-admin-guide.webdoc new file mode 100644 index 000000000..d9a17874d --- /dev/null +++ b/modules/bibformat/doc/admin/bibformat-admin-guide.webdoc @@ -0,0 +1,1250 @@ +## $Id$ + +## This file is part of CDS Invenio. +## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. +## +## CDS Invenio is free software; you can redistribute it and/or +## modify it under the terms of the GNU General Public License as +## published by the Free Software Foundation; either version 2 of the +## License, or (at your option) any later version. +## +## CDS Invenio is distributed in the hope that it will be useful, but +## WITHOUT ANY WARRANTY; without even the implied warranty of +## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +## General Public License for more details. +## +## You should have received a copy of the GNU General Public License +## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., +## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. + + + + + + + +

Contents

+ + + + + + +

1. Overview

+

1.1 How BibFormat Works

+

BibFormat is in charge of formatting the bibliographic records that +are displayed to your users. It is called by the search engine when it has to +format a record.

+ +

As you might need different kinds of formatting depending +on the type of record, but potentially have a huge number of records in your database, you cannot specify +for each of them how they should look. Instead BibFormat uses a rule-based decision process +to decide how to format a record.
+The best way to understand how BibFormat works is to have a look at +a typical workflow:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Step 1:
+ http://cdsweb.cern.ch/search?recid=946417&ln=en&of=hd + When CDS Invenio has to display a record, it + asks BibFormat to format the record with the given output format + and language. For example here the requested output format is + hd, which is a short code + for "HTML Detailed". This means that somehow a user arrived on + the page of the record and asked for a detailed view of the + record.

Step 2:
1. Use Template [Picture HTML Detailed] if tag [980__a] is equal to [PICTURE] 2. Use Template [Thesis HTML detailed] if tag [980__a] is equal to [THESIS] 3. By default use [Default HTML Detailed] + + + Beside is a screenshot of the "hd" or "HTML Detailed" output format. + You can see that the output format does not specify how to format the record, but + contains a set of rules which define which template must be used.
+ The rules are evaluated from top to bottom. + Each rule defines a condition on a field of the record, and a format template to use to + format the record if the condition matches. + Let's say that the field 980__a of the record is equal to + "Picture". Then the first rule matches, and the format template + Picture HTML Detailed is + used for formatting by BibFormat.
+ You can add, remove or edit output formats here

Step 3:
+
+ <h1 align="center"><BFE_MAIN_TITLE/></h1>
+ <p align="center">
+ <BFE_AUTHORS separator="; " + link="yes"/><br/>
+ <BFE_DATE format="%d %B %Y"> .- <BFE_NB_PAGES suffix="p">
+ </p>
We see an extract of the Picture HTML Detailed format on the right, + as it is shown in the template editor. As you can see it + is mainly written using HTML. There are however some tags that + are not part of standard HTML. Tags that start with + <BFE_ are placeholders for the record values. For + example <BFE_MAIN_TITLE/> tells BibFormat to write the title + of the record. We call these tags "elements". Some + elements have parameters. This is the case for the <BFE_AUTHORS> element, + which can take separator and link as + parameters. The value of separator is used to separate + authors' names and the link parameter tells whether links to authors' + websites have to be created. + All elements are described in the elements documentation.
+ You can add, remove or edit format templates here. +

+ In addition to this modified HTML language, BibFormat also supports XSL stylesheets as format templates. + Read the XSL Format Templates section to learn more about XSLT support for your format templates. +

+

Step 4:
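+ from urllib import quote  # Python 2 standard library
+ from invenio.config import weburl  # assumed import path for weburl
+
+ def format(bfo, separator='; ', link='no'):
+    """
+    Prints the list of authors for the record

+    @param separator a character to separate the authors
+    @param link if 'yes' print HTML links to authors
+    """
+    authors = bfo.fields("100__a")
+    if link == 'yes':
+       # Wrap each author name in a link to a search on that author
+       authors = ['<a href="' + weburl + '/search?f=author&p=' + quote(x) + '">' + x + '</a>'
+                  for x in authors]
+    # Join the (possibly linked) names into a single string
+    return separator.join(authors)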
+
A format element is written in Python. It acts as a bridge + between the record in the database and the format + template. Typically you will not have to write or read format + elements, just call them from the templates. Each element outputs + some text that is written in the template where it is called.
+ Developers can add new elements by creating a new file, naming it + after the element, and writing a Python format + function that takes as parameters the parameters of the element + plus a special one, bfo. Regular Python code can be + used, including imports of other modules.
+
+ +

In summary BibFormat is called by specifying a record and an output +format, which relies on different templates to do the formatting, and +which themselves rely on different format elements. Only developers need to modify +the format elements layer.

+ + + + + + + + + + + + + + + + +
Output Format
Template
Template
Format Element
Format Element
Format Element
Format Element
+

You should now understand the philosophy behind BibFormat.

+ +

1.2 Short Tutorial

+

Let's try to create our own format. +This format will just print the title of a record. +

+ +

First go to the main BibFormat admin page. +Then click on the "Manage Output Formats" link. You will see the list of all output formats:

+Output formats management page +

This is where you can delete, create or check output formats. +The menu at the top of the page lets you go to other administration pages.
+Click on the "Add New Output Format" button at the bottom of the page. You can then fill in some attributes +for the output format. Choose "title" as code, "Only Title" as name and "Prints only title" as description:

+Screenshot of the Update Output Format Attributes page +

Leave other fields blank, and click on the button "Update Output format Attributes".
You are then +redirected to the rules editor. Notice the menu at the top which lets you close the editor, change the attributes again +and check the output format. However do not click on these links before saving your modifications to the rules!

+Output format menu +

As our format does not need to behave differently depending on the record, we do not need to add new rules to the format. You just need to select a format template in the "By default use" list. However we first have to create our special format template that only prints titles. So close the editor using the menu at the top of the page, and in the menu that appears instead, click on "Manage Format Templates". As with output formats, you see the list of format templates.

+Format template management page +

+Click on the "Add New Format Template" button at the bottom of the page. As for the output format, fill in the attributes of the template with name "Title" and any relevant description. +

+ +update format template attributes +

Click on the "Update Output Format Attributes" button. You are redirected to the template editor. The editor is divided into three parts. The upper left part contains the code of the template. The bottom part is a preview of the template. The part on the right side is a short reminder of the format elements you can use in your template. You can hide this documentation by clicking on "Hide Documentation".

+Format template editor +

The above screenshot shows the template code already filled in. It calls the BFE_TITLE element. If you do not know the name of the element you want to call, you can search for it using the embedded documentation search. You can try to add other elements into your template, or write some HTML formatting.

+

When you are satisfied with your template, click on the save button, close the editor and go back to the "Only Title" output format rules editor. There select the template you have just created in the "By default use" list, save the output format, and you are done.

+

This tutorial does not cover all aspects of the management of formats (for example "Knowledge bases" or internationalization). It also does not show all the power of output formats, as the one we have created simply calls a template. However you have seen enough to configure BibFormat through the web interface. Read the sections below to learn more about it.

+ +

1.3 Administer Through the Web Interface or Through the Configuration files

+

BibFormat can be administered in two ways. The first way is to use the provided web interface. It should be the most +convenient approach for most users. The web interface is simple to use and provides great tools to manage your formats. Its only limitation concerns the format elements, which cannot be modified using it (but the web interface provides dynamically generated documentation of your elements).
+The other way to administer BibFormat is to directly modify the configuration files using your preferred text editor. This approach gives advanced users more power, but requires access to the server's files. It also requires that users double-check their modifications, or use the web interface to ensure the validity and correctness of their formats.

+

In this manual we will show both approaches. For each explanation we show first how to do it through the web interface, then how to do it by manipulating the configuration files. Non-power users can stop reading as soon as they encounter the text "For developers and adventurers only".

+

We generally recommend using the web interface, except for writing +format elements.

+ + +

2. Output Formats

+

As you potentially have a huge number of +bibliographic records, you cannot specify manually for each of them +how it should be formatted. This is why you can define rules that will +allow BibFormat to understand which kind of formatting to apply to a given +record. You define this set of rules in what is called an "output +format".

+ +

You can have different output formats, each with its own characteristics. +For example, when multiple bibliographic records are +displayed at the same time (as happens in search results), you certainly want only +short versions to be shown to the user, while a detailed record is +preferable when a single record is displayed, whatever the type of the record.
+You might also want to +let your users decide which kind of output they want. For example you +might need to display HTML for regular web browsing, but might also want to +offer a BibTeX version of the bibliographic reference for direct +inclusion in a LaTeX document.

+

To summarize, an output format groups similar kinds of formats, specifying which kind +of formatting has to be done, but not how it has to be done.

+ +

2.1 Add an Output Format

+

To add a new output format, go to the Manage Output Formats page and click on the "Add New Output Format" button at the bottom of the page. The format is then created. You can then specify the attributes of the output format. See Edit the Attributes of an Output Format to learn more about it.

+ For developers and adventurers only: +

Alternatively you can directly add a new output format file into the + /etc/bibformat/outputs/ directory of your CDS Invenio installation, if you have + access to the server's files. Use the format extension .bfo for your file.

+

You should also check that user www-data has read/write access to the file, + if you want to be able to modify the rules through the web interface.

+ +

2.2 Remove an Output Format

+

To remove an output format, go to the Manage Output Formats page and click on the "Delete" button facing the output format you want to delete. If you cannot click on the button (the button is not enabled), this means that you do not have sufficient privileges to do so (the format is protected; contact the administrator of the system).

+For developers and adventurers only: +

You can directly remove an output format from the /etc/bibformat/outputs/ directory of your CDS Invenio installation. +However you must make sure that it is removed from the tables format and formatname in the database, so that other modules know that it is no longer available.

+ +

2.3 Edit the Rules of an Output Format

+

When you create a new output format, you can at first only specify the default template, + that is the one which is used when all rules fail. In the case of a basic output format, + this is enough. You can however add other rules, by clicking on the "Add New Rule" button.
+ Once you have added a rule, you can fill it with a condition, and a template that should be used + if the condition is true. For example the rule

+ Rule: Use template [Picture HTML Detailed] if field [980__a] is equal to [PICTURE] +

will use the template named "Picture HTML Detailed" if the field 980__a of the record to format is equal to "Picture". + Note that the text "PICTURE" will match any letter case, such as "picture" or "Picture". + Leading and trailing spaces are ignored too (" Picture " will match "PICTURE"). +
Tip: you can use a regular expression as text. For example "PICT.*" will match "pictures" + and "PICTURE".
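The matching behaviour described above can be sketched in a few lines of Python. This is only an illustration, not BibFormat's actual implementation: the function names `rule_matches` and `choose_template` are hypothetical, and the sketch assumes that the pattern must cover the whole field value.

```python
import re

def rule_matches(field_value, pattern):
    """Return True if the field value satisfies the rule's pattern."""
    # Leading and trailing spaces are ignored on both sides.
    value = field_value.strip()
    pattern = pattern.strip()
    # Assumption: the pattern is a regular expression that must cover the
    # whole value. Letter case is ignored.
    return re.fullmatch(pattern, value, re.IGNORECASE) is not None

def choose_template(record_fields, rules, default_template):
    """Return the template of the first matching rule, else the default."""
    # Rules are evaluated from top to bottom; the first match wins.
    for field, pattern, template in rules:
        for value in record_fields.get(field, []):
            if rule_matches(value, pattern):
                return template
    return default_template

rules = [
    ("980__a", "PICTURE", "Picture HTML Detailed"),
    ("980__a", "THESIS", "Thesis HTML detailed"),
]
record = {"980__a": [" Picture "]}
print(choose_template(record, rules, "Default HTML Detailed"))
```

With the record above, the first rule matches despite the different letter case and the surrounding spaces. A record whose 980__a field matches no rule falls through to the default template.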

+ +

Reorder rules using arrows
+ The above configuration will use format template "Default HTML Detailed" if all above rules fail (in that case + if field 980__a is different from "PICTURE"). If you have more rules, you decide in which order the conditions are evaluated. You can reorder rules by clicking on the small arrows on the left of the rules. +

+

Note that when you are migrating your output formats from the old PHP BibFormat, you might not have translated all the formats to which your output formats refer. In that case you should use the "defined in old BibFormat" option in the format templates menu, to make BibFormat understand that a match for this rule must trigger a call to the Behaviour of the old BibFormat. See section on Run old and new formats side by side for more details on this.

+ For developers and adventurers only: +

To write an output format, use the following syntax:
+ First you + define which field code you put as the condition for the rule. + You suffix it with a colon. Then on the next lines, define the values of + the condition, followed by --- and then the filename of the template + to use:

+
+  tag 980__a:
+  PICTURE --- PICTURE_HTML_BRIEF.bft
+  PREPRINT --- PREPRINT_HTML_BRIEF.bft
+  PUBLICATION --- PUBLICATION_HTML_BRIEF.bft
+
+

+ This means that if the value of field 980__a is equal to PICTURE, then we + will use format template PICTURE_HTML_BRIEF.bft. Note that you must + use the filename of the template, not the name. Also note that spaces + at the end or beginning are ignored. On the following lines, + you can either put other conditions on tag 980__a, or add another tag on + which you want to put conditions.

+

At the end you can add a default condition:

+
+   default: PREPRINT_HTML_BRIEF.bft
+
+

which means that if no condition is matched, a format suitable for + Preprints will be used to format the current record.

+ +

The output format file could then look like this:

+
+  tag 980__a:
+  PICTURE --- PICTURE_HTML_BRIEF.bft
+  PREPRINT --- PREPRINT_HTML_BRIEF.bft
+  PUBLICATION --- PUBLICATION_HTML_BRIEF.bft
+
+  tag 8560_f:
+  .*@cern.ch --- SPECIAL_MEMBER_FORMATTING.bft
+
+  default: PREPRINT_HTML_BRIEF.bft
+
+

You can add as many rules as you want. Keep in mind that they are read + in the order they are defined, and that only the first rule that + matches will be used. + Notice the condition on tag 8560_f: it uses a regular expression to + match any email address that ends with @cern.ch (the regular + expression must be understandable by Python). +
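To make the file syntax more concrete, here is a small Python sketch that reads the text of an output format file into an ordered list of rules. It is not the parser used by BibFormat; the function name `parse_output_format` is hypothetical and the sketch only handles the constructs shown above.

```python
def parse_output_format(text):
    """Parse output format text into (rules, default_template).

    Each rule is a (field, value, template_filename) tuple, kept in the
    order in which it appears in the file.
    """
    rules = []
    default_template = None
    current_field = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("default:"):
            default_template = line[len("default:"):].strip()
        elif line.startswith("tag ") and line.endswith(":"):
            # A new condition block, for example "tag 980__a:"
            current_field = line[len("tag "):-1].strip()
        elif "---" in line and current_field is not None:
            # "VALUE --- TEMPLATE.bft"; surrounding spaces are dropped
            value, template = line.split("---", 1)
            rules.append((current_field, value.strip(), template.strip()))
    return rules, default_template

sample = """
tag 980__a:
PICTURE --- PICTURE_HTML_BRIEF.bft
PREPRINT --- PREPRINT_HTML_BRIEF.bft

tag 8560_f:
.*@cern.ch --- SPECIAL_MEMBER_FORMATTING.bft

default: PREPRINT_HTML_BRIEF.bft
"""
rules, default_template = parse_output_format(sample)
```

Keeping the rules in a list rather than a dictionary preserves the order of definition, which matters because only the first matching rule is used.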

+

2.4 Edit the Attributes of an Output Format

+

An output format has the following attributes: +

+

Please read this information regarding output format codes: + There are some reserved codes that you should not use, or at least be aware of when choosing a code for your + output format. The table below summarizes these special words: +

+

+ + + + + + + + + + + + + + + +
CodePurpose
HBUsed for displaying list of results of a search.
HDUsed when no format is specified when viewing a record.
HMUsed for Marc output. The format is special in the sense that it filters + fields to display according to the 'ot' GET parameter of the HTTP request.
Starting with letter 't'Used for displaying the value of the field specified by the 'ot' GET parameter of the HTTP request.
Starting with 3 digitsUsed for displaying the value of the field specified by the digits.
+
+

+ For developers and adventurers only: +

Except for the code, output format attributes cannot be changed in the output format file. These + attributes are saved in the database. As for the code, it is the name of the output format file, + without its .bfo extension. If you change this name, do not forget to propagate the modification in the database.

+

2.5 Check the Dependencies of an Output Format

+

To check the dependencies of an output format on format templates, format elements and tags, + go to the Manage Output Formats page, click on + the output format you want to check, and then in the menu click on "Check Dependencies".

+ Check Dependencies menu +

The next page shows you: +

+ Note that some Marc tags might be omitted.

+

2.6 Check the Validity of an Output Format

+

To check the validity of an output format, simply go to the Manage Output Formats page, and look at the column 'status' for the output format you want to check. If message "Ok" is there, + then no problem was found with the output format. If message 'Not Ok' is in the column, click on it to see + the problems that have been found for the output format.

+ +

3. Format Templates

+

A format template defines how a record should be formatted. For example it specifies which fields of the record are to be displayed, in which order and with which visual attributes. Basically the format template is written in HTML, so that it is easy for anyone to edit it. BibFormat also has support for XSLT for formatting. Read more about XSL format templates here.

+

3.1 Add a Format Template

+

To add a new format template, go to the Manage Format Templates page and click on the "Add New Format Template" button at the bottom of the page. The format template is then created. You can then specify the attributes of the format template, or ask to make a copy of an existing format. + See Edit the Attributes of a Format Template to learn more about editing the attributes.

+ For developers and adventurers only: +

Alternatively you can directly add a new format template file into the + /etc/bibformat/format_templates/ directory of your CDS Invenio installation, if you have + access to the server's files. Use the format extension .bft for your file.

+

You should also check that user www-data has read/write access to the file, + if you want to be able to modify the code and the attributes of the template through the web interface.

+ +

3.2 Remove a Format Template

+

To remove a format template, go to the Manage Format Templates page and click on the "Delete" button facing the format template you want to delete. If you cannot click on the button (the button is not enabled), this means that you do not have sufficient privileges to do so (the format is protected; contact the administrator of the system).

+For developers and adventurers only: +

You can directly remove the format template from the /etc/bibformat/format_templates/ directory of your CDS Invenio installation.

+

3.3 Edit the Code of a Format Template

+

You can change the formatting of records by modifying the code of a template. +

To edit the code of a format template + go to the Manage Format Templates page. Click on + the format template you want to edit to load the template editor.

+ +

The format template editor contains three panels. The upper left panel is the code editor. This is where + you write the code that specifies the formatting of a template. The right-most panel is a short documentation + on the "bricks" you can use in your format template code. The panel at the bottom of the page allows you to preview the template.

+ Template Editor Page +

The following sections explain how to write the code that specifies the formatting.

+

3.4 Basic Editing

+

The first thing you have to know before editing the code is that everything you write in the + code editor is printed as such by BibFormat. Well almost everything (as you will discover later).

+

For example if you write "My Text", then for every record the output will be "My Text". Now let's say + you write "<b>My Text</b>": the output will still be "<b>My Text</b>", but as we display in a web browser, it will look like + "My Text" (The browser interprets the text inside tags <b></b> as "bold". Also note that the look may depend on the CSS style of your page).

+

Basically it means that you can write HTML to do the formatting. If you are not experienced with HTML you can use an HTML editor to create your layout, and then copy and paste the HTML code into the template.

+

Do not forget to save your work by clicking on the save button before you leave the editor!

+ For developers and adventurers only: +

+ You can edit the code of a template using exactly the same syntax as in the web interface. The code of the template + is in the template file located in the /etc/bibformat/format_templates/ directory of your CDS Invenio installation. You just + have to take care of the attributes of the template, which are saved in the same file as the code. See Edit the Attributes of a Format Template to learn more about it. +

+ +

3.5 Use Format Elements

+

To add a dynamic behaviour to your format templates, that is display for example a different title + for each record or a different background color depending on the type of record, you can use the format elements.

+

Format elements are the smart bricks you can copy and paste into your code to insert the parts of a template + that change depending on the record. A format element looks like a regular HTML tag.

+

For example, to print + the title of a record, you can write <BFE_TITLE /> in your template code where you want to display the title.

+

Format elements can take values as parameters. This allows you to customize the behaviour of an element. For example you can write <BFE_TITLE prefix="Title: " />, and BibFormat will take care of printing the title for you, with prefix "Title: ". The difference between Title: <BFE_TITLE /> and <BFE_TITLE prefix="Title: " /> is that the first option will always write "Title: " while the second one will only print "Title: " if a title exists for the record in the database. Of course it is likely that every record has a title, but this can be useful for less common fields.
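The difference between the two forms can be illustrated with a short Python sketch. The function `render_element` is hypothetical and only mimics the behaviour described above. It also assumes that the `default` parameter is printed when the element returns no value.

```python
def render_element(value, prefix="", suffix="", default=""):
    """Mimic how an element's output is decorated (illustrative only)."""
    if value:
        # prefix and suffix are printed only when there is a value
        return prefix + value + suffix
    # Assumption: default is printed when the element has no value
    return default

# <BFE_TITLE prefix="Title: " /> on a record with a title
with_title = render_element("My Title", prefix="Title: ")

# <BFE_TITLE prefix="Title: " /> on a record without a title
without_title = render_element("", prefix="Title: ")

# Title: <BFE_TITLE /> on a record without a title
literal_label = "Title: " + render_element("")
```

Only the last form leaves a dangling "Title: " label in the output when the record has no title.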

+

Some parameters are available for all elements. This is the case for the following ones: +

+

+

Some parameters are specific to elements. To get information on all available format elements you can read the Format Elements Documentation, which is generated dynamically for all existing elements. It shows you what each element does and which parameters it can take.

+

While format elements look like HTML tags, they differ from traditional ones in the following ways: +

+

+

Tip: you can use the special element <BFE_FIELD tag="" /> to print the value + of any field of a record in your templates. This practice is however not + recommended because you would have to revise all format + templates if you changed the meaning of your MARC tags.

+ +

3.6 Preview a Format Template

+

To preview a format template go to the Manage Format Templates page and click on the format template you want to preview to open the template editor. The editor contains a preview panel at the bottom of the page.

+ Preview Panel +

Simply click on the "Reload Preview" button to preview the template (you do not need to save the code before previewing).
+ Use the "Language" menu to preview the template in a given language.

+

You can fill in the "Search Pattern" field to preview a specific record. The search pattern uses exactly the same + syntax as the one used in the web interface. The only difference with the regular search engine is that only the first matching record is shown.

+ For developers and adventurers only: +

If you do not want to use the web interface to edit the templates but still would like to get previews, you can open the preview frame of any format in a new window/tab. In this mode you get a preview of the template (if it is placed in the /etc/bibformat/format_templates/ directory of your CDS Invenio installation). The parameters of the preview are specified in the url:

+

3.7 Internationalization (i18n)

+

You can add translations to your format templates. To do so enclose the text you want to localize + with tags corresponding to the two letters of the language. For example if we want to localize "title", write <en>Title</en>. Repeat this for each language in which you want to make "title" available: <en>Title</en><fr>Titre</fr><de>Titel</de>. + Finally enclose everything with <lang> </lang> tags: <lang><en>Title</en><fr>Titre</fr><de>Titel</de></lang>

+

For each <lang> group only the text in the user's language is displayed. If the user's language is not +available in the <lang> group, your default CDS Invenio language is used.
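The selection logic can be sketched in Python with two regular expressions. This is only an illustration of the behaviour described above, not BibFormat's own code: the function name `localize` and the `default_lang` parameter are hypothetical, and the sketch prints nothing when neither language is available.

```python
import re

# A <lang>...</lang> group and the two-letter language tags inside it
LANG_GROUP = re.compile(r"<lang>(.*?)</lang>", re.DOTALL)
TRANSLATION = re.compile(r"<([a-z]{2})>(.*?)</\1>", re.DOTALL)

def localize(template, user_lang, default_lang="en"):
    """Keep only one translation per <lang> group."""
    def pick(match):
        translations = dict(TRANSLATION.findall(match.group(1)))
        if user_lang in translations:
            return translations[user_lang]
        # Fall back to the default language of the installation
        return translations.get(default_lang, "")
    return LANG_GROUP.sub(pick, template)

template = "<lang><en>Title</en><fr>Titre</fr><de>Titel</de></lang>: <BFE_TITLE />"
french = localize(template, "fr")
italian = localize(template, "it")
```

Text outside the `<lang>` groups, including format elements, is left untouched.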

+ +

3.8 Escaping special HTML/XML characters

+ +

By default, BibFormat escapes all values returned by format +elements. As a format template designer, you can assume in almost all +cases that the values you get from a format element will be escaped +for you. For special cases, you can set the parameter +escape of the element to '0' when calling it, to make +BibFormat understand that it must not escape the values of the +element, or to '1' to force the escaping.

+

+For example +<bfe_abstract /> will return:
+ +[...]We find that for spatially-flat cosmologies, background lensing
+clusters with reasonable mass-to-light ratios lying in the
+redshift range 0&lt;1 are strongly excluded, [...]

+while <bfe_abstract escape="0"/> will return:
+[...]We find that for spatially-flat cosmologies, background lensing
+clusters with reasonable mass-to-light ratios lying in the
+redshift range 0<1 are strongly excluded, [...]

+

+

In most cases, you will set escape to neither 1 nor 0, but +simply let the developer of the element take care of that for you.

+ +

Please note that values given in special parameters +prefix, suffix, default and +nbMax are never escaped, whatever the value of +escape is (but values of other parameters are). You have to take +care of that in your format template, as well as of all other values that +are not returned by the format elements.
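The following Python sketch illustrates these rules. It is not the code used by BibFormat: `render_escaped` is a hypothetical helper, it uses `html.escape` from the Python standard library as a stand-in for BibFormat's escaping, and it assumes that escaping is on unless `escape` is set to '0'.

```python
from html import escape

def render_escaped(value, escape_mode="1", prefix="", suffix=""):
    """Escape the element's value, but never its prefix or suffix."""
    if not value:
        return ""
    if escape_mode == "1":
        # Only the value returned by the element is escaped
        value = escape(value)
    # prefix and suffix are written as given in the template
    return prefix + value + suffix

abstract = "redshift range 0<1 are strongly excluded"
escaped = render_escaped(abstract)
raw = render_escaped(abstract, escape_mode="0")
decorated = render_escaped("a & b", prefix="<b>", suffix="</b>")
```

In `decorated`, the HTML given in `prefix` and `suffix` is kept as markup while the ampersand in the value is escaped, which is why special characters placed in those parameters must be handled in the template itself.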

+ +

3.9 Edit the Attributes of a Format Template

+

To edit the attributes of a format template + go to the Manage Format Templates page, click on + the format template you want to edit, and then in the menu click on "Modify Template Attributes".

+

+ A format template contains two attributes: +

+

Note that changing these parameters has no impact on the formatting. Their purpose is only to + document the template.

+

If the name you have chosen already exists for another template, your name will be suffixed with an integer so that the name is unique.

+

You should also be aware that if you change the name of a format template, all output formats that were linking to this template will be changed to match the new name.

+ For developers and adventurers only: +

You can change the attributes of a template by editing its file in the /etc/bibformat/format_templates/ directory of your CDS Invenio installation. The attributes must be enclosed with tags <name> </name> and <description> </description> and should ideally be placed at the beginning of the file.
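As an illustration, the two attributes could be read back from a template file with a few lines of Python. The function `read_template_attributes` is hypothetical and is not part of BibFormat. The sketch simply looks for the first occurrence of each pair of tags.

```python
import re

def read_template_attributes(template_text):
    """Extract the name and description stored in a format template."""
    attributes = {}
    for attr in ("name", "description"):
        match = re.search(r"<%s>(.*?)</%s>" % (attr, attr),
                          template_text, re.DOTALL)
        # A missing attribute is reported as an empty string
        attributes[attr] = match.group(1).strip() if match else ""
    return attributes

template_text = """<name>Title</name>
<description>Prints only the title</description>
<BFE_TITLE />
"""
attributes = read_template_attributes(template_text)
```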

+

Also note that the admin web interface tries to keep the name of the template in sync with the filename of the template. If the name is changed through the web interface, the filename of the template is changed, and all output formats that use this template are updated. You have to update output formats manually if you change the filename of the template without the web interface.

+ +

3.10 Check the Dependencies of a Format Template

+

To check the dependencies of a format template + go to the Manage Format Templates page, click on + the format template you want to check, and then in the menu click on "Check Dependencies".

+ Check Dependencies menu +

The next page shows you: +

+ Note that some Marc tags might be omitted.

+

3.11 Check the Validity of a Format Template

+

To check the validity of a format template, simply go to the Manage Format Templates page, and look at the column 'status' for the format template you want to check. If message "Ok" is there, + then no problem was found with the template. If message 'Not Ok' is in the column, click on it to see + the problems that have been found for the template.

+ +

3.12 XSL Format templates

+

In addition to the HTML-like syntax introduced in previous sections, BibFormat also + has support for server-side XSL transformation. Although you can do all the formatting using this custom HTML syntax, there are + cases where an XSL stylesheet might be preferred. XSLT is for example a natural choice + when you need to output complex XML, especially when your XML has a deep tree structure. + You might also prefer using XSLT if you already feel comfortable with XSL syntax.

+

XSL format templates are written using regular XSL. The template file has to be placed in the same folder + as regular format template files, and its file extension must be .xsl. XSL templates + are also visible through the web interface, like any regular format template file. However, some + functions like the "Dependencies checker" or the possibility to create a template or edit its attributes are not + available for XSL templates.

+ +

Finally, please note that you will need to install a supported XSLT parser in order + to format using XSL stylesheets.

+ +

4. Format Elements

+

Format elements are the bricks used in format templates to provide dynamic content to the formatting process. + Their purpose is to allow people without programming skills to easily integrate data from the records in the database into their templates.

+

Format elements are typically written in Python (there is an exception to that point which is discussed in Add a Format Element). This brings great flexibility and power to the formatting process. This however restricts the creation of format elements to developers.

+ +

4.1 Add a Format Element

+

The most typical way of adding a format element is to drop a .py file in the lib/python/invenio/bibformat_elements directory of your CDS Invenio installation. See Edit the Code of a Format Element to learn how to implement an element.

+

The simplest way to add a format element is to add an entry in the "Logical Fields" management interface of the BibIndex module. When BibFormat cannot find the Python format element corresponding to a given name, it looks into this table for the name and prints the value of the field declared for this name. This lightweight approach is straightforward but does not allow complex handling of the data (it is limited to printing the value of the field, or the values of the fields if multiple fields are declared under the same label).

+

4.2 Remove a Format Element

+

To remove a Python format element simply remove the corresponding file from the lib/python/invenio/bibformat_elements directory of your CDS Invenio installation.

+

To remove a format element declared in the "Logical Fields" management interface of the BibIndex module simply remove the entry from the table.

+

4.3 Edit the Code of a Format Element

+

This section only applies to Python format elements. Basic format elements declared in "Logical Fields" have non configurable behaviour.

+

A format element file is like any regular Python program. It has to implement a format function, which returns a string and takes at least bfo as first parameter (but can take as many others as needed).

+

Here is for example the code of the "bfe_title.py" element: +

+def format(bfo, separator=" "):
+    """
+    Prints the title of a record.
+
+    @param separator separator between the different titles
+    """
+    titles = []
+
+    title = bfo.field('245__a')
+    title_remainder = bfo.field('245__b')
+
+    titles.append( title + title_remainder )
+
+    title = bfo.field('0248_a')
+    if len(title) > 0:
+        titles.append( title )
+
+    title = bfo.field('246__a')
+    if len(title) > 0:
+        titles.append( title )
+
+    title = bfo.field('246_1a')
+    if len(title) > 0:
+        titles.append( title )
+
+    return separator.join(titles)
+
+In format templates this element can be called like a function, using HTML syntax:
+<BFE_TITLE separator="; "/>
+Notice that the call uses (almost) the filename of your element. To find out which element to use, BibFormat tries different filenames until the element is found: it tries to
    +
  1. ignore the letter case
  2. replace underscores with spaces
  3. remove the BFE_ from the name
+ This means that even if the filename of your element is "my element.py", BibFormat can resolve the call <BFE_MY_ELEMENT /> in a format template. This also means that you must take care not to have two format element filenames that only differ in terms of the above parameters. +
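The lookup described above can be sketched as follows. This is only an illustration of the three normalization steps, not BibFormat's actual implementation; the function names normalize_element_name and resolve_element are hypothetical.

```python
def normalize_element_name(name):
    """Reduce an element call or a filename to a comparable key."""
    key = name.lower()                  # 1. ignore the letter case
    if key.endswith(".py"):
        key = key[:-3]                  # drop the file extension of filenames
    key = key.replace("_", " ")         # 2. treat underscores and spaces alike
    if key.startswith("bfe "):
        key = key[4:]                   # 3. remove the BFE_ prefix
    return key


def resolve_element(call_name, filenames):
    """Return the first filename that matches the call, or None."""
    wanted = normalize_element_name(call_name)
    for filename in filenames:
        if normalize_element_name(filename) == wanted:
            return filename
    return None
```

With this sketch, resolve_element("BFE_MY_ELEMENT", ["bfe_title.py", "my element.py"]) picks "my element.py". It also shows why "bfe_my_element.py" and "my element.py" should not coexist: both reduce to the same key.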

+

The string returned by the format function corresponds to the value that is printed instead of the format element name in the format template.

+

The bfo object taken as parameter by format stands for BibFormatObject: it is an object that represents the context in which the formatting takes place. For example, it lets you retrieve the value of a given field for the record that is being formatted, or the language of the user. The BibFormatObject is described in more detail further below.

+

The format function of an element can take other parameters, as well as default values for these parameters. The idea is that these parameters are accessible from the format template when calling the element, and let you parametrize the behaviour of the format element.
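As a minimal sketch of a parametrized element, the following hypothetical element prints authors with an optional limit. It assumes that parameters arrive from the template as strings, which is why limit is converted explicitly. The StubBfo class is a stand-in written for this example, not the real BibFormatObject, and the choice of fields 100__a and 700__a is only illustrative.

```python
def format(bfo, limit="", separator="; "):
    """
    Prints the authors of a record.

    @param limit maximum number of authors to print (empty means no limit)
    @param separator separator between the different authors
    """
    authors = bfo.fields('100__a') + bfo.fields('700__a')
    if limit.isdigit():
        # Assumption: template parameters are received as strings
        authors = authors[:int(limit)]
    return separator.join(authors)


class StubBfo:
    """Hypothetical stand-in for BibFormatObject, for illustration only."""

    def __init__(self, values):
        self.values = values  # maps a field notation to a list of values

    def fields(self, tag):
        return list(self.values.get(tag, []))
```

If such an element were stored as "bfe_first_authors.py" (a made-up name), a template could call it as <BFE_FIRST_AUTHORS limit="2" separator=" and "/>, and omit either parameter to fall back to its default value.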

+

It is very important to document your element: this makes it possible to generate documentation for the elements, accessible to people writing format templates. It is the only way for them to know what your element does. The key points are: +

+

+

Typically you will need to get access to some fields of a record to display as output. There are two ways to do this: you can access the bfo object given as parameter and use the provided (basic) accessors, or import a dedicated module and use its advanced functionalities.

+

Method 1: Use accessors of bfo:
+ bfo is an instance of the BibFormatObject class. The following methods are available: +

+You can also get access to other information through bfo, such as the language in which the formatting should occur with bfo.lang. To learn more +about the possibilities offered by the bfo, read the BibFormat APIs +

+

Method 2: Use module BibRecord:
+ + BibRecord is a module that provides advanced functionalities + regarding access to the fields of a record. + bfo.get_record() returns a structure that can be + understood by BibRecord's functions. Therefore you can import + the module's functions to get access to the fields you want. +

+

4.4 Preview a Format Element

+

+ You can play with a format element's parameters and see the result + of the element directly in the + format elements documentation: + for each element, under the section "See + also", click on "Test this element". You are redirected to a page + where you can enter a value for the parameters. A description is + associated with each parameter as well as an indication of the + default value of the parameter if you do not provide a custom + value. Click on the "Test!" button to see the result of the + element with your parameters.

+ +

4.5 Internationalization (i18n)

+ +

You can follow the standard internationalization procedure in + use across CDS Invenio sources. For example the following code + will get you the translation for "Welcome" (assuming "Welcome" has + been translated): +

+from invenio.messages import gettext_set_language
+
+ln = bfo.lang
+_ = gettext_set_language(ln)
+
+translated_welcome =  _("Welcome")
+
+

+ +

Notice the access to bfo.lang to get access to the + current language of the user. For simpler translations or + behaviour depending on the language you can simply check the value + of bfo.lang to return your custom text.
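For behaviour that depends only on the language, a small lookup table is often enough. The sketch below assumes that the language code is available as bfo.lang, as stated in the description of the bfo accessors; the GREETINGS table and the StubBfo class are hypothetical and only serve the illustration.

```python
GREETINGS = {
    "en": "Welcome",
    "fr": "Bienvenue",
    "de": "Willkommen",
}


def format(bfo):
    """
    Prints a greeting in the language of the user.
    """
    # Fall back to English when the language has no entry in the table
    return GREETINGS.get(bfo.lang, GREETINGS["en"])


class StubBfo:
    """Hypothetical stand-in for BibFormatObject, for illustration only."""

    def __init__(self, lang):
        self.lang = lang
```

This keeps the element independent of the translation catalogue, at the price of maintaining the texts inside the element itself.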

+ +

4.6 Escaping special HTML/XML characters

+

In most cases, that is cases where your + element does not return HTML output, you do not have to take any + particular action in order to escape values that you output: the + BibFormat engine will take care of escaping the returned value of the element + for you. In cases where you want to return text that should not + be escaped (for example when you return HTML links), you can make + the formatting engine know that it should not escape your + value. This is done by implementing the + escape_values(bfo) function in your element, that + will return (int) 0 when escape should not be done (or 1 when + escaping should be done): + + +

def escape_values(bfo):
+    """
+    Called by BibFormat in order to check if output of this element
+    should be escaped.
+    """
+    return 0
+
+ + Note that the function is given a bfo object as + parameter, such that you can do additional testing if your element + should really return 1 or 0 (for very special cases).
Also note + that the behavior defined by the escape_values() function + will be overridden by the escape parameter used in the + format template if it is specified. +

+ +

Finally, be cautious when you disable escaping: you will have + to take care of escaping values "manually" in your format element + code, in order to avoid invalid output or XSS + vulnerabilities. This can be done easily when using the + field, fields and + controlfield functions of bfo with the escape + parameter: + +

+    title = bfo.field('245__a', escape="1")
+    abstract = bfo.field('520__a', escape="2")
+
+ + The escape parameter can be one of the following values: + + These modes are the same for escape_values(bfo) function. +

+ You can also decide not to use the escape parameter and escape values + using any other Python function/library you want to use (such as cgi.escape()). +
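As a sketch of manual escaping, the hypothetical element below returns an HTML link and therefore disables automatic escaping, then escapes each value itself. It uses html.escape from the Python 3 standard library, since cgi.escape() was removed in Python 3.8; on the Python 2 installations this guide was written for, cgi.escape(value, True) plays the same role. The fields 8564_u and 8564_y and the StubBfo class are assumptions made for the example.

```python
from html import escape


def format(bfo):
    """
    Prints a link to the URL attached to the record.
    """
    url = bfo.field('8564_u')
    if not url:
        return ""
    label = bfo.field('8564_y') or url
    # Escape every value by hand, since escape_values() returns 0 below
    return '<a href="%s">%s</a>' % (escape(url, quote=True), escape(label))


def escape_values(bfo):
    """
    Tells BibFormat not to escape the output: it contains HTML markup.
    """
    return 0


class StubBfo:
    """Hypothetical stand-in for BibFormatObject, for illustration only."""

    def __init__(self, values):
        self.values = values  # maps a field notation to a single value

    def field(self, tag):
        return self.values.get(tag, "")
```

Only the markup written by the element itself is left unescaped; everything that comes from the record goes through escape().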

+ + +

4.7 Edit the Attributes of a Format Element

+

A format element has mainly four kinds of attributes:

+

+

4.8 Check the Dependencies of a Format Element

+

There are two ways to check the dependencies of a format element. The simplest way is to go to the format elements documentation and click on "Dependencies of this element" for the element you want to check.

+

The second method to check the dependencies of an element is through regular unix tools: for example $ grep -r -i 'bfe_your_element_name' . inside the format templates directory will tell you which templates call your element.

+

4.9 Check the Validity of a Format Element

+

There are two ways to check the validity of an element. The simplest one is to go to the format elements documentation and click on "Correctness of this element" for the element you want to check.

+

The second method to check the validity of an element is through regular Python methods: you can for example import the element in the interactive interpreter and feed it with test parameters. Notice that you will need to build a BibFormatObject instance to pass as bfo parameter to the format function of your element.
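When no installation is at hand, a quick check can also be run against a hand-written stand-in for the bfo parameter. The sketch below is such an approximation: FakeBfo is hypothetical and implements only field and fields, and the format function is a shortened copy of the title element shown earlier, not the shipped one.

```python
class FakeBfo:
    """Hypothetical minimal stand-in for BibFormatObject."""

    def __init__(self, record, lang="en"):
        self.record = record  # maps a field notation to a list of values
        self.lang = lang

    def field(self, tag):
        values = self.record.get(tag, [])
        return values[0] if values else ""

    def fields(self, tag):
        return list(self.record.get(tag, []))


def format(bfo, separator=" "):
    """Shortened copy of the title element, for demonstration only."""
    titles = [bfo.field('245__a') + bfo.field('245__b')]
    alternative = bfo.field('246__a')
    if alternative:
        titles.append(alternative)
    return separator.join(titles)


# Feed the element with test data and check the result
bfo = FakeBfo({'245__a': ['Higgs searches'], '246__a': ['Recherches du Higgs']})
assert format(bfo, separator="; ") == 'Higgs searches; Recherches du Higgs'
```

Such a stand-in only checks the logic of the element; it says nothing about how the real BibFormatObject behaves on your records.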

+

4.10 Browse the Format Elements Documentation

+

Go to the format elements documentation. There is a summary of all available format elements at the top of the page. You can click on an element to go to its detailed description in the second part of the page.

+

Each detailed documentation shows you: +

+ +

5. Knowledge Bases

+

Knowledge bases are a way to define easily extendable repositories of mappings. They have various uses, but their main purpose is to get, given a value, the normalized version of this value. For example you may use a knowledge base to hold a list of all ways to abbreviate the name of a journal, and map these abbreviations to the full journal name. This would be useful to get a normalized journal name across all of your records.

+

The knowledge base itself offers no method to do this normalization. It is limited to the archiving of this knowledge. To benefit from the normalization you need to use a format element which is knowledge-base-aware. The element will look by itself into the knowledge base to format a record. In that way you can extend the formatting capabilities of this element without having to modify it.
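The idea can be sketched with a plain dictionary standing in for a knowledge base. This is only a conceptual illustration: real knowledge bases are managed through the administration pages, and the names JOURNAL_KB and normalize are hypothetical.

```python
# Hypothetical mappings, in the spirit of "Map From" and "Map To"
JOURNAL_KB = {
    "Phys. Rev. Lett.": "Physical Review Letters",
    "PRL": "Physical Review Letters",
    "Nucl. Phys. B": "Nuclear Physics B",
}


def normalize(value, kb):
    """Return the mapped value, or the value unchanged when it has no mapping."""
    return kb.get(value.strip(), value)
```

Adding a new abbreviation then only requires a new mapping; the code that calls normalize stays untouched, which is the benefit described above.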

+

5.1 Add a Knowledge Base

+

To add a knowledge base go to the Manage Knowledge Bases administration page. + At the bottom of the page click on the "Add New Knowledge Base" button. The knowledge base is created and you are asked to fill in its attributes. See Edit the Attributes of a Knowledge Base to learn more about the attributes of knowledge bases.

+

5.2 Remove a Knowledge Base

+

To remove a knowledge base go to the Manage Knowledge Bases administration page. Click on the "Delete" button facing the knowledge base you want to remove and confirm. The knowledge base and all the mappings it includes are removed.

+

5.3 Add a Mapping

+

Go to the Manage Knowledge Bases administration page and click on the knowledge base for which you want to add a mapping. Fill in the form of the "Add New Mapping" section on the left of the page with the new mapping, and click on "Add New Mapping". The mapping has been created. Alternatively you can create the mapping without its attributes, and fill them afterward (See Edit a Mapping).

+

5.4 Remove a Mapping

+

Go to the Manage Knowledge Bases administration page and click on the knowledge base for which you want to remove a mapping. Click on the "Delete" button facing the mapping you want to delete. +

5.5 Edit a Mapping

+

Go to the Manage Knowledge Bases administration page and click on the knowledge base for which you want to edit a mapping. Locate the mapping in the list. You can click on the column headers to order the list by Map From or by Map To to help you find it. Once you have edited the mapping click on the corresponding "Save" button. +

5.6 Edit the Attributes of a Knowledge Base

+ Go to the Manage Knowledge Bases administration page and click on the knowledge base you want to edit. In the top menu, click on "Knowledge Base Attributes". You can then give your knowledge base a name and a description. Finally click on the "Update Base Attributes" button. +

5.7 Check the Dependencies of a Knowledge Base

+ To check the dependencies of a knowledge base + go to the Manage Knowledge Bases page, click on + the knowledge base you want to check, and then in the menu click on "Knowledge Base Dependencies".

+

The next page shows you the list of format elements that use this knowledge base.

+

Note that some format elements might be omitted.

+ +

6. Run BibReformat

+

While records can be formatted on-the-fly using BibFormat, it is usually necessary to preformat the records +in order to decrease the load of your server. To do so, use the bibreformat command line tool.

+

6.1 Run BibReformat

+

The following options are available for running bibreformat:

+

+
 Usage: bibreformat [options]
+ -u, --user=USER         User name to submit the task as, password needed.
+ -h, --help              Print this help.
+ -V, --version           Print version information.
+ -v, --verbose=LEVEL     Verbose level (0=min,1=normal,9=max).
+ -s, --sleeptime=SLEEP   Time after which to repeat tasks (no)
+ -t, --time=DATE         Moment for the task to be active (now).
+ -a, --all               All records
+ -c, --collection        Select records by collection
+ -f, --field             Select records by field.
+ -p, --pattern           Select records by pattern.
+ -o, --format            Specify output format to be (re-)created. (default HB)
+ -n, --noprocess         Count records to be processed only (no processing done)
+ Example: bibreformat -n Show how many records are to be bibreformated.
+
+
+For example, to reformat all records in HB (=HTML brief) format, you'd launch: +
+
$ bibreformat -a -oHB
+
+
+and you can watch the progress of the process via bibsched. +

Note that BibReformat understands -p, -f, +and -c arguments that enable you to easily reformat only +the records you need. For example, to reformat the Pictures +collection, launch: + +

+

+

$ bibreformat -cPictures -oHB
+
+
+or to reformat HD (=HTML detailed) format for records #10 to #20, you +launch: +
+
$ bibreformat -p"recid:10->20" -oHD
+
+
+

Last but not least, if you launch bibreformat without arguments: +

+
$ bibreformat
+
+
+

+it will process all the records that have been modified since the last +run of BibReformat, as well as all newly inputted records. This is +suitable for running BibReformat in a periodical daemon mode via +BibSched. See our HOWTO Run +Your CDS Invenio Installation guide for more information.

+ +

7. Appendix

+

7.1 MARC Notation in Formats

+

The notation for accessing fields of a record is quite flexible. You can use a syntax that strictly follows MARC 21, but also + a shortcut syntax, or a syntax that can have a special meaning.

+

The MARC syntax is the following one: + tag[indicator1][indicator2] [$ subfield] where tag is 3 digits, indicator1 and indicator2 are 1 character each, and subfield is 1 letter. +
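The strict syntax can be checked with a regular expression, as in the sketch below. It covers only the tag[indicator1][indicator2] [$ subfield] form given above, not the shortcut notations, and parse_marc_notation is a hypothetical helper rather than a BibFormat function.

```python
import re

MARC_NOTATION = re.compile(
    r"^(?P<tag>\d{3})"                         # tag: 3 digits
    r"(?P<ind1>[^\s$])?"                       # optional indicator1: 1 character
    r"(?P<ind2>[^\s$])?"                       # optional indicator2: 1 character
    r"\s*(?:\$\s*(?P<subfield>[A-Za-z]))?$"    # optional "$" and 1 letter
)


def parse_marc_notation(notation):
    """Split a strict MARC notation into its parts (None when a part is absent)."""
    match = MARC_NOTATION.match(notation.strip())
    if match is None:
        raise ValueError("not a valid MARC notation: %r" % notation)
    return match.groupdict()
```

For instance, parse_marc_notation("520 $a") yields the tag "520" and the subfield "a", with both indicators left as None.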

+

For example to get access to an abstract you can use the MARC notation 520 $a. You can use this syntax in BibFormat. However you can also: +

+

7.2 Migrating from Previous BibFormat

+

The new Python BibFormat formats are not backward compatible with the previous formats. New concepts and capabilities have been introduced and some have been dropped. If you have not modified the "Formats" or have only slightly modified +the "Behaviours" (or modified "Knowledge Bases"), then the transition will be painless and +automatic. Otherwise you will have to manually rewrite some of the +formats. This should however not be a big problem. Firstly because the +CDS Invenio installation will provide both versions of BibFormat for +some time. Secondly because both BibFormat versions can run side by +side, so that you can migrate your formats while your server still +works with the old formats. Thirdly because we provide a migration +kit that can help you go through this process. Finally because the +migration is not so difficult, and because it will be much easier for +you to customize how BibFormat formats your bibliographic data.

+

The first thing you should do is to read the Five Minutes Introduction to BibFormat to understand how the new BibFormat works. We also assume that you are familiar with the concepts of the old BibFormat. As the new formats separate the presentation from the business logic (i.e. the bindings to the database), it is not possible to automatically handle the translation. This is why you should at least be able to read and understand the formats that you want to migrate.

+ +

Differences between old and new BibFormat

+

+The most noticeable differences are:
+
+ a) "Behaviours" have been renamed "Output formats".
+ b) "Formats" have been renamed "Format templates". They are now + written in HTML.
+ c) "User defined functions" have been dropped.
+ d) "Extraction rules" have been dropped.
+ e) "Link rules" have been dropped.
+ f) "File formats" have been dropped.
+ g) "Format elements" have been introduced. They are written in Python, + and can simulate c), d) and e).
+ h) Formats can be managed through web interface or through + human-readable config files.
+ i) Introduction of tools like validator and dependencies checker.
+ j) Better support for multi-language formatting.
+

+

+Some of the advantages are:
+
+ + Management of formats is much clearer and easier (fewer concepts, + more tools).
+ + Writing formats is easier to learn: fewer concepts + to learn, redesigned work-flow, use of existing well known and + well documented languages.
+ + Editing formats is easier: You can use your preferred HTML editor such as + Emacs, Dreamweaver or Frontpage to modify templates, or any text + editor for output formats and format elements. You can also use the + simplified web administration interface.
+ + Faster and more powerful templating system.
+ + Separation of business logic (output formats, format elements) + and presentation layer (format templates). This makes the management + of formats simpler.
+

+

+The disadvantages are:
+
+ - No backward compatibility with old formats.
+ - Stricter separation of business logic and presentation layer:
+ no more use of statements such as if(), forall() inside templates, + and this requires more work to put logic inside format elements.
+

+

Migrating behaviours to output formats

+

Behaviours were previously stored in the database and required the use of the evaluation language to +provide the logic that chooses which format to use for a record. They also let you enrich records +with some custom data. Now their use has been simplified and restricted to equivalence tests on the value of a field +of the record to define the format template to use.

+

For example, the following behaviour:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
CONDITIONS
0$980.a="PICTURE"
Action (0)"<record> +
 <controlfield tag=\"001\">" $001 "</controlfield> +
 <datafield tag=\"FMT\" ind1=\"\" ind2=\"\">  + +
 <subfield code=\"f\">hb</subfield>  +
 <subfield code=\"g\">"  +
xml_text(format("PICTURE_HTML_BRIEF")) +
" </subfield>  +
 </datafield> +
</record>"
  + +
100""=""
Action (0)"<record> +
 <controlfield tag=\"001\">" $001 "</controlfield> + +
 <datafield tag=\"FMT\" ind1=\"\" ind2=\"\">  +
 <subfield code=\"f\">hb</subfield>  +
 <subfield code=\"g\">"  +
xml_text(format("DEFAULT_HTML_BRIEF")) +
" </subfield>  + +
 </datafield> +
</record>"
+   +
+

translates to the following output format (in textual configuration file):

+

+ +tag 980__a:
+PICTURE --- Picture_HTML_brief.bft
+default: Default_HTML_brief.bft
+

+

or visual representation through web interface:
+Image representation of HB output format +

+

We suggest that you use the migration kit to produce initial output formats from +your behaviours, but that you go through the created .bfo files in the /etc/bibformat/output_formats/ directory of your CDS Invenio installation to check that they correspond to your behaviours.
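The decision made by the output format shown above can be sketched as a list of equivalence tests followed by a default. This is an illustration of the rule logic only, not BibFormat's implementation; choose_template, HB_RULES and the plain-dictionary record are hypothetical, and details such as case sensitivity are not covered.

```python
def choose_template(rules, default, field_values):
    """
    rules: list of (tag, value, template) tuples, tried in order
    default: template used when no rule matches
    field_values: maps a tag such as '980__a' to the value found in the record
    """
    for tag, value, template in rules:
        if field_values.get(tag) == value:
            return template
    return default


# The HB output format of the example: one rule on 980__a, then the default
HB_RULES = [("980__a", "PICTURE", "Picture_HTML_brief.bft")]
HB_DEFAULT = "Default_HTML_brief.bft"
```

A record whose 980__a field is "PICTURE" gets Picture_HTML_brief.bft, and any other record falls through to Default_HTML_brief.bft.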

+

Migrating formats to format templates and format elements

+

The migration of formats is the most difficult part of the migration. You will need to separate the presentation code (HTML) from the business code (iterations, tests and calls to the database). Here are some tips on how you can do this:

+ +

We recommend that you do not use the migration kit for this part: it can help you create the initial files, but will never be able to provide a working implementation of the formats.

+

Migrating Knowledge Bases

+

We recommend you use the migration kit to migrate your knowledge bases. It should have no problem migrating this part of your configuration.

+

Migrating UDFs and Link rules

+

User Defined Functions and Link rules have been dropped in the new BibFormat. These concepts have no reason to exist any more, as they can be fully implemented in the format elements. For example the AUTHOR_SEARCH link rule can directly be implemented in the Authors.bfe element.

+

As for the UDFs, most of them are directly built-in functions of Python. Whenever a special function has to be implemented, it can be defined in a regular Python file and used in any element.

+

The Migration Kit

+

The migration kit is available from the main BibFormat admin webpage or +directly here. The migration +kit has 3 steps, each migrating some part of your configuration. Just +click on the links to migrate each part and get the status of the migration.

+

You should note that each migration will create new files or entries in the database, such that +you will certainly want to click only once on each step (otherwise you will get duplicates).

+

The migration kit can:
+ a) Effortlessly migrate your behaviours, unless they include complex + logic, which usually they don't.
+ b) Help you migrate formats to format templates and format elements.
+ c) Effortlessly migrate your knowledge bases.
+

+

Point b) is the most difficult to achieve: previous formats did mix +business logic and code for the presentation, and could use PHP +functions. The new BibFormat separates business logic and +presentation, and does not support PHP. The migration kit will try to +move business logic to the format elements, and the presentation to +the format templates. These files will be created for you and include +the original code and, if possible, a proposal of Python +translation. We recommend that you do not use the migration kit to +translate formats, especially if you have not modified default +formats, or only modified default formats in some limited places. You +will get cleaner code if you write format elements and format +templates yourself. +

+ +

Run old and new formats side by side

+

You might want to migrate your formats over a long period of time, making new formats available to your +users once they have been migrated, while old formats are still being used if they have not been translated. +BibFormat will do this almost automatically. This section tells you what you should be aware of if you want this to work seamlessly.

+

When BibFormat has to format a record with a given output format code, it first tries to find a corresponding +output format in the (new) output formats directory. If the output format cannot be found, it hands the formatting process over +to the old BibFormat, which will look for a behaviour with a name corresponding to code. This leads to the first rule you should follow:

+

For each of the Behaviours you want to migrate, you should have an Output Format with a code corresponding to the name of the Behaviour.

+

The second (and last) rule is as simple as the first one. Imagine you have a Behaviour "HD" that you want to migrate to Output Format "HD". Let's say that "HD" links to 'picture_HTML_detailed' format if field 980__a is equal to "Picture", and links to 'default_HTML_detailed' in all other cases, but that 'picture_HTML_detailed' has not been migrated to a new format template. Then the second rule says:

+

Output Formats should have the same conditions on tags as Behaviours, even if the format for that condition has not been migrated.

+

In our example, if you open the "HD" output format in the web interface, you can add a rule that works on condition "If 980__a is PICTURE" and set the template to be used to "defined in old BibFormat" in the template menu. This looks strange, but it is the only way to tell BibFormat that it should consider this condition and not go to the default rule and use the default template.

+For developers and adventurers only: +

If you are to write Output Formats without the web interface, you should use the name migration_in_progress for each template which has not been migrated. The above example would therefore become:
+ tag 980__a :
+PICTURE --- migration_in_progress
+default: Default_HTML_detailed.bft
+

+

7.3 Integrating BibFormat into Dreamweaver MX

+

BibFormat templates have been designed to be editable in custom HTML editors. We propose in this section +a way to extend one particular editor, Dreamweaver.

+

Make Dreamweaver Recognize Format Elements in Layout View

+

To make Dreamweaver understand the format elements and display an icon for each of them in the layout editor, you must +edit a Dreamweaver configuration file named Tags.xml located inside /Configuration/ThirdPartyTags directory +of your Dreamweaver installation folder. At the end of this file, copy-paste the following lines: +

+  <!-- BibFormat (CDS Invenio) -->
+  <tagspec tag_name="BIBFORMAT" start_string="<BFE_" end_string="/>" parse_attributes="false" detect_in_attribute="true" icon="bibformat.gif" icon_width="25" icon_height="16"></tagspec >
+  <tagspec tag_name="BIBFORMAT" start_string="<bfe_" end_string="/>" parse_attributes="false" detect_in_attribute="true" icon="bibformat.gif" icon_width="25" icon_height="16"></tagspec >
+  
+ Also copy this icon bibformat.gif in the same directory as Tags.xml (right-click on icon, or ctrl-click on one-button mouse, and "Save Image As..."). Make sure the downloaded image is named "bibformat.gif". +

+

Note that Dreamweaver might not recognize Format Elements when complex formatting is involved due to these elements.

+

Add a Format Elements Floating Panel

+

You can add a floating panel that will allow you to insert Format Elements in your document and read the documentation + of all available Format Elements.

+

The first step is to declare in which menu of Dreamweaver this floating panel is going to be available. + To do so, edit file "Menu.xml" located inside /Configuration/Menus of your Dreamweaver + application directory and copy-paste the following line in the menu you want + (typically inside tag 'menu' with attribute id='DWMenu_Window_Others'):

+
+   <menuitem name="BibFormat Elements" enabled="true" command="dw.toggleFloater('BibFormat_floater.html')" checked="dw.getFloaterVisibility('BibFormat_floater.html')" />
+  
+

+

Once this is done, you can download the floating palette (if the file opens in your browser instead of downloading, right-click on the icon, or ctrl-click on a one-button mouse, and "Save Target As...") and move the downloaded file "BibFormat_floater.html" (do not rename it) into the /Configuration/Floaters directory of your Dreamweaver application folder.

+

To use the BibFormat floating panel, open Dreamweaver, and choose Window > Others > BibFormat Elements.

+

Whenever a new version of the palette is available, you can skip editing the file "Menu.xml" and just replace the old "BibFormat_floater" file with the new one.

+

7.4 FAQ

+ +

Why do we need output formats? Wouldn't format templates be sufficient?

+

As you potentially have a lot of records, it is not conceivable to specify for each of them which +format template they should use. This is why this rule-based decision layer has been introduced.

+ +

How can I protect a format?

+

As a web user, you cannot protect a format. If you are administrator of the +system and have access to the format files, you can simply use the permission rights of your system, as BibFormat +is aware of them.

+ +

Why can't I edit/delete a format?

+

The format file has certainly been protected by the administrator of the server. You must ask the +administrator to unprotect the file if you want to edit it.

+ +

How can I add a format element from the web interface?

+

Format elements cannot be added, removed or edited through the web interface. This limitation +has been introduced to limit the security risks caused by the upload of Python files on the server. The only possibility to add a basic format element from the web interface is to add an entry in the "Logical Fields" management interface of the BibIndex module (see Add a Format Element).

+ +

Why are some Marc codes omitted in the "Check Dependencies" pages?

+

When you check the dependencies of a format, the page reminds you that +some use of Marc codes might not be indicated. This is because it is not +possible (or at least not trivial) to guess that the call to field(str(5+4)+"80"+"__a") +is equal to a call to field("980__a"). You should then not completely rely on this indication.
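The limitation can be reproduced with a naive scanner that only recognizes calls whose argument is a plain string literal. The regular expression below is an illustration written for this FAQ entry, not the code used by the "Check Dependencies" page.

```python
import re

# Matches field("...") or fields('...') only when the argument is one literal
FIELD_CALL = re.compile(r"""\bfields?\(\s*(['"])([^'"]+)\1\s*[,)]""")


def literal_field_tags(source):
    """Return the sorted tags passed as string literals to field()/fields()."""
    return sorted({match.group(2) for match in FIELD_CALL.finditer(source)})


source = '''
title = bfo.field("245__a")
collection = bfo.field(str(5+4)+"80"+"__a")
'''

# Only the literal call is detected; the computed "980__a" goes unnoticed
assert literal_field_tags(source) == ["245__a"]
assert str(5+4)+"80"+"__a" == "980__a"
```

Detecting the second call would require evaluating the expression, which is why the page warns that its list may be incomplete.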

+ +

How are deleted records displayed?

+

By default, CDS Invenio displays a standard "The record has been deleted." message for all +output formats with a 'text/html' content type. Your output format, format templates and format elements +are bypassed by the engine. +However, for more advanced output formats, CDS Invenio +goes through the regular formatting process and lets your formats do the job. This allows you to customize how a record should be displayed once it has been deleted.

+ +

Why are some format elements omitted in the "Knowledge Base Dependencies" page?

+

When you check the dependencies of a knowledge base, the page +reminds you that format elements using this knowledge base might not +be indicated. This is because it is not possible (or at least not +trivial) to guess that the call to +kb(e.upper()+"journal"+"s") in a format element is equal +to a call to kb("Ejournals"). You should then not +completely rely on this indication.

+ +

Why are some format elements defined in field table omitted in the format element documentation?

+ +

Some format elements defined in the "Logical Fields" management +interface of the BibIndex module (the basic format elements) are not +shown in the format elements documentation pages. We do not show such +an element if its name starts with a number. This is to reduce the +number of elements shown in the documentation as the logical fields +table contains a lot of not so useful fields to be used in +templates.

+ +

How can I get access to repeatable subfields from inside a format element?

+ +

Given that repeatable subfields are not frequent, the +bfo.fields(..) function has been implemented to return the most convenient structure for most cases, that is a 'list of strings' (Case 1 below) or 'list of dict of strings' (Case 2 below). For example, with the following metadata: +

    999C5 $a value_1a $b value_1b
+    999C5 $b value_2b
+    999C5 $b value_3b $b value_3b_bis
+
+    >> bfo.fields('999C5b')                                   (1)
+    >> ['value_1b', 'value_2b', 'value_3b', 'value_3b_bis']
+    >> bfo.fields('999C5')                                    (2)
+    >> [{'a':'value_1a', 'b':'value_1b'},
+        {'b':'value_2b'},
+        {'b':'value_3b'}]
+
+ +In this example value_3b_bis is not shown for +bfo.fields('999C5') (Case 2). If it were to be taken into account, the +returned structure would have to be a 'list of dict of list of strings', which would make +access to the data more complex in most cases.
+In order to consider the repeatable subfields, use the additional repeatable_subfields_p parameter: + +
    >> bfo.fields('999C5b', repeatable_subfields_p=True)      (1 bis)
+    >> ['value_1b', 'value_2b', 'value_3b', 'value_3b_bis']
+    >> bfo.fields('999C5', repeatable_subfields_p=True)       (2 bis)
+    >> [{'a':['value_1a'], 'b':['value_1b']},
+        {'b':['value_2b']},
+        {'b':['value_3b', 'value_3b_bis']}]
+
+Another solution would be to access the BibRecord structure with +bfo.get_record() and use the lower-level BibRecord module with this structure. +
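
The three return structures discussed in this answer can be modelled with a short, self-contained sketch. It is a toy model under stated assumptions, not the real implementation of bfo.fields: the data layout (FIELD_999C5) and the helper names subfield_values and instances_as_dicts are invented for illustration.

```python
# Each field instance is a list of (subfield code, value) pairs, in record
# order. This layout is an assumption of the sketch.
FIELD_999C5 = [
    [("a", "value_1a"), ("b", "value_1b")],
    [("b", "value_2b")],
    [("b", "value_3b"), ("b", "value_3b_bis")],
]


def subfield_values(instances, code):
    """Case 1: flat list of every value of one subfield."""
    return [value
            for instance in instances
            for (subcode, value) in instance
            if subcode == code]


def instances_as_dicts(instances, repeatable_subfields_p=False):
    """Case 2 and Case 2 bis: one dict per field instance."""
    result = []
    for instance in instances:
        as_dict = {}
        for subcode, value in instance:
            if repeatable_subfields_p:
                # Keep every value: dict of lists.
                as_dict.setdefault(subcode, []).append(value)
            else:
                # Keep a single value per subfield. Keeping the first one
                # is an arbitrary choice made by this sketch.
                as_dict.setdefault(subcode, value)
        result.append(as_dict)
    return result


print(subfield_values(FIELD_999C5, "b"))
print(instances_as_dicts(FIELD_999C5))
print(instances_as_dicts(FIELD_999C5, repeatable_subfields_p=True))
```

The sketch makes the trade-off visible: the default form is easier to consume (plain strings) but silently drops repeated subfields, while the repeatable form keeps everything at the cost of one more level of nesting.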

+
diff --git a/modules/bibformat/doc/admin/bibformat-admin.webdoc b/modules/bibformat/doc/admin/bibformat-admin.webdoc new file mode 100644 index 000000000..f8ab40cf9 --- /dev/null +++ b/modules/bibformat/doc/admin/bibformat-admin.webdoc @@ -0,0 +1,46 @@ +## $Id$ + +## This file is part of CDS Invenio. +## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. +## +## CDS Invenio is free software; you can redistribute it and/or +## modify it under the terms of the GNU General Public License as +## published by the Free Software Foundation; either version 2 of the +## License, or (at your option) any later version. +## +## CDS Invenio is distributed in the hope that it will be useful, but +## WITHOUT ANY WARRANTY; without even the implied warranty of +## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +## General Public License for more details. +## +## You should have received a copy of the GNU General Public License +## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., +## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. + + + + + + +
+
Manage Format Templates
+
Define how to format a record.
+
+
+
Manage Output Formats
+
Define which template is applied to which record for a given output.
+
+
+
Manage Knowledge Bases
+
Define mappings of values, for standardizing records or declaring often used values.
+
+
+
+
Format Elements Documentation
+
Documentation of the format elements to be used inside format templates.
+
+
+
BibFormat Admin Guide
+
Documentation about BibFormat administration
+
+ diff --git a/modules/bibformat/doc/admin/bibformat-guide-bfe.gif b/modules/bibformat/doc/admin/bibformat-guide-bfe.gif new file mode 100644 index 000000000..6916362e3 Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-bfe.gif differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-output_format_attributes_tutorial.png b/modules/bibformat/doc/admin/bibformat-guide-output_format_attributes_tutorial.png new file mode 100644 index 000000000..03d214116 Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-output_format_attributes_tutorial.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-output_format_check_dependencies.png b/modules/bibformat/doc/admin/bibformat-guide-output_format_check_dependencies.png new file mode 100644 index 000000000..210747eec Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-output_format_check_dependencies.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-output_format_edit_rule.png b/modules/bibformat/doc/admin/bibformat-guide-output_format_edit_rule.png new file mode 100644 index 000000000..23c5626e2 Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-output_format_edit_rule.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-output_format_edit_rule2.png b/modules/bibformat/doc/admin/bibformat-guide-output_format_edit_rule2.png new file mode 100644 index 000000000..266e48f01 Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-output_format_edit_rule2.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-output_format_hb_migrate.png b/modules/bibformat/doc/admin/bibformat-guide-output_format_hb_migrate.png new file mode 100644 index 000000000..b055cf93b Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-output_format_hb_migrate.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-output_format_hd_rules.png 
b/modules/bibformat/doc/admin/bibformat-guide-output_format_hd_rules.png new file mode 100644 index 000000000..f67c9de95 Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-output_format_hd_rules.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-output_formats_manage_tutorial.png b/modules/bibformat/doc/admin/bibformat-guide-output_formats_manage_tutorial.png new file mode 100644 index 000000000..edee79068 Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-output_formats_manage_tutorial.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-rules_editor_tutorial.png b/modules/bibformat/doc/admin/bibformat-guide-rules_editor_tutorial.png new file mode 100644 index 000000000..08f187df2 Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-rules_editor_tutorial.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-template_attributes_tutorial.png b/modules/bibformat/doc/admin/bibformat-guide-template_attributes_tutorial.png new file mode 100644 index 000000000..32a156ca2 Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-template_attributes_tutorial.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-template_editor.png b/modules/bibformat/doc/admin/bibformat-guide-template_editor.png new file mode 100644 index 000000000..c34dc24aa Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-template_editor.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-template_editor_tutorial.png b/modules/bibformat/doc/admin/bibformat-guide-template_editor_tutorial.png new file mode 100644 index 000000000..cf399ec08 Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-template_editor_tutorial.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-template_preview.png b/modules/bibformat/doc/admin/bibformat-guide-template_preview.png new file mode 100644 index 
000000000..a613e37a3 Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-template_preview.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-templates_manage_tutorial.png b/modules/bibformat/doc/admin/bibformat-guide-templates_manage_tutorial.png new file mode 100644 index 000000000..0dca1484f Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-templates_manage_tutorial.png differ diff --git a/modules/bibformat/doc/admin/bibformat-guide-url_bar.png b/modules/bibformat/doc/admin/bibformat-guide-url_bar.png new file mode 100644 index 000000000..5ee2211cf Binary files /dev/null and b/modules/bibformat/doc/admin/bibformat-guide-url_bar.png differ diff --git a/modules/bibformat/doc/hacking/Makefile.am b/modules/bibformat/doc/hacking/Makefile.am index 218c59465..a1b143170 100644 --- a/modules/bibformat/doc/hacking/Makefile.am +++ b/modules/bibformat/doc/hacking/Makefile.am @@ -1,31 +1,40 @@ ## $Id$ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -## General Public License for more details. +## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
docdir = $(localstatedir)/www/hacking/bibformat doc_DATA = index.html api.html +webdoclibdir = $(libdir)/webdoc/hacking + +webdoclib_DATA = \ + bibformat-internals.webdoc \ + bibformat-api.webdoc + FILESWML = $(wildcard $(srcdir)/*.wml) -EXTRA_DIST = $(FILESWML:$(srcdir)/%=%) +EXTRA_DIST = $(webdoclibdir_DATA) -CLEANFILES = $(doc_DATA) *~ *.tmp +CLEANFILES = *~ *.tmp %.html: %.html.wml $(top_srcdir)/config/config.wml $(top_builddir)/config/configbis.wml - $(WML) -o\(ALL-LANG_*\)+LANG_EN:$@ $< - $(PYTHON) $(top_srcdir)/po/i18n_update_wml_target.py en $@ \ No newline at end of file + for wml_file in $(FILESWML); do \ + $(PYTHON) $(top_srcdir)/modules/bibformat/lib/wml2html.py \ + -c $(top_srcdir)/config/config.wml -c $(top_builddir)/config/configbis.wml \ + -l en -i $${wml_file} ; \ + done \ No newline at end of file diff --git a/modules/bibformat/doc/hacking/bibformat-api.webdoc b/modules/bibformat/doc/hacking/bibformat-api.webdoc new file mode 100644 index 000000000..6b8d467f1 --- /dev/null +++ b/modules/bibformat/doc/hacking/bibformat-api.webdoc @@ -0,0 +1,867 @@ +## $Id$ + +## This file is part of CDS Invenio. +## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. +## +## CDS Invenio is free software; you can redistribute it and/or +## modify it under the terms of the GNU General Public License as +## published by the Free Software Foundation; either version 2 of the +## License, or (at your option) any later version. +## +## CDS Invenio is distributed in the hope that it will be useful, but +## WITHOUT ANY WARRANTY; without even the implied warranty of +## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +## General Public License for more details. +## +## You should have received a copy of the GNU General Public License +## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., +## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. + + + + + + + + +
+****************************************************************************
+** IMPORTANT NOTE: Note that this documentation is an updated version of  **
+** an earlier technical draft of BibFormat specifications. Please first   **
+** refer to the BibFormat admin guide.                                    **
+****************************************************************************
+
+Technical Overview of the new BibFormat
+=======================================
+
+Contents:
+1. Python API
+2. The philosophy behind BibFormat
+3. Differences between the old PHP version and the new Pythonic version
+4. Migrating from the previous PHP BibFormat version to the new Pythonic version
+5. Specifications of the new BibFormat configuration files.
+
+
+1. Python API
+
+The API of bibformat.py consists of these functions:
+
+ def format_record(recID, of, ln=cdslang, verbose=0,
+                   search_pattern=None, xml_record=None, uid=None,
+                   on_the_fly=False):
+     """
+     Formats a record given its ID (or its XML representation)
+     and an output format.
+
+     Returns a formatted version of the record in the specified
+     language, with pattern context, and specified output format.
+     The function will define by itself which format template must be
+     applied.
+
+     Parameters that allow contextual formatting (like 'search_pattern'
+     and 'uid') are useful only when doing on-the-fly formatting,
+     or when caching with care (e.g. caching all formatted
+     versions of a record for each possible 'ln').
+
+     The arguments are as follows:
+
+               recID -  the ID of the record to format. If ID does not exist
+                        the function returns empty string or an error
+                        string, depending on level of verbosity.
+                        If 'xml_record' parameter is specified, 'recID'
+                        is ignored
+
+                  of -  an output format code. If 'of' does not exist as code in
+                        output format, the function returns empty
+                        string or an error string, depending on level
+                        of verbosity.  'of' is case insensitive.
+
+                  ln -  the language to use to format the record. If
+                        'ln' is an unknown language, or translation
+                        does not exist, default cdslang language
+                        will be applied whenever possible.
+                        Allows contextual formatting.
+
+             verbose -  the level of verbosity in case of errors/warnings
+                        0 - Silent mode
+                        5 - Prints only errors
+                        9 - Prints errors and warnings
+
+      search_pattern -  the pattern used as search query when asked to
+                        format this record (User request in web
+                        interface). Allows contextual formatting.
+
+          xml_record -  an XML string representation of the record to
+                        format.  If it is specified, recID parameter is
+                        ignored. The XML must be parsable by BibRecord.
+
+                 uid -  User ID of the user who will view the formatted
+                        record.  Useful to grant access to special
+                        functions on a page depending on the user's
+                        privileges.  Allows contextual formatting.
+                        Typically 'uid' is retrieved with webuser.getUid(req).
+
+          on_the_fly -  if False, try to return an already preformatted version
+                        of the record in the database.
+
+     """
+
+
+ Example:
+   >> from invenio.bibformat import format_record
+   >> format_record(5, "hb", "fr")
+
+
+ def format_records(recIDs, of, ln=cdslang, verbose=0, search_pattern=None,
+                    xml_records=None, uid=None, record_prefix=None,
+                    record_separator=None, record_suffix=None,
+                    prologue="", epilogue="", req=None, on_the_fly=False):
+     """
+     Returns a list of formatted records given by a list of record IDs or a
+     list of records as xml.
+     Adds a prefix before each record, a suffix after each record,
+     plus a separator between records.
+
+     Also add optional prologue and epilogue to the complete formatted list.
+
+     You can either specify a list of record IDs to format, or a list of
+     xml records, but not both (if both are specified recIDs is ignored).
+
+     'record_separator' is a function that returns a string as separator between
+     records. The function must take an integer as unique parameter,
+     which is the index in recIDs (or xml_records) of the record that has
+     just been formatted. For example separator(i) must return the separator
+     between recID[i] and recID[i+1]. Alternatively separator can be a single
+     string, which will be used to separate all formatted records.
+     The same applies to 'record_prefix' and 'record_suffix'.
+
+     'req' is an optional request object. If it is given, the formatted
+     records are written to it as they are produced (record after record).
+     Note that you must set the content type of 'req' yourself and send the
+     HTTP headers before calling this function, as it will not do so.
+
+     This function takes the same parameters as 'format_record' except for:
+
+               recIDs -  a list of record IDs to format
+
+          xml_records -  a list of xml string representations of the records to
+                         format. If this list is specified, 'recIDs' is ignored.
+
+        record_prefix - a string or a function that takes the index of the record
+                        in 'recIDs' or 'xml_records' for which the function must
+                        return a string.
+                        Printed before each formatted record.
+
+     record_separator - either a string or a function that returns string to
+                        separate formatted records. The function takes the index
+                        of the record in 'recIDs' or 'xml_records' that is being
+                        formatted.
+
+        record_suffix - a string or a function that takes the index of the record
+                        in 'recIDs' or 'xml_records' for which the function must
+                        return a string.
+                        Printed after each formatted record
+
+                  req - an optional request object on which formatted records
+                        can be printed (for "live" output )
+
+             prologue - a string printed before all formatted records string
+
+             epilogue - a string printed after all formatted records string
+
+           on_the_fly - if False, try to return an already preformatted version
+                        of the records in the database
+     """
+
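
The way 'record_prefix', 'record_separator' and 'record_suffix' accept either a string or a function of the record index can be sketched as follows. This is a simplified model, not the code of format_records: the helper names resolve and join_formatted are hypothetical, the input is a list of already formatted strings, and the sketch assumes that no separator is emitted after the last record.

```python
def resolve(part, index):
    """Return the text for 'part': a fixed string, or part(index) if callable."""
    if part is None:
        return ""
    if callable(part):
        return part(index)
    return part


def join_formatted(formatted, record_prefix=None, record_separator=None,
                   record_suffix=None, prologue="", epilogue=""):
    """Assemble already formatted records into one output string."""
    pieces = [prologue]
    last = len(formatted) - 1
    for i, text in enumerate(formatted):
        pieces.append(resolve(record_prefix, i))
        pieces.append(text)
        pieces.append(resolve(record_suffix, i))
        if i < last:
            # separator(i) sits between record i and record i+1
            pieces.append(resolve(record_separator, i))
    pieces.append(epilogue)
    return "".join(pieces)


# A callable separator receives the index of the record just formatted.
print(join_formatted(["A", "B", "C"],
                     record_separator=lambda i: "<%d>" % i,
                     prologue="[", epilogue="]"))   # prints [A<0>B<1>C]

# Plain strings are used as they are for every record.
print(join_formatted(["A", "B"], record_prefix="(", record_suffix=")",
                     record_separator=", "))        # prints (A), (B)
```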
+
+ def get_output_format_content_type(of):
+     """
+     Returns the content type (eg. 'text/html' or 'application/ms-excel') \
+     of the given output format.
+
+     The function takes this mandatory parameter:
+
+     of - the code of output format for which we want to get the content type
+     """
+
+
+ def record_get_xml(recID, format='xm', decompress=zlib.decompress):
+     """
+     Returns an XML string of the record given by recID.
+
+     The function builds the XML directly from the database,
+     without using the standard formatting process.
+
+     'format' defines the flavour of XML:
+        - 'xm' for standard XML
+        - 'marcxml' for MARC XML
+        - 'oai_dc' for OAI Dublin Core
+        - 'xd' for XML Dublin Core
+
+     If record does not exist, returns empty string.
+
+     The function takes the following parameters:
+
+          recID - the id of the record to retrieve
+
+         format - the XML flavor in which we want to get the record
+
+     decompress - a function used to decompress the record from the database
+    """
+
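
The purpose of the 'decompress' parameter can be illustrated with a small sketch. It assumes, as the default value zlib.decompress suggests, that the stored XML is zlib-compressed; the helper load_stored_xml and the sample record are hypothetical and do not touch any database. The code is written for Python 3.

```python
import zlib


def load_stored_xml(blob, decompress=zlib.decompress):
    """Turn a stored (compressed) value into an XML string.

    An empty or missing value gives an empty string, mirroring the
    documented behaviour for records that do not exist.
    """
    if not blob:
        return ""
    return decompress(blob).decode("utf-8")


xml = '<record><controlfield tag="001">12</controlfield></record>'
stored = zlib.compress(xml.encode("utf-8"))
print(load_stored_xml(stored))

# A pass-through function can be supplied when the value is not compressed.
print(load_stored_xml(b"<record/>", decompress=lambda data: data))
```

Passing the decompression function as a parameter keeps the retrieval code independent of how the value happens to be stored.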
+The API of the BibFormat Object ('bfo'), given as a parameter to
+the format function of format elements, consists of the following
+functions. This API is to be used only inside format elements.
+
+ def control_field(self, tag, escape='0'):
+    """
+    Returns the value of control field given by tag in record.
+
+    If the value does not exist, returns empty string
+    The returned value is always a string.
+
+    'escape' parameter allows to escape special characters
+    of the field. The value of escape can be:
+          0 - no escaping
+          1 - escape all HTML characters
+          2 - escape all HTML characters by default. If field starts with <!--HTML-->,
+              escape only unsafe characters, but leave basic HTML tags.
+              This is particularly useful if you want to store HTML text in your
+              metadata but still want to escape some tags to prevent
+              XSS vulnerabilities. Note that this method is slower than
+              basic escaping of mode 1.
+
+    The arguments are:
+
+         tag    -  the marc code of a field
+         escape -  1 if returned value should be escaped. Else 0.
+                   (see above for other modes)
+    """
+
+ def field(self, tag, escape='0'):
+    """
+    Returns the value of the field corresponding to tag in the
+    current record.
+
+    If the value does not exist, returns empty string
+    The returned value is always a string.
+
+    'escape' parameter allows to escape special characters
+    of the field. The value of escape can be:
+          0 - no escaping
+          1 - escape all HTML characters
+          2 - escape all HTML characters by default. If field starts with <!--HTML-->,
+              escape only unsafe characters, but leave basic HTML tags.
+              This is particularly useful if you want to store HTML text in your
+              metadata but still want to escape some tags to prevent
+              XSS vulnerabilities. Note that this method is slower than
+              basic escaping of mode 1.
+
+    The arguments are:
+
+         tag  -  the marc code of a field
+         escape -  1 if returned value should be escaped. Else 0.
+                   (see above for other modes)
+    """
+
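
The three 'escape' modes can be sketched as below. This is an illustration only and not a vetted sanitizer: the marker value <!--HTML-->, the whitelist ALLOWED_TAGS, the helper name escape_value, and the choice to drop the marker from the output are all assumptions of this sketch, and the real engine may behave differently. It uses html.escape from the Python 3 standard library.

```python
import html
import re

HTML_MARKER = "<!--HTML-->"  # assumed marker announcing HTML content
ALLOWED_TAGS = ("b", "i", "em", "strong", "p", "br")  # assumed whitelist

# Matches an escaped, attribute-free tag from the whitelist,
# e.g. "&lt;b&gt;", "&lt;/b&gt;" or "&lt;br/&gt;".
_ALLOWED_TAG = re.compile(r"&lt;(/?)(%s)\s*(/?)&gt;" % "|".join(ALLOWED_TAGS))


def escape_value(value, escape="0"):
    """Apply escaping mode 0, 1 or 2 to a field value."""
    mode = int(escape)
    if mode == 0:
        return value
    if mode == 2 and value.lstrip().startswith(HTML_MARKER):
        # Escape everything first, then restore only whitelisted tags.
        escaped = html.escape(value.replace(HTML_MARKER, "", 1), quote=True)
        return _ALLOWED_TAG.sub(r"<\1\2\3>", escaped)
    # Mode 1, and mode 2 without the marker: escape all HTML characters.
    return html.escape(value, quote=True)


print(escape_value("<b>x</b> & y", 0))
print(escape_value("<b>x</b>", 1))
print(escape_value("<!--HTML--><b>x</b><script>y</script>", 2))
```

Escaping first and then restoring a small set of tags keeps the default safe: anything that is not explicitly recognised stays escaped, which is also why mode 2 costs more than mode 1.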
+
+ def fields(self, tag, escape='0', repeatable_subfields_p=False):
+    """
+    Returns the list of values corresponding to "tag".
+
+    If tag has an undefined subcode (such as 999C5),
+    the function returns a list of dictionaries, whose keys
+    are the subcodes and the values are the values of tag.subcode.
+    If the tag has a subcode, simply returns list of values
+    corresponding to tag.
+    Eg. for given MARC:
+        999C5 $a value_1a $b value_1b
+        999C5 $b value_2b
+        999C5 $b value_3b $b value_3b_bis
+
+        >> bfo.fields('999C5b')
+        >> ['value_1b', 'value_2b', 'value_3b', 'value_3b_bis']
+        >> bfo.fields('999C5')
+        >> [{'a':'value_1a', 'b':'value_1b'},
+            {'b':'value_2b'},
+            {'b':'value_3b'}]
+
+    By default the function returns only one value for each
+    subfield (that is it considers that repeatable subfields are
+    not allowed). This is why in the above example 'value_3b_bis' is
+    not shown for bfo.fields('999C5').  (Note that it is not
+    defined which of value_3b or value_3b_bis is returned).  This
+    is to simplify the use of the function, as most of the time
+    subfields are not repeatable (in that way we get a string
+    instead of a list).  You can allow repeatable subfields by
+    setting 'repeatable_subfields_p' parameter to True. In
+    this mode, the above example would return:
+        >> bfo.fields('999C5b', repeatable_subfields_p=True)
+        >> ['value_1b', 'value_2b', 'value_3b', 'value_3b_bis']
+        >> bfo.fields('999C5', repeatable_subfields_p=True)
+        >> [{'a':['value_1a'], 'b':['value_1b']},
+            {'b':['value_2b']},
+            {'b':['value_3b', 'value_3b_bis']}]
+    NOTICE THAT THE RETURNED STRUCTURE IS DIFFERENT.  Also note
+    that whatever the value of 'repeatable_subfields_p' is,
+    bfo.fields('999C5b') always shows all values, even repeatable
+    ones. This is because the parameter has no impact on the
+    returned structure (it is always a list).
+
+    'escape' parameter allows to escape special characters
+    of the field. The value of escape can be:
+          0 - no escaping
+          1 - escape all HTML characters
+          2 - escape all HTML characters by default. If field starts with <!--HTML-->,
+              escape only unsafe characters, but leave basic HTML tags.
+              This is particularly useful if you want to store HTML text in your
+              metadata but still want to escape some tags to prevent
+              XSS vulnerabilities. Note that this method is slower than
+              basic escaping of mode 1.
+
+    The arguments are:
+
+          tag  -  the marc code of a field
+          escape -  1 if returned value should be escaped. Else 0.
+                   (see above for other modes)
+    """
+
+ def kb(self, kb, string, default=""):
+    """
+    Returns the value of the "string" in the knowledge base "kb".
+
+    If kb does not exist or string does not exist in kb,
+    returns 'default' string or empty string if not specified
+
+    The arguments are as follows:
+
+          kb  -  the knowledge base name in which we want to find the mapping.
+                 If it does not exist the function returns the 'default'
+                 value. The name is case insensitive (Uses
+                 the SQL 'LIKE' syntax to retrieve value).
+
+      string  -  the value for which we want to find a translation.
+                 If it does not exist the function returns 'default' string.
+                 The string is case insensitive (Uses the SQL 'LIKE' syntax
+                 to retrieve value).
+
+     default  -  a default value returned if 'string' not found in 'kb'.
+
+    """
+
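
The lookup semantics summarised at the top of this docstring (case-insensitive match, 'default' when the knowledge base or the key is missing) can be modelled with an in-memory dictionary. The table KNOWLEDGE_BASES, the knowledge base name "JOURNALS" and the helper kb_lookup are hypothetical; the real function queries the database instead.

```python
# In-memory stand-in for the knowledge base tables (an assumption of this
# sketch). Keys and names are compared without regard to case.
KNOWLEDGE_BASES = {
    "JOURNALS": {
        "Phys Rev D": "Phys Rev : D.",
        "Physical Review D": "Phys Rev : D.",
    },
}


def kb_lookup(kb, string, default=""):
    """Return the value mapped to 'string' in 'kb', or 'default'."""
    for kb_name, mappings in KNOWLEDGE_BASES.items():
        if kb_name.lower() != kb.lower():
            continue
        for key, value in mappings.items():
            if key.lower() == string.lower():
                return value
    return default


print(kb_lookup("journals", "physical review d"))
print(kb_lookup("JOURNALS", "not in kb", "My Value"))
print(repr(kb_lookup("missing kb", "Phys Rev D")))
```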
+ def get_record(self):
+    """
+    Returns the record encapsulated in bfo as a BibRecord structure.
+    You can get full access to the record through bibrecord.py functions.
+    """
+
+  Example (from inside BibFormat element):
+  >> bfo.field("520.a")
+  >> 'We present a quantitative appraisal of the physics potential
+      for neutrino experiments.'
+  >>
+  >> bfo.control_field("001")
+  >> '12'
+  >>
+  >> bfo.fields("700.a")
+  >> ['Alekhin, S I', 'Anselmino, M', 'Ball, R D', 'Boglione, M']
+  >>
+  >> bfo.kb("DBCOLLID2COLL", "ARTICLE")
+  >> 'Published Article'
+  >>
+  >> bfo.kb("DBCOLLID2COLL", "not in kb", "My Value")
+  >> 'My Value'
+
+Moreover you can have access to the language requested for the
+formatting, the search pattern used by the user in the web
+interface and the userID by directly getting the attribute from 'bfo':
+
+    bfo.lang
+    """
+    Returns the language that was asked to be used for the
+    formatting. Always returns a string.
+    """
+
+    bfo.search_pattern
+    """
+    Returns the search pattern specified by the user when
+    the record had to be formatted. Always returns a string.
+    """
+
+    bfo.uid
+    """
+    Returns the user ID of the user who shall view the formatted
+    record.
+    """
+
+    bfo.recID
+    """
+    Returns the id of the record
+    """
+
+    bfo.req
+    """
+    Returns the mod_python request object
+    """
+
+    bfo.format
+    """
+    Returns the format in which the record is being formatted
+    """
+
+  Example (from inside BibFormat element):
+  >> bfo.lang
+  >> 'en'
+  >>
+  >> bfo.search_pattern
+  >> 'mangano and neutrino and factory'
+
+
+2. The philosophy behind BibFormat
+
+BibFormat is in charge of formatting the bibliographic records that
+are displayed to your users. As you potentially have a huge amount of
+bibliographic records, you cannot specify manually for each of them
+how it should be formatted. This is why you can define rules that will
+allow BibFormat to understand which kind of formatting to apply to a given
+record. You define this set of rules in what is called an "output
+format".
+
+You can have different output formats, each with its own characteristics.
+For example you certainly want that when multiple bibliographic records are
+displayed at the same time (as it happens in search results), only
+short versions are shown to the user, while a detailed record is
+preferable when a single record is displayed. You might also want to
+let your users decide which kind of output they want. For example you
+might need to display HTML for regular web browsing, but would also
+give a BibTeX version of the bibliographic reference for direct
+inclusion in a LaTeX document.
+See section 5.1 to learn how to create or modify output formats.
+
+While output formats define what kind of formatting must be applied,
+they do not define HOW the formatting is done. This is the role of the
+"format templates", which define the layout and look of a
+bibliographic reference. These format templates are rather easy to
+write if you know a little bit of HTML (see section 5.2 "Format
+templates specifications"). You will certainly have to create
+different format templates, for different kinds of records. For
+example you might want records that contain pictures to display them,
+maybe with captions, while records that do not have pictures are limited
+to printing a title and an abstract.
+
+In summary, you have different output formats (like 'brief HTML',
+'detailed HTML' or 'BibTeX') that call different format templates
+according to some criteria.
+
+There is still one kind of configuration file that we have not talked
+about: the "format elements". These are the "bricks" that you use in
+format templates, to get the values of a record. You will learn to use
+them in your format template in section 5.2 "Format templates
+specifications", but you will mostly not need to modify them or create
+new ones. However if you do need to edit one, read section 5.3 "Format
+elements specifications" (And if you know Python it will be easy, as
+they are written in Python).
+
+Finally BibFormat can make use of mapping tables called "knowledge
+bases". Their primary use is to act like a translation table, to
+normalize records before displaying them. For example, you can say
+that records that have value "Phys Rev D" or "Physical Review D" for
+field "published in" must display "Phys Rev : D." to users. See
+section 5.4 to learn how to edit knowledge bases.
+
+In summary, there are three layers.  Output formats:
+
++-----------------------------------------------------+
+|                    Output Format                    | (Layer 1)
+|                    eg: HTML_Brief.bfo               |
++-----------------------------------------------------+
+
+call one of several `format templates':
+
++-------------------------+ +-------------------------+
+|     Format Template     | |     Format Template     | (Layer 2)
+|     eg: preprint.bft    | |     eg: default.bft     |
++-------------------------+ +-------------------------+
+
+that use one or several format elements:
+
++--------------+ +----------------+ +-----------------+
+|Format Element| |Format Element  | | Format Element  | (Layer 3)
+|eg: authors.py| |eg: abstract.py | | eg: title.py    |
++--------------+ +----------------+ +-----------------+
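
The three layers can be mimicked in a few lines of Python. This is a conceptual sketch only: real output formats and format templates are configuration files rather than Python objects, and the rule structure (tag, value, template), the value "PREPRINT" and the function names below are assumptions made for illustration.

```python
# Layer 1: an output format is modelled as a list of rules plus a default.
# Each rule reads: if 'tag' contains 'value', use 'template'.
RULES = [("980__a", "PREPRINT", "preprint.bft")]  # hypothetical rule


def choose_template(record, rules, default_template="default.bft"):
    for tag, expected, template in rules:
        if expected in record.get(tag, []):
            return template
    return default_template


# Layer 3: a format element extracts one piece of the record.
def title_element(record):
    values = record.get("245__a", [])
    return values[0] if values else ""


# Layer 2: a format template lays out the values returned by elements.
TEMPLATES = {
    "preprint.bft": lambda record: "<b>%s</b> (preprint)" % title_element(record),
    "default.bft": lambda record: "<b>%s</b>" % title_element(record),
}


def format_with_layers(record):
    template = choose_template(record, RULES)
    return TEMPLATES[template](record)


print(format_with_layers({"245__a": ["Neutrino factories"],
                          "980__a": ["PREPRINT"]}))
print(format_with_layers({"245__a": ["Some other record"]}))
```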
+
+
+3. Differences between the old PHP version and the new Pythonic version
+
+The most noticeable differences are:
+
+ a) "Behaviours" have been renamed "Output formats".
+ b) "Formats" have been renamed "Format templates". They are now
+     written in HTML.
+ c) "User defined functions" have been dropped.
+ d) "Extraction rules" have been dropped.
+ e) "Link rules" have been dropped.
+ f) "File formats" have been dropped.
+ g) "Format elements" have been introduced. They are written in Python,
+     and can simulate c), d) and e).
+ h)  Formats can be managed through web interface or through
+     human-readable config files.
+ i)  Introduction of tools like validator and dependencies checker.
+ j)  Better support for multi-language formatting.
+
+Some of the advantages are:
+
+ + Management of formats is much clearer and easier (fewer concepts,
+   more tools).
+ + Writing formats is easier to learn: fewer concepts
+   to learn, redesigned work-flow, use of existing well known and
+   well documented languages.
+ + Editing formats is easier: You can use your preferred HTML editor such as
+   Emacs, Dreamweaver or Frontpage to modify templates, or any text
+   editor for output formats and format elements. You can also use the
+   simplified web administration interface.
+ + Faster and more powerful templating system.
+ + Separation of business logic (output formats, format elements)
+   and presentation layer (format templates). This makes the management
+   of formats simpler.
+
+The disadvantages are:
+
+ - No backward compatibility with old formats.
+ - Stricter separation of business logic and presentation layer:
+   no more use of statements such as if(), forall() inside templates,
+   and this requires more work to put logic inside format elements.
+
+
+4. Migrating from the previous PHP BibFormat version to the new Pythonic version
+
+Old BibFormat formats are no longer compatible with the new BibFormat
+files. If you have not modified the "Formats", or have only slightly
+modified the "Behaviours", then the transition will be painless and
+automatic. Otherwise you will have to manually rewrite some of the
+formats. This should however not be a big problem. Firstly because the
+CDS Invenio installation will provide both versions of BibFormat for
+some time. Secondly because both BibFormat versions can run side by
+side, so that you can migrate your formats while your server still
+works with the old formats.  Thirdly because we provide a migration
+kit that can help you go through this process. Finally because the
+migration is not so difficult, and because it will be much easier for
+you to customize how BibFormat formats your bibliographic data.
+
+The migration kit can:
+ a) Effortlessly migrate your behaviours, unless they include complex
+    logic, which usually they don't.
+ b) Help you migrate formats to format templates and format elements.
+ c) Effortlessly migrate your knowledge bases.
+
+Point b) is the most difficult to achieve: previous formats mixed
+business logic and presentation code, and could use PHP
+functions. The new BibFormat separates business logic and
+presentation, and does not support PHP. The transition kit will try to
+move business logic to the format elements, and the presentation to
+the format templates. These files will be created for you and will
+include the original code and, if possible, a proposed Python
+translation. We recommend that you do not use the transition kit to
+translate formats, especially if you have not modified the default
+formats, or have only modified them in a few places. You
+will get cleaner code if you write format elements and format
+templates yourself.
+
+
+5. Specifications of the new BibFormat configuration files.
+
+   BibFormat uses human readable configuration files. However (apart
+   from format elements) these files can be edited and managed through
+   a web interface.
+
+5.1 Output formats specifications
+
+Output formats specify rules that define which format template
+to use to format a record.
+While the syntax of output formats is basic, we recommend that you use
+the web interface to edit them, to be sure that you make no errors.
+
+The syntax of an output format is the following. First you
+define which field code to use as the condition for the rule,
+and suffix it with a colon. Then, on the next lines, define the values of
+the condition, each followed by --- and then the filename of the template
+to use:
+
+  tag 980.a:
+  PICTURE --- PICTURE_HTML_BRIEF.bft
+  PREPRINT --- PREPRINT_HTML_BRIEF.bft
+  PUBLICATION --- PUBLICATION_HTML_BRIEF.bft
+
+This means that if the value of field 980.a is equal to PICTURE, then we
+will use format template PICTURE_HTML_BRIEF.bft. Note that you must
+use the filename of the template, not its name. Also note that spaces
+at the beginning or end are ignored. On the following lines,
+you can either put other conditions on tag 980.a, or add another tag on
+which you want to put conditions.
+
+At the end you can add a default condition:
+
+   default: PREPRINT_HTML_BRIEF.bft
+
+which means that if no condition is matched, a format suitable for
+Preprints will be used to format the current record.
+
+The output format file could then look like this:
+
+  tag 980.a:
+  PICTURE --- PICTURE_HTML_BRIEF.bft
+  PREPRINT --- PREPRINT_HTML_BRIEF.bft
+  PUBLICATION --- PUBLICATION_HTML_BRIEF.bft
+
+  tag 8560.f:
+  .*@cern.ch --- SPECIAL_MEMBER_FORMATTING.bft
+
+  default: PREPRINT_HTML_BRIEF.bft
+
+You can add as many rules as you want. Keep in mind that they are read
+in the order they are defined, and that only the first rule that
+matches will be used.
+Notice the condition on tag 8560.f: it uses a regular expression to
+match any email address that ends with @cern.ch (the regular
+expression must be understandable by Python).
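+
+The rule syntax and the "first match wins" behaviour described above
+can be sketched in Python as follows. This is only an illustration, not
+the actual BibFormat code: the function names, the dictionary used to
+represent a record, and the choice of matching the whole value with
+re.fullmatch are all assumptions.
+
```python
import re

SAMPLE = """
tag 980.a:
PICTURE --- PICTURE_HTML_BRIEF.bft
PREPRINT --- PREPRINT_HTML_BRIEF.bft
PUBLICATION --- PUBLICATION_HTML_BRIEF.bft

tag 8560.f:
.*@cern.ch --- SPECIAL_MEMBER_FORMATTING.bft

default: PREPRINT_HTML_BRIEF.bft
"""


def parse_output_format(text):
    """Parse output format text into (rules, default_template)."""
    rules = []          # (tag, pattern, template) tuples, in file order
    default = None
    current_tag = None
    for raw in text.splitlines():
        line = raw.strip()      # spaces at the beginning or end are ignored
        if not line:
            continue
        if line.startswith("default:"):
            default = line[len("default:"):].strip()
        elif line.startswith("tag ") and line.endswith(":"):
            current_tag = line[len("tag "):-1].strip()
        elif "---" in line and current_tag is not None:
            value, template = line.split("---", 1)
            rules.append((current_tag, value.strip(), template.strip()))
    return rules, default


def choose_template(rules, default, record):
    """Return the template of the first matching rule, else the default.

    'record' is a hypothetical dict mapping a tag to a list of values.
    """
    for tag, pattern, template in rules:
        for value in record.get(tag, []):
            # Assumption: the expression must match the whole value.
            if re.fullmatch(pattern, value.strip()):
                return template
    return default


rules, default = parse_output_format(SAMPLE)
print(choose_template(rules, default, {"8560.f": ["someone@cern.ch"]}))
```
+
+Because the rules are kept in file order, a record that satisfies both
+a 980.a rule and the 8560.f rule gets the template of the 980.a rule.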
+
+Some other considerations on the management of output formats:
+- Output formats must be placed inside directory
+  /etc/bibformat/outputs/ of your CDS Invenio installation.
+- Note that as long as you have not provided a name for an output
+  format THROUGH the web interface, it will not be available as a choice
+  for your users in some parts of CDS Invenio.
+- You should remove output formats THROUGH the web interface.
+- The file extension of an output format is .bfo
+
+
+5.2 Format templates specifications
+
+Format templates are written in HTML-like syntax. You can use the
+standard HTML and CSS markup languages to do the formatting. The best
+thing to do is to create a page in your favourite editor, and once you
+are happy with it, add the dynamic part of the page, that is, print the
+fields of the records. Let's say you have defined this page:
+
+  <h1>Some title</h1>
+  <p><i>Abstract: </i>Some abstract</p>
+
+Then you want the values of the record currently being displayed to be
+used instead of "Some title" and "Some abstract". To do so,
+you must use a format element brick. Either you know the name of the
+brick by heart, or you look for it in the elements documentation (see
+section 5.3). For example you would find there that you can print the
+title of the record by writing the HTML tag <BFE_TITLE /> in your
+format template, with parameter 'default' for a default value.
+
+  <h1><BFE_TITLE default="No Title"/></h1>
+  <p><BFE_ABSTRACT limit="1" prefix="<i>Abstract: </i>"
+  default="No abstract"/></p>
+
+Notice that <BFE_ABSTRACT /> has a parameter "limit" that <BFE_TITLE />
+does not have ("limit" limits the number of sentences of the
+abstract, according to the documentation). Note that while format
+elements might have different parameters, they can always take the
+three following ones: "prefix" and "suffix", whose values are printed
+only if the element is not empty, and "default", which is printed only
+if the element is an empty string. We have used "prefix" for the abstract,
+so that the label "<i>Abstract: </i>" is only printed if the record
+has an abstract.
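+
+The behaviour of these three common parameters can be summarized by the
+following Python sketch. The function name is hypothetical and this is
+not the BibFormat implementation; it only restates the rule given above.
+
```python
def apply_common_parameters(value, prefix="", suffix="", default=""):
    """Combine an element's output with 'prefix', 'suffix' and 'default'.

    'value' is the string produced by the format element itself.
    """
    if value == "":
        # Empty element: only the default value is printed.
        return default
    # Non-empty element: the value is surrounded by prefix and suffix.
    return prefix + value + suffix


print(apply_common_parameters("Some abstract", prefix="<i>Abstract: </i>"))
print(apply_common_parameters("", prefix="<i>Abstract: </i>",
                              default="No abstract"))
```
+
+In the second call the label is dropped entirely, which is why "prefix"
+is a better place for labels than the surrounding template text.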
+
+You should also provide these tags in all of your templates:
+ -<name>a name for this template in the admin web interface</name>
+ -<description>a description to be used in admin web interface for
+  this template</description>
+
+Another feature of the templates is the support for multi-language
+output. You can include <lang> tags, which contain tags labeled with
+the names of the languages supported in CDS Invenio. For example, one
+might write:
+
+  <lang><en>A record:</en><fr>Une notice:</fr></lang>
+  <h1><BFE_TITLE default="No Title"/></h1>
+  <p><BFE_ABSTRACT limit="1" prefix="<i>Abstract: </i>"
+  default="No abstract"/></p>
+
+When doing this you should at least make sure that the default
+language of your server installation is available in each <lang>
+tag. It is the one that is used if the requested language to display
+the record is not available. Note that we could also provide a
+translation in a similar way for the "No Title" default value inside
+<BFE_Title /> tag.
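+
+The language selection and its fallback can be pictured with the short
+Python sketch below. It is an assumption about the behaviour described
+above, not the code used by BibFormat: the function name, the regular
+expressions and the "en" default are chosen for illustration only.
+
```python
import re

# A <lang>...</lang> block, and the per-language tags found inside it.
LANG_BLOCK = re.compile(r"<lang>(.*?)</lang>", re.DOTALL)
LANG_ITEM = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def filter_languages(template, ln, default_ln="en"):
    """Keep only the text for language 'ln' in each <lang> block.

    Falls back to 'default_ln' when 'ln' is not available in a block.
    """
    def pick(match):
        translations = dict(LANG_ITEM.findall(match.group(1)))
        if ln in translations:
            return translations[ln]
        return translations.get(default_ln, "")

    return LANG_BLOCK.sub(pick, template)


template = "<lang><en>A record:</en><fr>Une notice:</fr></lang>"
print(filter_languages(template, "fr"))
print(filter_languages(template, "de"))
```
+
+The second call asks for a language that the block does not provide, so
+the text of the default language is returned instead.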
+
+Some other considerations on the use of elements inside templates:
+ -Format element names are not case sensitive
+ -Format element names always start with <BFE_
+ -Format element parameters can contain '<' characters,
+  and quotes different from the kind that delimit parameters (you can
+  for example have <BFE_Title default='<a href="#">No Title</a>'/> )
+ -Format templates must be placed inside the directory
+  /etc/bibformat/templates/ of your CDS Invenio installation
+ -The file extension of a template is .bft
+
+Trick: you can use <BFE_FIELD tag="245__a" /> to print the value
+of any field 245 $a in your templates.  This practice is however not
+recommended, because you would have to revise all format
+templates if the meaning of the MARC code schema changes.
+
+5.3 Format elements specifications
+
+Format elements are the bricks used in format templates to provide the
+dynamic contents inside format templates.
+
+For the most basic format elements, you do not even need to write
+them: as long as you define `tag names' for MARC tags in the BibIndex
+Admin's Manage logical fields interface (database table tag),
+BibFormat knows which field must be printed when <BFE_tag_name/> is
+used inside a template.
+
+However for more complex processing, you will need to write a format
+element. A format element is written in Python. Therefore its file
+extension is ".py". The name you choose for the file is the one that
+will be used inside format templates to call the element, so choose it
+carefully: not too long, but self-explanatory (you can
+prefix the filename with BFE or not, but the element will always be called
+with prefix <BFE_ inside templates).  Then you just need to drop the
+file in the /lib/python/invenio/bibformat_elements/ directory
+of your CDS Invenio installation. Inside your file you have to define a
+function named "format", which takes at least a "bfo" parameter (bfo
+for BibFormat Object). The function must return a string:
+
+  def format(bfo):
+      out = ""
+
+      return out
+
+You can have as many parameters as you want, as long as you make sure
+that parameter bfo is present. Let's see how to define an element that
+will print a brief title. It will take a parameter 'limit' that will
+limit the number of characters printed. We can provide some
+documentation for the element in the docstring of the
+function.
+
+  def format(bfo, limit="10"):
+      """
+      Prints a short title
+
+      @param limit a limit for the number of printed characters
+      """
+
+      out = ""
+
+      return out
+
+Note that we put a default value of 10 in the 'limit' parameter.  To
+get some value of a field, we must request the 'bfo' object. For
+example we can get the value of field 245.a (field "title"):
+
+  def format(bfo, limit="10"):
+      """
+      Prints a short title
+
+      @param limit a limit for the number of printed characters
+      """
+
+      title = bfo.field('245.a')
+
+      limit = int(limit)
+      if limit > len(title):
+          limit = len(title)
+
+      return title[:limit]
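+
+To try such an element outside of CDS Invenio, one possibility is to
+call it with a small stand-in for the 'bfo' object. The class below is
+purely hypothetical (it is not part of BibFormat) and only provides the
+field() method used by the element:
+
```python
class FakeBibFormatObject:
    """Hypothetical stand-in for the real 'bfo' object, for testing only."""

    def __init__(self, fields):
        self._fields = fields

    def field(self, tag):
        # Return the value of the field, or "" when the field is missing.
        return self._fields.get(tag, "")


def format(bfo, limit="10"):
    """
    Prints a short title

    @param limit a limit for the number of printed characters
    """
    title = bfo.field('245.a')
    # Parameters always arrive as strings, hence the conversion.
    return title[:int(limit)]


bfo = FakeBibFormatObject({'245.a': 'A very long title for a short record'})
print(format(bfo))              # uses the default limit of 10 characters
print(format(bfo, limit="6"))
```
+
+Python slicing never goes past the end of a string, so the element also
+behaves correctly when the title is shorter than the limit or missing.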
+
+As format elements are written in Python, we have decided not to give
+permission to edit elements through the web interface. Firstly for
+security reasons. Secondly because Python requires correct indentation,
+which is difficult to achieve through a web interface.
+
+You can have access to the documentation of your element through a web
+interface. This is very useful when you are writing a format template,
+to see which elements are available, what they do, which parameters they
+take, what are the default values of parameters, etc. The
+documentation is automatically extracted from format elements.
+Here follows a sample of the documentation generated for the element
+<BFE_TITLE />:
+
++--------------------------------------------------------------------------------------------+
+|  TITLE                                                                                     |
+|  -----                                                                                     |
+|  <BFE_TITLE separator="..." prefix="..." suffix="..." default="..." />                     |
+|                                                                                            |
+|      Prints the title of a record.                                                         |
+|                                                                                            |
+|      Parameters:                                                                           |
+|            separator - separator between the different titles.                             |
+|            prefix - A prefix printed only if the record has a value for this element.      |
+|            suffix - A suffix printed only if the record has a value for this element.      |
+|            default - A default value printed if the record has no value for this element.  |
+|                                                                                            |
+|       See also:                                                                            |
+|            Format templates that use this element                                          |
+|            The Python code of this element                                                 |
++--------------------------------------------------------------------------------------------+
+
+The more documentation you provide in the docstring of your elements,
+the easier it will be to write format templates afterwards.
+
+Some remarks concerning format elements:
+ -parameters are always string values
+ -if no value is given for a parameter in the format template, then the
+  value of the parameter is "" (empty string)
+ -the docstring should contain a description, followed by
+  "@param parameter some description for parameter" for each
+  parameter (to give a description for each parameter
+  in the element documentation), and @see an_element.py, another_element.py
+  (to link to other elements in the documentation). Similar to JavaDoc.
+ -the following names cannot be used as parameters:
+  "default", "prefix", "suffix" and "escape". They can however always be
+  used in the format template for any element.
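+
+As an illustration of the docstring convention above, the following
+Python sketch extracts the description, the @param entries and the @see
+references from a docstring. It is not the extractor used by BibFormat;
+the function name and the exact parsing rules are assumptions.
+
```python
import re


def parse_element_docstring(docstring):
    """Split a docstring into (description, params, see_also)."""
    description_lines = []
    params = {}
    see_also = []
    current = None      # name of the parameter being described, if any
    for raw in (docstring or "").splitlines():
        line = raw.strip()
        param = re.match(r"@param\s+(\w+):?\s*(.*)", line)
        if param:
            current = param.group(1)
            params[current] = param.group(2)
        elif line.startswith("@see"):
            current = None
            names = line[len("@see"):].split(",")
            see_also.extend(n.strip() for n in names if n.strip())
        elif line and current is not None:
            # Continuation line of a parameter description.
            params[current] += " " + line
        elif line:
            description_lines.append(line)
    return " ".join(description_lines), params, see_also


doc = """
Prints a short title

@param limit a limit for the number of printed characters
@see title.py, abstract.py
"""
print(parse_element_docstring(doc))
```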
+
+Another important remark concerns the 'escaping' of output of format
+elements. In most cases, format elements output is to be used for
+HTML/XML. Therefore special characters such as < or & have to be
+'escaped', that is, replaced by '&lt;' and '&amp;'. This is why all outputs
+produced by format elements are automatically escaped by BibFormat,
+unless specified otherwise.  This means that you do not have to care
+about meta-data that would break your HTML display or XML export,
+such as a physics formula like 'a < b'. Please also note that values
+given in the 'prefix', 'suffix' and 'default' parameters are not escaped,
+so that you can safely use HTML tags in them.
+
+There are always cases where the default 'escaping' behaviour of
+BibFormat is not desired, for example when you explicitly output HTML
+text, like links: you do not want to see them escaped.  The first way
+to avoid this is to modify the call to your format element in the
+format template, by setting the default 'escape' parameter to 0:
+
+  <BFE_TITLE escape="0" />
+
+This is however inconvenient, as you may need to modify a
+lot of templates. The other way is to add another function to
+your format element, named 'escape_values':
+
+ def escape_values(bfo):
+     """
+     Called by BibFormat in order to check if output of this element
+     should be escaped.
+     """
+     return 0
+
+In that way all calls to your format element will produce unescaped
+output.  You will have to take care of escaping values "manually" in
+your format element code, in order to avoid invalid output or XSS
+vulnerabilities. There are methods to ease the escaping in your code
+described in section 1.
+Please also note that if you use this method, your element can still
+be escaped if a call to your element from a format template
+explicitly asks to escape the value using parameter 'escape'.
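+
+A complete element using this second method could look like the sketch
+below. It escapes the metadata itself with html.escape from the Python 3
+standard library, which is an assumption made for this example and not
+one of the BibFormat helpers mentioned above. The FakeBibFormatObject
+class and the "#" link target are hypothetical and only serve to run
+the example outside of CDS Invenio.
+
```python
from html import escape


class FakeBibFormatObject:
    """Hypothetical stand-in for the real 'bfo' object, for testing only."""

    def __init__(self, fields):
        self._fields = fields

    def field(self, tag):
        return self._fields.get(tag, "")


def format(bfo):
    """
    Prints the title as a link (the link target is a placeholder).
    """
    title = bfo.field('245.a')
    if not title:
        return ""
    # Escape the metadata ourselves, but keep our own markup intact.
    return '<a href="#">' + escape(title) + '</a>'


def escape_values(bfo):
    """
    Called by BibFormat in order to check if output of this element
    should be escaped.
    """
    return 0


print(format(FakeBibFormatObject({'245.a': 'a < b'})))
```
+
+Only the value coming from the record is escaped; the surrounding link
+markup is written by the element and is therefore left untouched.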
+
+
+5.4 Knowledge bases specifications
+
+Knowledge bases cannot be managed through configuration files.
+You can very easily add new bases and mappings using the given web GUI.
+
+ -- End of file --
+
+
diff --git a/modules/bibformat/doc/hacking/Makefile.am b/modules/bibformat/doc/hacking/bibformat-internals.webdoc similarity index 58% copy from modules/bibformat/doc/hacking/Makefile.am copy to modules/bibformat/doc/hacking/bibformat-internals.webdoc index 218c59465..d0e59069b 100644 --- a/modules/bibformat/doc/hacking/Makefile.am +++ b/modules/bibformat/doc/hacking/bibformat-internals.webdoc @@ -1,31 +1,36 @@ ## $Id$ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -## General Public License for more details. +## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. -docdir = $(localstatedir)/www/hacking/bibformat + + + + -doc_DATA = index.html api.html -FILESWML = $(wildcard $(srcdir)/*.wml) -EXTRA_DIST = $(FILESWML:$(srcdir)/%=%) +This page summarizes all the information suitable to dig inside +BibFormat internals. -CLEANFILES = $(doc_DATA) *~ *.tmp +
+
-%.html: %.html.wml $(top_srcdir)/config/config.wml $(top_builddir)/config/configbis.wml - $(WML) -o\(ALL-LANG_*\)+LANG_EN:$@ $< - $(PYTHON) $(top_srcdir)/po/i18n_update_wml_target.py en $@ \ No newline at end of file +
BibFormat API
Explains how to call +the formatting engine from your Python programs, should the need arise. + +
+