diff --git a/modules/bibformat/doc/admin/guide.html.wml b/modules/bibformat/doc/admin/guide.html.wml index c626cd637..3883e5348 100644 --- a/modules/bibformat/doc/admin/guide.html.wml +++ b/modules/bibformat/doc/admin/guide.html.wml @@ -1,2539 +1,2558 @@ ## $Id$ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. #include "cdspage.wml" \ title="BibFormat Admin Guide" \ navtrail_previous_links="/admin/>_(Admin Area)_ > /admin/bibformat/>BibFormat Admin" \ navbar_name="admin" \ navbar_select="bibformat-admin-guide"

Version <: print generate_pretty_revision_date_string('$Id$'); :>

Please note that the old PHP BibFormat administration guide can be found further below.

A Five Minutes Introduction to BibFormat
Output Formats
Format Templates
Format Elements
Knowledge Bases
MARC Notation in Formats
Migrating from Previous BibFormat
FAQ

A Five Minutes Introduction to BibFormat

How BibFormat Works

BibFormat is in charge of formatting the bibliographic records that are displayed to your users. It is called by the search engine when it has to format a record.

As you might need different kinds of formatting depending on the type of record, but potentially have a huge number of records in your database, you cannot specify for each of them how they should look. Instead BibFormat uses a rule-based decision process to decide how to format a record.
The best way to understand how BibFormat works is to have a look at a typical workflow:

Step 1:
	When CDS Invenio has to display a record, it asks BibFormat to format the record with the given output format and language. For example here the requested output format is hd, which is a short code for "HTML Detailed". This means that somehow a user arrived on the page of the record and asked for a detailed view of the record.
Step 2:
	Beside is a screenshot of the "hd" or "HTML Detailed" output format. You can see that the output format does not specify how to format the record, but contains a set of rules which define which template must be used. The rules are evaluated from top to bottom. Each rule defines a condition on a field of the record, and a format template to use to format the record if the condition matches. Let's say that the field 980.a of the record is equal to "Picture". Then the first rule matches, and the format template Picture HTML Detailed is used by BibFormat for formatting. You can add, remove or edit output formats here.
Step 3:
`<h1 align="center"><BFE_MAIN_TITLE/></h1> <p align="center"> <BFE_AUTHORS separator="; " link="yes"/><br/> <BFE_DATE format="%d %B %Y"> .- <BFE_NB_PAGES suffix="p"> </p>`	We see an extract of the Picture HTML Detailed format on the right, as it is shown in the template editor. As you can see it is mainly written using HTML. There are however some tags that are not part of standard HTML. The tags that start with <BFE_ are placeholders for the record values. For example <BFE_MAIN_TITLE/> tells BibFormat to write the title of the record. We call these tags "elements". Some elements have parameters. This is the case for the <BFE_AUTHORS> element, which can take separator and link as parameters. The value of separator is used to separate the authors' names, and the link parameter tells whether links to the authors' websites have to be created. All elements are described in the elements documentation. You can add, remove or edit format templates here.
Step 4:
`def format(bfo, separator='; ', link='no'): """ Prints the list of authors for the record @param separator a character to separate the authors @param link if 'yes' print HTML links to authors """ authors = bfo.fields("100__a") if link == 'yes': authors = map(lambda x: '<a href="'+weburl+'/search?f=author&p='\ + quote(x) +'">'+x+'</a>', authors) return separator.join(authors)`	A format element is written in Python. It acts as a bridge between the record in the database and the format template. Typically you will not have to write or read format elements, just call them from the templates. Each element outputs some text that is written in the template where it is called. Developers can add new elements by creating a new file, naming it after the element, and writing a Python `format` function that takes as parameters the parameters of the element plus a special one, `bfo`. Regular Python code can be used, including imports of other modules.

In summary BibFormat is called by specifying a record and an output format, which relies on different templates to do the formatting, and which themselves rely on different format elements. Only developers need to modify the format elements layer.

[Diagram: an output format relies on format templates, which in turn rely on format elements.]

You should now understand the philosophy behind BibFormat.

Short Tutorial

Let's try to create our own format. This format will just print the title of a record.

First go to the main BibFormat admin page. Then click on the "Manage Output Formats" link. You will see the list of all output formats:

This is where you can delete, create or check output formats. The menu at the top of the page lets you go to other administration pages.
Click on the "Add New Output Format" button at the bottom of the page. You can then fill in some attributes for the output format. Choose "title" as code, "Only Title" as name and "Prints only title" as description:

Screenshot of the Update Output Format Attributes page

Leave other fields blank, and click on the button "Update Output format Attributes".
You are then redirected to the rules editor. Notice the menu at the top, which lets you close the editor, change the attributes again and check the output format. However do not click on these links before saving your modification of rules!

As our format does not need to behave differently depending on the record, we do not need to add new rules to the format. You just need to select a format template in the "By default use" list. However we first have to create our special format template that only prints titles. So close the editor using the menu at the top of the page, and in the menu that appears in its place, click on "Manage Format Templates". As with output formats, you see the list of format templates.

Click on the "Add New Format Template" button at the bottom of the page. As for the output format, fill in the attributes of the template with name "Title" and any relevant description.

Click on the "Update Output Format Attributes" button. You are redirected to the template editor. The editor is divided into three parts. The upper left part contains the code of the template. The bottom part is a preview of the template. The part on the right side is a short reminder of the format elements you can use in your template. You can hide this documentation by clicking on "Hide Documentation".

The above screenshot shows the template code already filled in. It calls the BFE_TITLE element. If you do not know the name of the element you want to call, you can search for it using the embedded documentation search. You can try to add other elements into your template, or write some HTML formatting.

When you are satisfied with your template, click on the save button, close the editor and go back to the "Only Title" output format rules editor. There select the template you have just created in the "By default use" list, save the output format, and you are done.

This tutorial does not cover all aspects of the management of formats (for example "Knowledge bases" or internationalization). It also does not show all the power of output formats, as the one we have created simply calls a template. However you have seen enough to configure BibFormat through the web interface. Read the sections below to learn more about it.

Administer Through the Web Interface or Through the Configuration files

BibFormat can be administered in two ways. The first way is to use the provided web interface. It should be the most convenient way for most users. The web interface is simple to use and provides great tools to manage your formats. Its only limitation concerns the format elements, which cannot be modified using it (but the web interface provides a dynamically generated documentation of your elements).
The other way to administer BibFormat is to directly modify the configuration files using your preferred text editor. This approach gives much power to advanced users, but requires access to the server's files. It also requires that users double-check their modifications, or use the web interface to ensure the validity and correctness of their formats.

In this manual we will show both approaches. For each explanation we show first how to do it through the web interface, then how to do it by manipulating the configuration files. Non-power users can stop reading as soon as they encounter the text "For developers and adventurers only".

We generally recommend using the web interface, except for writing format elements.

Output Formats

As you potentially have a huge amount of bibliographic records, you cannot specify manually for each of them how it should be formatted. This is why you can define rules that will allow BibFormat to understand which kind of formatting to apply to a given record. You define this set of rules in what is called an "output format".

You can have different output formats, each with its own characteristics. For example, when multiple bibliographic records are displayed at the same time (as happens in search results), you certainly want only short versions to be shown to the user, while a detailed record is preferable when a single record is displayed, whatever the type of the record.
You might also want to let your users decide which kind of output they want. For example you might need to display HTML for regular web browsing, but would also give a BibTeX version of the bibliographic reference for direct inclusion in a LaTeX document.

To summarize, an output format groups similar kinds of formats, specifying which kind of formatting has to be done, but not how it has to be done.

Add an Output Format

To add a new output format, go to the Manage Output Formats page and click on the "Add New Output Format" button at the bottom of the page. The format is then created. You can then specify the attributes of the output format. See Edit the Attributes of an Output Format to learn more about it.

For developers and adventurers only:

Alternatively you can directly add a new output format file into the /etc/bibformat/outputs/ directory of your CDS Invenio installation, if you have access to the server's files. Use the format extension .bfo for your file.

You should also check that user www-data has read/write access to the file, if you want to be able to modify the rules through the web interface.

Remove an Output Format

To remove an output format, go to the Manage Output Formats page and click on the "Delete" button facing the output format you want to delete. If you cannot click on the button (the button is not enabled), this means that you do not have sufficient privileges to do so (the format is protected; contact the administrator of the system).

For developers and adventurers only:

You can directly remove an output format from the /etc/bibformat/outputs/ directory of your CDS Invenio installation. However you must make sure that it is removed from the tables format and formatname in the database, so that other modules know that it is no longer available.

Edit the Rules of an Output Format

When you create a new output format, you can at first only specify the default template, that is the one which is used when all rules fail. In the case of a basic output format, this is enough. You can however add other rules, by clicking on the "Add New Rule" button.
Once you have added a rule, you can fill it with a condition, and a template that should be used if the condition is true. For example the rule

Rule: Use template [Picture HTML Detailed] if field [980.a] is equal to [PICTURE]

will use template named "Picture HTML Detailed" if the field 980.a of the record to format is equal to "Picture". Note that text "PICTURE" will match any letter case like "picture" or "Picture". Leading and trailing spaces are ignored too (" Picture " will match "PICTURE").
Tip: you can use a regular expression as text. For example "PICT.*" will match "pictures" and "PICTURE".

The above configuration will use format template "Default HTML Detailed" if all above rules fail (in this case, if field 980.a is different from "PICTURE"). If you have more rules, you decide in which order the conditions are evaluated. You can reorder rules by clicking on the small arrows on the left of the rules.
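The way a rule condition is compared with a field value (letter case ignored, surrounding spaces ignored, regular expressions allowed) can be sketched in a few lines of standalone Python 3. This is only an illustration, not BibFormat's own code: the function name `condition_matches` is hypothetical, and requiring the pattern to match the whole field value is an assumption.

```python
import re


def condition_matches(condition, field_value):
    """Return True if field_value satisfies the rule condition.

    Assumption: the condition is a regular expression that must match
    the whole value, once surrounding spaces are removed.
    """
    pattern = condition.strip()
    value = field_value.strip()
    return re.fullmatch(pattern, value, re.IGNORECASE) is not None
```

With this sketch, `condition_matches("PICTURE", " Picture ")` and `condition_matches("PICT.*", "pictures")` are both true, while `condition_matches("PICTURE", "Preprint")` is false.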

Note that when you are migrating your output formats from the old PHP BibFormat, you might not have translated all the formats to which your output formats refer. In that case you should use the "defined in old BibFormat" option in the format templates menu, to make BibFormat understand that a match for this rule must trigger a call to the Behaviour of the old BibFormat. See the section on Run old and new formats side by side for more details on this.

For developers and adventurers only:

To write an output format, use the following syntax:
First you define which field code you use as the condition for the rule. You suffix it with a colon. Then on the next lines, define the values of the condition, each followed by --- and then the filename of the template to use:

   tag 980.a:
   PICTURE --- PICTURE_HTML_BRIEF.bft
   PREPRINT --- PREPRINT_HTML_BRIEF.bft
   PUBLICATION --- PUBLICATION_HTML_BRIEF.bft

This means that if value of field 980.a is equal to PICTURE, then we will use format template PICTURE_HTML_BRIEF.bft. Note that you must use the filename of the template, not the name. Also note that spaces at the end or beginning are not considered. On the following lines, you can either put other conditions on tag 980.a, or add another tag on which you want to put conditions.

At the end you can add a default condition:

    default: PREPRINT_HTML_BRIEF.bft

which means that if no condition is matched, a format suitable for Preprints will be used to format the current record.

The output format file could then look like this:

   tag 980.a:
   PICTURE --- PICTURE_HTML_BRIEF.bft
   PREPRINT --- PREPRINT_HTML_BRIEF.bft
   PUBLICATION --- PUBLICATION_HTML_BRIEF.bft
 
   tag 8560.f:
   .*@cern.ch --- SPECIAL_MEMBER_FORMATTING.bft
 
   default: PREPRINT_HTML_BRIEF.bft

You can add as many rules as you want. Keep in mind that they are read in the order they are defined, and that only the first rule that matches is used. Notice the condition on tag 8560.f: it uses a regular expression to match any email address that ends with @cern.ch (the regular expression must be understandable by Python).
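To make the file syntax concrete, here is a minimal sketch of how such a file could be read into an ordered list of rules plus a default template. It is written as standalone Python 3 for illustration only; `parse_output_format` is a hypothetical helper, not the parser BibFormat actually uses, and it assumes the file contains only the three kinds of lines described above.

```python
def parse_output_format(text):
    """Parse .bfo-style text into (rules, default_template).

    rules is a list of (tag, condition, template_filename) tuples,
    kept in the order in which they appear in the file.
    """
    rules = []
    default_template = None
    current_tag = None
    for raw_line in text.splitlines():
        line = raw_line.strip()  # leading and trailing spaces are ignored
        if not line:
            continue
        if line.lower().startswith("default:"):
            default_template = line.split(":", 1)[1].strip()
        elif line.lower().startswith("tag ") and line.endswith(":"):
            current_tag = line[4:-1].strip()
        elif "---" in line and current_tag is not None:
            condition, template = line.split("---", 1)
            rules.append((current_tag, condition.strip(), template.strip()))
    return rules, default_template


EXAMPLE = """
tag 980.a:
PICTURE --- PICTURE_HTML_BRIEF.bft
PREPRINT --- PREPRINT_HTML_BRIEF.bft

tag 8560.f:
.*@cern.ch --- SPECIAL_MEMBER_FORMATTING.bft

default: PREPRINT_HTML_BRIEF.bft
"""

rules, default_template = parse_output_format(EXAMPLE)
```

Because the rules are kept in a list, evaluating them from first to last and stopping at the first match reproduces the "first rule that matches" behaviour.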

Edit the Attributes of an Output Format

An output format has the following attributes:

code: a short identifier that is used to identify the output format. It must be unique and contain a maximum of 6 letters. Note that the code is not case sensitive ("HB" is equal to "hb").
content type: this is the content type of the format, specified as a MIME type. For example if you were to produce an Excel output, you could use application/ms-excel as content type. If a content type is specified, CDS Invenio will not print the usual header and footer for the page, but will trigger a download in the client's browser when viewing the page (unless the browser handles this content type).
name: a generic name to display in the interface for this output format.
(*) name: internationalized names for the output format, used for displaying localized name in the search interface.
description: an optional description for the output format.

Please read this information regarding output format codes: There are some reserved codes that you should not use, or at least be aware of when choosing a code for your output format. The table below summarizes these special words:

Code	Purpose
HB	Used for displaying list of results of a search.
HD	Used when no format is specified when viewing a record.
HM	Used for Marc output. The format is special in the sense that it filters fields to display according to the 'ot' GET parameter of the HTTP request.
Starting with letter 't'	Used for displaying the value of the field specified by the 'ot' GET parameter of the HTTP request.
Starting with 3 digits	Used for displaying the value of the field specified by the digits.
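As a quick way to remember these constraints, the following standalone Python 3 sketch classifies a candidate code according to the table above and the 6-character limit. The function `describe_code` and its return strings are hypothetical, written only for illustration; BibFormat itself may check codes differently.

```python
import re

RESERVED_CODES = {"HB", "HD", "HM"}


def describe_code(code):
    """Return a short note on how an output format code would be treated."""
    normalized = code.upper()  # codes are not case sensitive
    if len(normalized) > 6:
        return "invalid: more than 6 characters"
    if normalized in RESERVED_CODES:
        return "reserved: " + normalized
    if normalized.startswith("T"):
        return "reserved: field given by the 'ot' parameter"
    if re.match(r"\d{3}", normalized):
        return "reserved: field given by the leading digits"
    return "free"
```

For example `describe_code("hb")` returns `"reserved: HB"`, while `describe_code("xm")` returns `"free"`.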

For developers and adventurers only:

Except for the code, output format attributes cannot be changed in the output format file. These attributes are saved in the database. As for the code, it is the name of the output format file, without its .bfo extension. If you change this name, do not forget to propagate the modification in the database.

Check the Dependencies of an Output Format

To check the dependencies of an output format on format templates, format elements and tags, go to the Manage Output Formats page, click on the output format you want to check, and then in the menu click on "Check Dependencies".

The next page shows you:

the format templates which might be called by the rules of the output format
the elements used in each of these templates
the Marc tags involved in these elements

Note that some Marc tags might be omitted.

Check the Validity of an Output Format

To check the validity of an output format, simply go to the Manage Output Formats page, and look at the column 'status' for the output format you want to check. If message "Ok" is there, then no problem was found with the output format. If message 'Not Ok' is in the column, click on it to see the problems that have been found for the output format.

Format Templates

A format template defines how a record should be formatted. For example it specifies which fields of the record are to be displayed, in which order and with which visual attributes. Basically the format template is written in HTML, so that it is easy for anyone to edit it.

Add a Format Template

To add a new format template, go to the Manage Format Templates page and click on the "Add New Format Template" button at the bottom of the page. The template is then created. You can then specify the attributes of the format template. See Edit the Attributes of a Format Template to learn more about it.

For developers and adventurers only:

Alternatively you can directly add a new format template file into the /etc/bibformat/format_templates/ directory of your CDS Invenio installation, if you have access to the server's files. Use the format extension .bft for your file.

You should also check that user www-data has read/write access to the file, if you want to be able to modify the code and the attributes of the template through the web interface.

Remove a Format Template

To remove a format template, go to the Manage Format Templates page and click on the "Delete" button facing the format template you want to delete. If you cannot click on the button (the button is not enabled), this means that you do not have sufficient privileges to do so (the format is protected; contact the administrator of the system).

For developers and adventurers only:

You can directly remove the format template from the /etc/bibformat/format_templates/ directory of your CDS Invenio installation.

Edit the Code of a Format Template

You can change the formatting of records by modifying the code of a template.

To edit the code of a format template go to the Manage Format Templates page. Click on the format template you want to edit to load the template editor.

The format template editor contains three panels. The upper left panel is the code editor. This is where you write the code that specifies the formatting of a template. The right-most panel is a short documentation on the "bricks" you can use in your format template code. The panel at the bottom of the page allows you to preview the template.

The following sections explain how to write the code that specifies the formatting.

Basic Editing

The first thing you have to know before editing the code is that everything you write in the code editor is printed as such by BibFormat. Well almost everything (as you will discover later).

For example if you write "My Text", then for every record the output will be "My Text". Now let's say you write "<b>My Text</b>": the output will still be "<b>My Text</b>", but as we display in a web browser, it will look like "My Text" (The browser interprets the text inside tags <b></b> as "bold". Also note that the look may depend on the CSS style of your page).

Basically it means that you can write HTML to do the formatting. If you are not experienced with HTML you can use an HTML editor to create your layout, and then copy-paste the HTML code inside the template.

Do not forget to save your work by clicking on the save button before you leave the editor!

For developers and adventurers only:

You can edit the code of a template using exactly the same syntax as in the web interface. The code of the template is in the template file located in the /etc/bibformat/format_templates/ directory of your CDS Invenio installation. You just have to take care of the attributes of the template, which are saved in the same file as the code. See Edit the Attributes of a Format Template to learn more about it.

Use Format Elements

To add a dynamic behaviour to your format templates, that is display for example a different title for each record or a different background color depending on the type of record, you can use the format elements.

Format elements are the smart bricks you can copy-paste in your code to get the attributes of template that change depending on the record. A format element looks like a regular HTML tag.

For example, to print the title of a record, you can write <BFE_TITLE /> in your template code where you want to display the title.

Format elements can take values as parameters. This lets you customize the behaviour of an element. For example you can write <BFE_TITLE prefix="Title: " />, and BibFormat will take care of printing the title for you, with prefix "Title: ". The difference between Title: <BFE_TITLE /> and <BFE_TITLE prefix="Title: " /> is that the first option will always write "Title: " while the second one will only print "Title: " if a title exists for the record in the database. Of course it is likely that every record has a title, but this can be useful for less common fields.

Some parameters are available for all elements. This is the case for the following ones:

prefix: a prefix printed only if the record has a value for the element.
suffix: a suffix printed only if the record has a value for the element.
default: a default value printed if the record has no value for the element. In that case prefix and suffix are not printed.
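The interplay of these three common parameters can be summarized by the following standalone Python 3 sketch. The helper `apply_common_parameters` is hypothetical and only mirrors the behaviour described in the list above; it is not taken from the BibFormat code.

```python
def apply_common_parameters(value, prefix="", suffix="", default=""):
    """Wrap an element value as described for prefix, suffix and default."""
    if value:
        # The record has a value: print it with its prefix and suffix.
        return prefix + value + suffix
    # No value: print the default only, without prefix or suffix.
    return default
```

So a record with a title gives `"Title: " + title`, while a record without one gives either an empty string or the default text, never a dangling "Title: ".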

Some parameters are specific to elements. To get information on all available format elements you can read the Format Elements Documentation, which is generated dynamically for all existing elements. It will show you what each element does and what parameters it can take.

While format elements look like HTML tags, they differ from traditional ones in the following ways:

A format element is a single tag: you cannot have <BFE_TITLE >some text<BFE_TITLE /> but only <BFE_TITLE />.
The values of the parameters accept any characters, including < and >. The only limitation is that you cannot use the type of quotes that delimit that value: you can have for example <BFE_TITLE someParam="a lot of single quotes ' ' ' ' "/> or <BFE_TITLE someParam='a lot of double quotes " " " '/>, but not <BFE_TITLE someParam="a lot of same quotes as delimiter " " " "/>.
Format elements names always start with BFE_.
Format elements can span multiple lines.
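The syntax rules above can be illustrated with a small standalone Python 3 sketch that finds element tags in a template and extracts their parameters. The regular expressions and the `parse_elements` helper are assumptions made for this example; the real BibFormat parser may differ, in particular in how strictly it treats the closing slash.

```python
import re

DOUBLE_QUOTED = r'"([^"]*)"'   # value delimited by double quotes
SINGLE_QUOTED = r"'([^']*)'"   # value delimited by single quotes
PARAM_RE = re.compile(
    r"(\w+)\s*=\s*(?:" + DOUBLE_QUOTED + "|" + SINGLE_QUOTED + ")"
)
# Element name after BFE_, then any number of quoted parameters.
# \s also matches newlines, so an element may span several lines.
ELEMENT_RE = re.compile(
    r"<BFE_(\w+)((?:\s+" + PARAM_RE.pattern + r")*)\s*/?>",
    re.IGNORECASE,
)


def parse_elements(template):
    """Return a list of (ELEMENT_NAME, {parameter: value}) pairs."""
    found = []
    for match in ELEMENT_RE.finditer(template):
        params = {}
        for key, double_value, single_value in PARAM_RE.findall(match.group(2)):
            params[key] = double_value if double_value else single_value
        found.append((match.group(1).upper(), params))
    return found
```

Because each value is read up to its own closing quote, characters such as `<`, `>` or the other kind of quote can appear inside a parameter value without confusing the sketch.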

Tip: you can use the special element <BFE_FIELD tag="" /> to print the value of any field of a record in your templates. This practice is however not recommended, because you would have to revise all format templates if you changed the meaning of the MARC code schema.

Preview a Format Template

To preview a format template go to the Manage Format Templates page and click on the format template you want to preview to open the template editor. The editor contains a preview panel at the bottom of the page.

Simply click on the "Reload Preview" button to preview the template (you do not need to save the code before previewing).
Use the "Language" menu to preview the template in a given language.

You can fill in the "Search Pattern" field to preview a specific record. The search pattern uses exactly the same syntax as the one used in the web interface. The only difference with the regular search engine is that only the first matching record is shown.

For developers and adventurers only:

If you do not want to use the web interface to edit the templates but still would like to get previews, you can open the preview frame of any format in a new window/tab. In this mode you get a preview of the template (if it is placed in the /etc/bibformat/format_templates/ directory of your CDS Invenio installation). The parameters of the preview are specified in the url:

bft: the filename of the format template to preview
ln: the language to use for the preview
pattern_for_preview: the search pattern to use for the preview
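As an illustration, the query string carrying these three parameters could be built as follows in standalone Python 3. The helper name `preview_query` is hypothetical, and the address of the preview page itself is not shown because it depends on your installation.

```python
from urllib.parse import urlencode


def preview_query(bft, ln="en", pattern_for_preview=""):
    """Build the query string for a template preview request."""
    params = {
        "bft": bft,                                  # template filename
        "ln": ln,                                    # preview language
        "pattern_for_preview": pattern_for_preview,  # search pattern
    }
    return urlencode(params)
```

The returned string is meant to be appended, after a `?`, to the URL of the preview frame.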

Internationalization (i18n)

You can add translations to your format templates. To do so enclose the text you want to localize with tags corresponding to the two letters of the language. For example if we want to localize "title", write <en>Title</en>. Repeat this for each language in which you want to make "title" available: <en>Title</en><fr>Titre</fr><de>Titel</de>. Finally enclose everything with <lang> </lang> tags: <lang><en>Title</en><fr>Titre</fr><de>Titel</de></lang>

For each <lang> group only the text in the user's language is displayed. If user's language is not available in the <lang> group, your default CDS Invenio language is used.
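The selection of a translation inside each <lang> group can be sketched as follows in standalone Python 3. The `localize` function and its regular expressions are assumptions made for this example (two-letter language tags, no nesting); they are not BibFormat's implementation.

```python
import re

LANG_GROUP_RE = re.compile(r"<lang>(.*?)</lang>", re.DOTALL)
LANG_ENTRY_RE = re.compile(r"<(\w{2})>(.*?)</\1>", re.DOTALL)


def localize(template, user_lang, default_lang="en"):
    """Keep, in every <lang> group, only the text for the user's language."""

    def pick(group_match):
        translations = dict(LANG_ENTRY_RE.findall(group_match.group(1)))
        if user_lang in translations:
            return translations[user_lang]
        # Fall back to the default language of the installation.
        return translations.get(default_lang, "")

    return LANG_GROUP_RE.sub(pick, template)
```

Applied to `<lang><en>Title</en><fr>Titre</fr><de>Titel</de></lang>`, the sketch returns "Titre" for a French user and falls back to "Title" for a language that is not in the group.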

Edit the Attributes of a Format Template

To edit the attributes of a format template go to the Manage Format Templates page, click on the format template you want to edit, and then in the menu click on "Modify Template Attributes".

A format template contains two attributes:

Name: the name of the template
Description: a short description of the template

Note that changing these parameters has no impact on the formatting. Their purpose is only to document the template.

If the name you have chosen already exists for another template, your name will be suffixed with an integer so that the name is unique.

You should also be aware that if you change the name of a format template, all output formats that were linking to this template will be changed to match the new name.

For developers and adventurers only:

You can change the attributes of a template by editing its file in the /etc/bibformat/format_templates/ directory of your CDS Invenio installation. The attributes must be enclosed with tags <name> </name> and <description> </description> and should ideally be placed at the beginning of the file.
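To check by hand what attributes a template file declares, a few lines of standalone Python 3 are enough. The helper `read_template_attributes` below is hypothetical and assumes that each tag appears at most once in the file.

```python
import re


def read_template_attributes(template_text):
    """Extract the <name> and <description> attributes of a template."""

    def first(tag):
        match = re.search(
            r"<%s>(.*?)</%s>" % (tag, tag),
            template_text,
            re.DOTALL | re.IGNORECASE,
        )
        return match.group(1).strip() if match else ""

    return {"name": first("name"), "description": first("description")}
```

A missing attribute simply comes back as an empty string.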

Also note that the admin web interface tries to keep the name of the template in sync with the filename of the template. If the name is changed through the web interface, the filename of the template is changed, and all output formats that use this template are updated. You have to update output formats manually if you change the filename of the template without the web interface.

Check the Dependencies of a Format Template

To check the dependencies of a format template go to the Manage Format Templates page, click on the format template you want to check, and then in the menu click on "Check Dependencies".

The next page shows you:

The output formats that use this format template
The elements used in the template (and the Marc tags used in these elements, in parentheses)
A summary of all the Marc tags involved in the elements of the template

Note that some Marc tags might be omitted.

Check the Validity of a Format Template

To check the validity of a format template, simply go to the Manage Format Templates page, and look at the column 'status' for the format template you want to check. If message "Ok" is there, then no problem was found with the template. If message 'Not Ok' is in the column, click on it to see the problems that have been found for the template.

Format Elements

Format elements are the bricks used in format templates to provide dynamic content to the formatting process. Their purpose is to allow non computer literate persons to easily integrate data from the records in the database into their templates.

Format elements are typically written in Python (there is an exception to that point which is discussed in Add a Format Element). This brings great flexibility and power to the formatting process. This however restricts the creation of format elements to developers.

Add a Format Element

The most typical way of adding a format element is to drop a .py file in the lib/python/invenio/bibformat_elements directory of your CDS Invenio installation. See Edit the Code of a Format Element to learn how to implement an element.

The simplest way to add a format element is to add an entry in the "Logical Fields" management interface of the BibIndex module. When BibFormat cannot find the Python format element corresponding to a given name, it looks into this table for the name and prints the value of the field declared for this name. This lightweight approach is straightforward but does not allow complex handling of the data (it is limited to printing the value of the field, or the values of the fields if multiple fields are declared under the same label).

Remove a Format Element

To remove a Python format element simply remove the corresponding file from the lib/python/invenio/bibformat_elements directory of your CDS Invenio installation.

To remove a format element declared in the "Logical Fields" management interface of the BibIndex module simply remove the entry from the table.

Edit the Code of a Format Element

This section only applies to Python format elements. Basic format elements declared in "Logical Fields" have non-configurable behaviour.

A format element file is like any regular Python program. It has to implement a format function, which returns a string and takes at least bfo as first parameter (but can take as many others as needed).

Here is for example the code of the "bfe_title.py" element:

 def format(bfo, separator=" "):
     """
     Prints the title of a record.
 
     @param separator separator between the different titles
     """
     titles = []
    
     title = bfo.field('245.a')
     title_remainder = bfo.field('245.b')
 
     titles.append( title + title_remainder )
 
     title = bfo.field('0248_a')
     if len(title) > 0:
         titles.append( title )
 
     title = bfo.field('246.a')
     if len(title) > 0:
         titles.append( title )
 
     title = bfo.field('246_1.a')
     if len(title) > 0:
         titles.append( title )
 
     return separator.join(titles)

In format templates this element can be called like a function, using HTML syntax:
<BFE_TITLE separator="; "/>
Notice that the call uses (almost) the filename of your element. To find out which element to use, BibFormat tries different filenames until the element is found: it tries to

ignore the letter case
replace underscore with spaces
remove the BFE_ from the name

This means that even if the filename of your element is "my element.py", BibFormat can resolve the call <BFE_MY_ELEMENT /> in a format template. This also means that you must take care not to have two format element filenames that differ only in terms of the above rules.
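The resolution rules above can be sketched as a normalization step applied to both the tag name and the candidate filenames. This is only an illustration, not BibFormat's actual lookup code: the helper names normalize_element_name and resolve_element, and the exact order of the operations, are assumptions.

```python
def normalize_element_name(name):
    """Reduce a tag name or filename to a comparable key (hypothetical helper)."""
    key = name.strip().lower()          # ignore the letter case
    if key.endswith(".py"):
        key = key[:-3]                  # filenames carry a .py extension
    key = key.replace("_", " ")         # treat underscores and spaces alike
    if key.startswith("bfe "):
        key = key[4:]                   # drop the BFE_ prefix
    return key


def resolve_element(tag_name, filenames):
    """Return the first filename matching tag_name, or None."""
    wanted = normalize_element_name(tag_name)
    for filename in filenames:
        if normalize_element_name(filename) == wanted:
            return filename
    return None


files = ["bfe_title.py", "my element.py"]
print(resolve_element("BFE_MY_ELEMENT", files))   # my element.py
print(resolve_element("BFE_TITLE", files))        # bfe_title.py
```

Under this sketch, two filenames such as "my element.py" and "bfe_my_element.py" reduce to the same key, which is exactly the kind of collision the paragraph above warns against.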

The string returned by the format function corresponds to the value that is printed instead of the format element name in the format template.

The bfo object taken as parameter by format stands for BibFormatObject: it is an object that represents the context in which the formatting takes place. For example, it lets you retrieve the value of a given field for the record that is being formatted, or the language of the user. The details of the BibFormatObject are given further below.

The format function of an element can take other parameters, as well as default values for these parameters. These parameters can be set from the format template when calling the element, which makes it possible to parametrize the behaviour of the format element.

It is very important to document your element: this makes it possible to generate documentation for the elements, accessible to people writing format templates. It is the only way for them to know what your element does. The key points are:

Provide a docstring for the format function
For each of the parameters of the format function (except for bfo), provide a description using a Java-like doc syntax in the doc string:
@param my_param description for my param (one line per parameter)
You can use one @see followed by a comma separated list of elements filenames to provide a reference to other elements of interests related to this one:
@see my_element1.py, my element2.py
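As an illustration of these conventions, here is a small documented element. The element itself (imagined as bfe_abstract_excerpt.py), its limit and suffix parameters, and the referenced bfe_abstract.py are hypothetical; only the format(bfo, ...) signature, the bfo.field accessor and the @param/@see syntax come from this guide. The sketch also assumes that parameter values arrive as strings from the template.

```python
def format(bfo, limit="200", suffix="..."):
    """
    Prints the beginning of the abstract of a record.

    @param limit maximum number of characters to print
    @param suffix text appended when the abstract is cut
    @see bfe_title.py, bfe_abstract.py
    """
    abstract = bfo.field('520.a')
    # Assumption: values set in a template are passed as strings.
    max_length = int(limit)
    if len(abstract) <= max_length:
        return abstract
    return abstract[:max_length] + suffix
```

In a format template such an element would be called as <BFE_ABSTRACT_EXCERPT limit="100" />, and the docstring lines would supply the text shown in the elements documentation.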

Typically you will need to access some fields of a record to display them as output. There are two ways to do this: you can access the bfo object given as parameter and use the provided (basic) accessors, or import a dedicated module and use its advanced functionalities.

Method 1: Use accessors of bfo:
bfo is an instance of the BibFormatObject class. The following methods are available:

get_record(): Returns the record of this BibFormatObject instance as a BibRecord structure. Allows advanced access on the structure using BibRecord.
control_field(tag): Returns the value of control field given by MARC tag.
field(tag): Returns the value of the field corresponding to MARC tag. If the value does not exist, returns an empty string.
fields(tag): Returns the list of values corresponding to MARC tag. If tag has an undefined subcode (such as 999C5), the function returns a list of dictionaries, whose keys are the subcodes and whose values are the values of tag.subcode. If the tag has a subcode, it simply returns the list of values corresponding to tag.
kb(kb, string, default=""): Returns the value of the string in the knowledge base kb. If kb does not exist or string does not exist in kb, returns default string.

Method 2: Use module BibRecord:
BibRecord is a module that provides advanced functionalities regarding access to the fields of a record. bfo.get_record() returns a structure that can be understood by BibRecord's functions. Therefore you can import the module's functions to get access to the fields you want.

Internationalization (i18n)

You can follow the standard internationalization procedure in use across CDS Invenio sources. For example the following code will get you the translation for "Welcome" (assuming "Welcome" has been translated):

 from invenio.messages import gettext_set_language
 
 ln = bfo.ln
 _ = gettext_set_language(ln)
 
 translated_welcome =  _("Welcome")

Notice the use of bfo.ln to get the current language of the user. For simpler translations, or for behaviour that depends on the language, you can simply check the value of bfo.ln and return your custom text.
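For the simpler case, a lookup keyed on bfo.ln is often enough. The following sketch assumes a hypothetical element with hand-written texts and an English fallback; the TEXTS table and its contents are illustrative only.

```python
# Hand-written texts per language code (illustrative values).
TEXTS = {
    "en": "Full text",
    "fr": "Texte complet",
    "de": "Volltext",
}


def format(bfo):
    """
    Prints a label for the full-text link in the language of the user.
    """
    # Fall back to English when no custom text exists for bfo.ln.
    return TEXTS.get(bfo.ln, TEXTS["en"])
```

This avoids the translation machinery entirely, at the cost of keeping the texts inside the element.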

Edit the Attributes of a Format Element

A format element has mainly four kinds of attributes:

Name: it corresponds to the filename of the element.
Description: the description is in the docstring of the format function (except lines prefixed with @param and @see).
Parameters descriptions: for each parameter of the format function, a line beginning with @param parameter_name and followed by the description of the parameter is present in the docstring of the format function.
Reference to other elements: one line beginning with @see and followed by a list of comma-separated format element filenames in the docstring of the format function provides a link to related elements.
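To make the mapping between docstring and attributes concrete, the sketch below extracts the four attributes from an element's format function. It is not BibFormat's own parser: describe_element is a hypothetical helper, and the regular expressions are an assumption about how such lines could be recognized.

```python
import re

PARAM_RE = re.compile(r"^\s*@param\s+(\w+)\s*:?\s*(.*)$")
SEE_RE = re.compile(r"^\s*@see\s+(.*)$")


def describe_element(filename, format_function):
    """Collect name, description, parameters and references of an element."""
    description, params, see_also = [], {}, []
    for line in (format_function.__doc__ or "").splitlines():
        param = PARAM_RE.match(line)
        see = SEE_RE.match(line)
        if param:
            params[param.group(1)] = param.group(2).strip()
        elif see:
            see_also = [name.strip() for name in see.group(1).split(",")]
        elif line.strip():
            # Every other non-empty line belongs to the description.
            description.append(line.strip())
    return {
        "name": filename,
        "description": " ".join(description),
        "params": params,
        "see_also": see_also,
    }


def format(bfo, separator=" "):
    """
    Prints the title of a record.

    @param separator separator between the different titles
    @see bfe_authors.py, bfe_abstract.py
    """
    return ""


info = describe_element("bfe_title.py", format)
print(info["description"])   # Prints the title of a record.
print(info["see_also"])      # ['bfe_authors.py', 'bfe_abstract.py']
```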

Check the Dependencies of a Format Element

There are two ways to check the dependencies of a format element. The simplest way is to go to the format elements documentation and click on "Dependencies of this element" for the element you want to check.

The second method to check the dependencies of an element is through regular unix tools: for example $ grep -r -i 'bfe_your_element_name' . inside the format templates directory will tell you which templates call your element.
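The same check can be scripted in Python. This is a minimal sketch under two assumptions: format templates carry the .bft extension, and elements are called with the <BFE_NAME ... /> syntax shown earlier. The function name templates_calling is hypothetical.

```python
import os
import re


def templates_calling(element_name, templates_dir):
    """Return the sorted names of .bft files that call the given element."""
    # Match "<BFE_NAME" followed by whitespace, "/" or ">", ignoring case.
    pattern = re.compile(r"<\s*" + re.escape(element_name) + r"(?=[\s/>])",
                         re.IGNORECASE)
    matches = []
    for root, _dirs, files in os.walk(templates_dir):
        for filename in files:
            if not filename.endswith(".bft"):
                continue
            with open(os.path.join(root, filename)) as template:
                if pattern.search(template.read()):
                    matches.append(filename)
    return sorted(matches)
```

Unlike the plain grep, which matches any occurrence of the name as a substring, this sketch does not report a template that only calls BFE_TITLE_BRIEF when you search for BFE_TITLE. Call it with the element name and the path of your format templates directory.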

Check the Validity of a Format Element

There are two ways to check the validity of an element. The simplest one is to go to the format elements documentation and click on "Correctness of this element" for the element you want to check.

The second method to check the validity of an element is through regular Python methods: you can for example import the element in the interactive interpreter and feed it with test parameters. Notice that you will need to build a BibFormatObject instance to pass as bfo parameter to the format function of your element.
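A quick check of this kind might look as follows. Because the constructor of BibFormatObject is not described in this guide, the sketch uses a hand-made stand-in (FakeBfo, hypothetical) that offers only the field accessor and the ln attribute; with a real installation you would pass a genuine BibFormatObject instead. The format function is inlined so that the example is self-contained; normally you would import it from your element's file.

```python
class FakeBfo(object):
    """Minimal stand-in for BibFormatObject, for testing only."""

    def __init__(self, values, ln="en"):
        self.values = values      # maps a MARC notation to a single value
        self.ln = ln

    def field(self, tag):
        # Like bfo.field, return an empty string when nothing is found.
        return self.values.get(tag, "")


def format(bfo, prefix="", suffix=""):
    """
    Prints the main title of a record.

    @param prefix text printed before the title
    @param suffix text printed after the title
    """
    title = bfo.field('245.a')
    if not title:
        return ""
    return prefix + title + suffix


def check_element(format_function):
    """Feed the element with test data and verify basic expectations."""
    with_title = FakeBfo({'245.a': "On Formatting"})
    without_title = FakeBfo({})
    assert format_function(with_title) == "On Formatting"
    assert format_function(with_title, prefix="<b>", suffix="</b>") == "<b>On Formatting</b>"
    assert format_function(without_title) == ""
    return True


print(check_element(format))   # True
```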

Browse the Format Elements Documentation

Go to the format elements documentation. There is a summary of all available format elements at the top of the page. You can click on an element to go to its detailed description in the second part of the page.

Each detailed documentation shows you:

A description of what the element does.
A list of all parameters you can use for this element.
For each parameter, a description and the default value used when the parameter is omitted.
A link to a tool to track the dependencies of your element.
A link to a tool to check the correctness of your element.
A link to a tool to test your element with custom parameters.

Preview a Format Element

You can play with a format element's parameters and see the result of the element directly in the format elements documentation: for each element, under the section "See also", click on "Test this element". You are redirected to a page where you can enter a value for the parameters. A description is associated with each parameter as well as an indication of the default value of the parameter if you do not provide a custom value. Click on the "Test!" button to see the result of the element with your parameters.

Knowledge Bases

Knowledge bases are a way to define easily extendable repositories of mappings. They have various uses, but their main purpose is to get, given a value, the normalized version of this value. For example you may use a knowledge base to hold a list of all ways to abbreviate the name of a journal, and map these abbreviations to the full journal name. This would be useful to get a normalized journal name across all of your records.

The knowledge base itself offers no method to do this normalization. It is limited to storing this knowledge. To benefit from the normalization you need to use a format element which is knowledge-base-aware. The element will look by itself into the knowledge base to format a record. In that way you can extend the formatting capabilities of this element without having to modify it.
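A knowledge-base-aware element could be as short as the following sketch. It relies only on the bfo.field and bfo.kb(kb, string, default) accessors described earlier in this guide; the knowledge base name "JOURNALS" and the choice of field 773.p for the journal name are assumptions made for the example.

```python
def format(bfo, kb_name="JOURNALS"):
    """
    Prints the normalized name of the journal of a record.

    @param kb_name name of the knowledge base holding the abbreviations
    """
    # Assumption: the journal name as entered is stored in 773 $p.
    raw_name = bfo.field('773.p')
    # Keep the raw value when the knowledge base has no mapping for it.
    return bfo.kb(kb_name, raw_name, raw_name)
```

Adding a new abbreviation to the knowledge base then changes the output of the element without any change to its code.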

Add a Knowledge Base

To add a knowledge base go to the Manage Knowledge Bases administration page. At the bottom of the page click on the "Add New Knowledge Base" button. The knowledge base has been created and you are asked to fill in its attributes. See Edit the Attributes of a Knowledge Base to learn more about the attributes of knowledge bases.

Remove a Knowledge Base

To remove a knowledge base go to the Manage Knowledge Bases administration page. Click on the "Delete" button facing the knowledge base you want to remove and confirm. The knowledge base and all the mappings it includes are removed.

Add a Mapping

Go to the Manage Knowledge Bases administration page and click on the knowledge base for which you want to add a mapping. Fill in the form of the "Add New Mapping" section on the left of the page with the new mapping, and click on "Add New Mapping". The mapping has been created. Alternatively you can create the mapping without its attributes, and fill them afterward (See Edit a Mapping).

Remove a Mapping

Go to the Manage Knowledge Bases administration page and click on the knowledge base for which you want to remove a mapping. Click on the "Delete" button facing the mapping you want to delete.

Edit a Mapping

Go to the Manage Knowledge Bases administration page and click on the knowledge base for which you want to edit a mapping. Locate the mapping in the list. You can click on the column headers to order the list by Map From or by Map To to help you find it. Once you have edited the mapping click on the corresponding "Save" button.

Edit the Attributes of a Knowledge Base

Go to the Manage Knowledge Bases administration page and click on the knowledge base you want to edit. In the top menu, click on "Knowledge Base Attributes". You can then give your knowledge base a name and a description. Finally click on the "Update Base Attributes" button.

Check the Dependencies of a Knowledge Base

To check the dependencies of a knowledge base go to the Manage Knowledge Bases page, click on the knowledge base you want to check, and then in the menu click on "Knowledge Base Dependencies".

The next page shows you the list of format elements that use this knowledge base.

Note that some format elements might be omitted.

MARC Notation in Formats

The notation for accessing the fields of a record is quite flexible. You can use a syntax that strictly follows MARC 21, but also a shortcut syntax, or a syntax that can have a special meaning.

The MARC syntax is the following one: tag[indicator1][indicator2] [$ subfield] where tag is 3 digits, indicator1 and indicator2 are 1 character each, and subfield is 1 letter.

For example to get access to an abstract you can use the MARC notation 520 $a. You can use this syntax in BibFormat. However you can also:

Omit any whitespace character (or use as many as you want)
Omit the $ character (or use as many as you want)
Omit or use both indicators. You cannot specify only one indicator. If you need to use only one, use underscore _ character for the other indicator.
Use percent % instead of any character to specify all ("don't care" or wildcard character) for that character.
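The rules above can be sketched as a small parser. This is an illustration and not the parser used by BibFormat: the function name parse_marc_notation is hypothetical, and dropping the "." separator is an assumption based on the examples in this guide (such as '245.a').

```python
import re


def parse_marc_notation(notation):
    """Split a field notation into tag, indicators and subfield code."""
    # Whitespace and "$" are optional; "." is also dropped because the
    # examples in this guide write e.g. '245.a' (an assumption of this sketch).
    compact = re.sub(r"[\s$.]", "", notation)
    tag, rest = compact[:3], compact[3:]
    if len(tag) < 3 or len(rest) > 3:
        raise ValueError("cannot parse %r" % notation)
    ind1 = ind2 = subfield = ""
    if len(rest) == 1:            # tag + subfield
        subfield = rest
    elif len(rest) == 2:          # tag + both indicators
        ind1, ind2 = rest[0], rest[1]
    elif len(rest) == 3:          # tag + both indicators + subfield
        ind1, ind2, subfield = rest[0], rest[1], rest[2]
    return {"tag": tag, "ind1": ind1, "ind2": ind2, "subfield": subfield}


print(parse_marc_notation("520 $a"))
print(parse_marc_notation("0248_a"))
print(parse_marc_notation("999C5"))
```

The sketch keeps "_" and "%" as they are written; interpreting them (blank indicator, wildcard) is left to whatever code consumes the result.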

Migrating from Previous BibFormat

The new Python BibFormat formats are not backward compatible with the previous formats. New concepts and capabilities have been introduced and some have been dropped. If you have not modified the "Formats", or have only slightly modified the "Behaviours" (or modified "Knowledge Bases"), then the transition will be painless and automatic. Otherwise you will have to manually rewrite some of the formats. This should however not be a big problem: firstly, the CDS Invenio installation will provide both versions of BibFormat for some time; secondly, both BibFormat versions can run side by side, so that you can migrate your formats while your server still works with the old formats; thirdly, we provide a migration kit that can help you go through this process; finally, the migration is not so difficult, and it will afterwards be much easier for you to customize how BibFormat formats your bibliographic data.

The first thing you should do is to read the Five Minutes Introduction to BibFormat to understand how the new BibFormat works. We also assume that you are familiar with the concepts of the old BibFormat. As the new formats separate the presentation from the business logic (i.e. the bindings to the database), it is not possible to automatically handle the translation. This is why you should at least be able to read and understand the formats that you want to migrate.

Differences between old and new BibFormat

The most noticeable differences are:

a) "Behaviours" have been renamed "Output formats".
b) "Formats" have been renamed "Format templates". They are now written in HTML.
c) "User defined functions" have been dropped.
d) "Extraction rules" have been dropped.
e) "Link rules" have been dropped.
f) "File formats" have been dropped.
g) "Format elements" have been introduced. They are written in Python, and can simulate c), d) and e).
h) Formats can be managed through web interface or through human-readable config files.
i) Introduction of tools like validator and dependencies checker.
j) Better support for multi-language formatting.

Some of the advantages are:

+ Management of formats is much clearer and easier (fewer concepts, more tools).
+ Writing formats is easier to learn: fewer concepts to learn, redesigned work-flow, use of existing well-known and well-documented languages.
+ Editing formats is easier: You can use your preferred HTML editor such as Emacs, Dreamweaver or Frontpage to modify templates, or any text editor for output formats and format elements. You can also use the simplified web administration interface.
+ Faster and more powerful templating system.
+ Separation of business logic (output formats, format elements) and presentation layer (format templates). This makes the management of formats simpler.

The disadvantages are:

- No backward compatibility with old formats.
- Stricter separation of business logic and presentation layer: no more use of statements such as if(), forall() inside templates, and this requires more work to put logic inside format elements.

Migrating behaviours to output formats

Behaviours were previously stored in the database and required the use of the evaluation language to provide the logic that chooses which format to use for a record. They also let you enrich records with some custom data. Now their use has been simplified and restricted to equivalence tests on the value of a field of the record to define the format template to use.

For example, the following behaviour:

CONDITIONS
0	$980.a="PICTURE"
Action (0)	"<record> <controlfield tag=\"001\">" $001 "</controlfield> <datafield tag=\"FMT\" ind1=\"\" ind2=\"\"> <subfield code=\"f\">hb</subfield> <subfield code=\"g\">" xml_text(format("PICTURE_HTML_BRIEF")) " </subfield> </datafield> </record>"

100	""=""
Action (0)	"<record> <controlfield tag=\"001\">" $001 "</controlfield> <datafield tag=\"FMT\" ind1=\"\" ind2=\"\"> <subfield code=\"f\">hb</subfield> <subfield code=\"g\">" xml_text(format("DEFAULT_HTML_BRIEF")) " </subfield> </datafield> </record>"

translates to the following output format (in textual configuration file):

 tag 980.a:
 PICTURE --- Picture_HTML_brief.bft
 default: Default_HTML_brief.bft

or visual representation through web interface:
Image representation of HB output format

We suggest that you use the migration kit to produce initial output formats from your behaviours, but that you go through the created .bfo files in the /etc/bibformat/output_formats/ directory of your CDS Invenio installation to check that they correspond to your behaviours.
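The decision made by an output format can be sketched in a few lines. The parser below handles only the three kinds of lines visible in the example above ("tag ...:", "VALUE --- template" and "default: template"); it is an illustration under that assumption, not the actual BibFormat implementation, and parse_output_format and choose_template are hypothetical names. Values are compared exactly, which is also an assumption.

```python
def parse_output_format(text):
    """Parse the textual form of an output format into rules and a default."""
    rules, default, current_tag = [], None, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.lower().startswith("tag "):
            # "tag 980.a:" introduces the field tested by the next rules.
            current_tag = line[4:].rstrip(":").strip()
        elif line.lower().startswith("default:"):
            default = line.split(":", 1)[1].strip()
        elif "---" in line:
            value, template = line.split("---", 1)
            rules.append((current_tag, value.strip(), template.strip()))
    return rules, default


def choose_template(rules, default, field_values):
    """Return the template of the first rule whose field has the given value."""
    for tag, value, template in rules:
        if field_values.get(tag, "") == value:
            return template
    return default


HB = """
tag 980.a:
PICTURE --- Picture_HTML_brief.bft
default: Default_HTML_brief.bft
"""

rules, default = parse_output_format(HB)
print(choose_template(rules, default, {"980.a": "PICTURE"}))  # Picture_HTML_brief.bft
print(choose_template(rules, default, {"980.a": "ARTICLE"}))  # Default_HTML_brief.bft
```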

Migrating formats to format templates and format elements

The migration of formats is the most difficult part of the migration. You will need to separate the presentation code (HTML) from the business code (iterations, tests and calls to the database). Here are some tips on how you can do this:

If you want to save the time of unescaping all HTML characters and understanding what the layout should look like, just go with your web browser to a formatted version of the format in your CDS Invenio installation, and copy the source of the web page. Identify the parts of the HTML code which are specific to the current record, and replace them with a call to the corresponding format element.
If you have made small modifications to the old default provided formats, we suggest that you use the new provided ones and modify them according to your needs.

We recommend that you do not use the migration kit for this part: it can help you create the initial files, but will never be able to provide a working implementation of the formats.

Migrating Knowledge Bases

We recommend that you use the migration kit to migrate your knowledge bases. It should have no problem migrating this part of your configuration.

Migrating UDFs and Link rules

User Defined Functions and Link rules have been dropped in the new BibFormat. These concepts are no longer needed, as they can be fully implemented in the format elements. For example the AUTHOR_SEARCH link rule can directly be implemented in the Authors.bfe element.

As for the UDFs, most of them are directly built-in functions of Python. Whenever a special function has to be implemented, it can be defined in a regular Python file and used in any element.

The Migration Kit

The migration kit is available from the main BibFormat admin webpage or directly here. The migration kit has 3 steps, each migrating some part of your configuration. Just click on the links to migrate each part and get the status of the migration.

You should note that each migration will create new files or entries in the database, such that you will certainly want to click only once on each step (otherwise you will get duplicates).

The migration kit can:
a) Effortlessly migrate your behaviours, unless they include complex logic, which usually they don't.
b) Help you migrate formats to format templates and format elements.
c) Effortlessly migrate your knowledge bases.

Point b) is the most difficult to achieve: previous formats did mix business logic and code for the presentation, and could use PHP functions. The new BibFormat separates business logic and presentation, and does not support PHP. The transition kit will try to move business logic to the format elements, and the presentation to the format templates. These files will be created for you, and include the original code and, if possible, a proposal of Python translation. We recommend that you do not use the transition kit to translate formats, especially if you have not modified default formats, or only modified default formats in some limited places. You will get cleaner code if you write format elements and format templates yourself.


Run old and new formats side by side

You might want to migrate your formats over a long period of time, making new formats available to your users once they have been migrated, while old formats are still being used if they have not been translated. BibFormat will do this almost automatically. This section tells you what you should be aware of if you want this to work seamlessly.

When BibFormat has to format a record with a given output format code, it first tries to find a corresponding output format in the (new) output formats directory. If the output format cannot be found, it hands the formatting process over to the old BibFormat, which will look for a behaviour with a name corresponding to code. This leads to the first rule you should follow:

For each of the Behaviours you want to migrate, you should have an Output Format with a code corresponding to the name of the Behaviour.

The second (and last) rule is as simple as the first one. Imagine you have a Behaviour "HD" that you want to migrate to Output Format "HD". Let's say that "HD" links to 'picture_HTML_detailed' format if field 980$a is equal to "Picture", and links to 'default_HTML_detailed' in all other cases, but that 'picture_HTML_detailed' has not been migrated to a new format template. The second rule then says:

Output Formats should have the same conditions on tags as Behaviours, even if format for that condition has not been migrated.

In our example, if you open the "HD" output format in the web interface, you can add a rule that works on condition "If 980$a is PICTURE" and set the template to be used to "defined in old BibFormat" in the template menu. This looks strange, but it is the only way to tell BibFormat that it should consider this condition and not go to the default rule and use the default template.

For developers and adventurers only:

If you are to write Output Formats without the web interface, you should use the name migration_in_progress for each template which has not been migrated. The above example would therefore become:
 tag 980.a :
 PICTURE --- migration_in_progress
 default: Default_HTML_detailed.bft
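The fallback between the two versions can be pictured as follows. The data structure and the function name choose_engine are hypothetical; only the lookup order and the migration_in_progress placeholder are taken from this section, and the exact comparison of values is an assumption.

```python
MIGRATION_PLACEHOLDER = "migration_in_progress"


def choose_engine(code, new_output_formats, field_values):
    """Return ("new", template) or ("old", None) for an output format code."""
    output_format = new_output_formats.get(code)
    if output_format is None:
        # No new output format with this code: hand over to the old BibFormat.
        return ("old", None)
    template = output_format["default"]
    for tag, value, rule_template in output_format["rules"]:
        if field_values.get(tag, "") == value:
            template = rule_template
            break
    if template == MIGRATION_PLACEHOLDER:
        # The condition matched, but its format has not been migrated yet.
        return ("old", None)
    return ("new", template)


formats = {
    "HD": {
        "rules": [("980.a", "PICTURE", MIGRATION_PLACEHOLDER)],
        "default": "Default_HTML_detailed.bft",
    },
}

print(choose_engine("HD", formats, {"980.a": "PICTURE"}))  # ('old', None)
print(choose_engine("HD", formats, {"980.a": "ARTICLE"}))  # ('new', 'Default_HTML_detailed.bft')
print(choose_engine("HX", formats, {"980.a": "ARTICLE"}))  # ('old', None)
```

Without the placeholder rule, a PICTURE record would fall through to the default and be formatted with the new default template, which is what the second rule is meant to prevent.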

FAQ

Why do we need output formats? Wouldn't format templates be sufficient?

As you potentially have a lot of records, it is not conceivable to specify for each of them which format template they should use. This is why this rule-based decision layer has been introduced.

How can I protect a format?

As a web user, you cannot protect a format. If you are administrator of the system and have access to the format files, you can simply use the permission rights of your system, as BibFormat is aware of it.

Why can't I edit/delete a format?

The format file has certainly been protected by the administrator of the server. You must ask the administrator to unprotect the file if you want to edit it.

How can I add a format element from the web interface?

Format elements cannot be added, removed or edited through the web interface. This limitation has been introduced to limit the security risks caused by the upload of Python files on the server. The only possibility to add a basic format element from the web interface is to add an entry in the "Logical Fields" management interface of the BibIndex module (see Add a Format Element).

Why are some MARC codes omitted in the "Check Dependencies" pages?

When you check the dependencies of a format, the page reminds you that some uses of MARC codes might not be indicated. This is because it is not possible (or at least not trivial) to guess that the call to field(str(5+4)+"80"+".a") is equivalent to a call to field("980.a"). You should therefore not rely completely on this indication.
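The limitation is easy to reproduce. The sketch below is a hypothetical, much simplified dependency checker that only recognizes calls to field or fields with a literal string argument; it is not the code used by BibFormat.

```python
import re

# Matches field('...') or fields("...") where the argument is a plain literal.
LITERAL_FIELD_CALL = re.compile(r"""\bfields?\(\s*(['"])(.*?)\1\s*\)""")


def literal_marc_codes(source_code):
    """Return the MARC codes passed as string literals to field()/fields()."""
    return [match.group(2) for match in LITERAL_FIELD_CALL.finditer(source_code)]


element_source = '''
title = bfo.field('245.a')
collection = bfo.field(str(5+4)+"80"+".a")
'''

print(literal_marc_codes(element_source))   # ['245.a']
```

The second call does read 980.a at run time, but a scanner working on the source text alone has no way to see it.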

How are deleted records displayed?

By default, CDS Invenio displays a standard "The record has been deleted." message for all output formats with a 'text/html' content type. Your output format, format templates and format elements are bypassed by the engine. However, for more advanced output formats, CDS Invenio goes through the regular formatting process and lets your formats do the job. This allows you to customize how a record should be displayed once it has been deleted.

Why are some format elements omitted in the "Knowledge Base Dependencies" page?

When you check the dependencies of a knowledge base, the page reminds you that format elements using this knowledge base might not be indicated. This is because it is not possible (or at least not trivial) to guess that the call to kb(e.upper()+"journal"+"s") in a format element is equivalent to a call to kb("Ejournals"). You should therefore not rely completely on this indication.

Why are some format elements defined in field table omitted in the format element documentation?

Some format elements defined in the "Logical Fields" management interface of the BibIndex module (the basic format elements) are not shown in the format elements documentation pages. We do not show such an element if its name starts with a number. This is to reduce the number of elements shown in the documentation, as the logical fields table contains many fields that are of little use in templates.

Old PHP BibFormat Administration Guide

1. Overview
2. Configuring BibFormat
3. Running BibFormat
       3.1 From Web interface
       3.2 From the command-line interface
4. Detailed Configuration Manual
       4.1 About BibFormat
       4.2 How it works?
       4.3 A first look at the web configuration interface
       4.4 Mapping the input (OAI Extraction Rules)
       4.5 Defining output types: Behaviors
       4.6 Formats
       4.7 Knowledge bases (KBs)
       4.8 User Defined Functions (UDFs)
       4.9 Defining links
             4.9.1 EXTERNAL link conditions
             4.9.2 INTERNAL link conditions
             4.9.3 Example
       4.10 User management
       4.11 Evaluation Language Reference

1. Overview

The BibFormat admin interface enables you to specify how the bibliographic data is presented to the end user in the search interface and search results pages. For example, you may specify that titles should be printed in bold font, the abstract in small italic, etc. Moreover, the BibFormat is not only a simple bibliographic data output formatter, but also an automated link constructor. For example, from the information on journal name and pages, it may automatically create links to publisher's site based on some configuration rules.

2. Configuring BibFormat

By default, a simple HTML format based on the most common fields (title, author, abstract, keywords, fulltext link, etc) is defined. You certainly want to define your own output formats in case you have a specific metadata structure.

Here is a short guide of what you can configure:

Behaviours
Define one or more output BibFormat behaviours. These are then passed as parameters to the BibFormat modules while executing formatting.
Example: You can tell BibFormat that it has to enrich the incoming metadata file by the created format, or that it only has to print the format out.
Extraction Rules
Define how the metadata tags from input are mapped into internal BibFormat variable names. The variable names can afterwards be used in formatting and linking rules.
Example: You can tell that 100 $a field should be mapped into $100.a internal variable that you could use later.
Link Rules
Define rules for automated creation of URI links from mapped internal variables.
Example: You can tell a rule how to create a link to People database out of the $100.a internal variable representing author's name. (The $100.a variable was mapped in the previous step, see the Extraction Rules.)
File Formats
Define file format types based on file extensions. This will be used when proposing various fulltext services.
Example: You can tell that *.pdf files will be treated as PDF files.
User Defined Functions (UDFs)
Define your own functions that you can reuse when creating your own output formats. This enables you to do complex formatting without ever touching the BibFormat core code.
Example: You can define a function how to match and extract email addresses out of a text file.
Formats
Define the output formats, i.e. how to create the output out of internal BibFormat variables that were extracted in a previous step. This is the functionality you would want to configure most of the time. It may reuse formats, user defined functions, knowledge bases, etc.
Example: You can tell that authors should be printed in italic, that if there are more than 10 authors only the first three should be printed, etc.
Knowledge Bases (KBs)
Define one or more knowledge bases that enable you to transform various forms of input data values into the unique standard form on the output.
Example: You can tell that Phys Rev D and Physical Review D are both the same journal and that these names should be standardized to Phys Rev : D.
Execution Test
Enables you to test your formats on your sample data file. Useful when debugging newly created formats.

To learn more on BibFormat configuration, you can consult the BibFormat Admin Guide.

3. Running BibFormat

3.1. From the Web interface

Run Reformat Records tool. This tool permits you to update stored formats for bibliographic records.
It should normally be used after configuring BibFormat's Behaviours and Formats. When these are ready, you can choose to rebuild formats for selected collections or you can manually enter a search query and the web interface will accomplish all necessary formatting steps.
Example: You can request Photo collections to have their HTML brief formats rebuilt, or you can reformat all the records written by Ellis.

3.2. From the command-line interface

Consider having an XML MARC data file that is to be uploaded into the CDS Invenio. (For example, it might have been harvested from other sources and processed via BibConvert.) Having configured BibFormat and its default output type behaviour, you would then run this file through BibFormat as follows:

 $ bibformat < /tmp/sample.xml > /tmp/sample_with_fmt.xml

that would create default HTML formats and would "enrich" the input XML data file by this format. (You would then continue the upload procedure by calling successively BibUpload and BibWords.)

Now consider a different situation. You would like to add a new possible format, say "HTML portfolio" and "HTML captions" in order to nicely format multiple photographs in one page. Let us suppose that these two formats are called hp and hc and are already loaded in the collection_format table. (TODO: describe how this is done via WebAdmin.) You would then proceed as follows: firstly, you would prepare the corresponding output behaviours called HP and HC (TODO: note the uppercase!) that would not enrich the input file but that would produce an XML file with only 001 and FMT tags. (This is in order not to update the bibliographic information but the formats only.) You would also prepare corresponding formats at the same time. Secondly, you would launch the formatting as follows:

 $ bibformat otype=HP,HC < /tmp/sample.xml > /tmp/sample_fmts_only.xml

that should give you an XML file containing only 001 and FMT tags. Finally, you would upload the formats:

 $ bibupload < /tmp/sample_fmts_only.xml

and that's it. The new formats should now appear in WebSearch.

4. Detailed Configuration Manual

What follows is a transcription of an old FlexElink Configuration Manual v0.3 (2002-07-31). The text suffers from missing screen snapshots, and the terminology may not be fully up-to-date at places.

4.1. About BibFormat

BibFormat is a piece of software that is part of the CDS Invenio (http://cdsweb.cern.ch).

Its mission, in a few words, is to provide a flexible mechanism to format the bibliographic records that are shown as a result of CDS Search user queries, allowing the administrators or users to customize how they are displayed. Besides, it offers the possibility of using a linking system that can automatically generate all the links included in the displayed records (fulltext access, electronic journals reference, etc), considerably reducing maintenance.

To clarify this too formal definition, we'll try to illustrate the role of BibFormat inside the CDS Search module by showing the following figure. Please, note that this drawing is trying to show the main role that BibFormat plays in the CDS structure and it's quite simplified, but of course the underlying logic is a bit more complex.

[Fig. 0]

As you can see, when a user query is received, Weblib determines which records from the database match it; then it asks BibFormat to format the obtained records. BibFormat looks at its rule repository and for each record determines which format has to be used, applies the format specification and resolves the possible links. It then gives all this (in a formatted way) back to Weblib, which makes a nice HTML page including the formatted results given by BibFormat among other info.

The good point in all this is that anyone that has access to BibFormat rule repository is able to modify the final appearance of a query result in the CDS Search module without altering the logic of the search engine.

In order to be able to modify this BibFormat rule repository, a web configuration interface is provided. Trough this paper, we'll try to explain (in a friendly way and form the user point of view) how to access this interface, how it's structured and how to configure BibFormat trough it to achieve desired results.

4.2. How it works?

We've outlined the role of BibFormat inside CDS, so it's now time to have an overview of how it works and how it's organized. We'll try not to be very technical; however, a few explanations about the BibFormat repository and architecture are needed to understand how it works.

BibFormat, basically, takes some bibliographic records as input and produces a formatted & linked version of them as output. By "formatted" we mean that BibFormat can produce an output containing a transformed version of the input data (normally an HTML view); the good part is that you can entirely specify the transformation to apply. At the same time, by "linked" we mean that you can ask BibFormat to include (if necessary) inside this formatted version references to some Internet resources that are related to the data from some pre-configured rules.

As an example, imagine that you want the records resulting from CDS Search queries to show their title in bold followed by their authors separated by commas. To achieve this you'll have to go to the BibFormat configuration interface and define a behavior for BibFormat in which you describe how to format incoming records:

 
   "<b>" $title "</b>"
   forall($author){
       $author separator(", ")
   }

Figure 1.- A very first Evaluation Language example
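
For readers who are more at home with a general-purpose language, the effect of Figure 1 can be approximated by the following minimal Python sketch. It is not BibFormat code: the function name format_record and the sample values are hypothetical, and the sketch only mirrors what the EL statement outputs (the title in bold, immediately followed by the authors joined with ", ").

```python
def format_record(title, authors):
    """Mimic Figure 1: bold title, then the authors separated by ", "."""
    # "<b>" $title "</b>" in EL is plain string concatenation.
    output = "<b>" + title + "</b>"
    # forall($author){ $author separator(", ") } prints every author and
    # puts the separator between consecutive authors only.
    output += ", ".join(authors)
    return output


example = format_record("A sample title", ["Racah, Giulio", "Guignard, G"])
print(example)
```

Note that, exactly as in the EL version, nothing is printed between the closing bold tag and the first author.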

Don't be scared! This is a first look at the way BibFormat allows you to describe formats. As you can see, BibFormat uses a special language that you'll have to learn if you want to be able to specify formats or links; it looks difficult (as much as a programming language), but you'll see that it's much easier than it seems at first sight.

The next figure shows how BibFormat works internally. When BibFormat is called, it receives a set of bibliographic records to format. It separates each record and translates it into a set of what we call "internal variables"; these are simply an internal representation of the bibliographic record, and the important thing is that they will be available when you describe the formats. Once it has these internal variables, the processor module looks in the behavior repository for the behavior (let's say format) you've asked BibFormat to apply (when BibFormat is called, you can indicate which of the pre-configured behaviors to apply; this allows it to have more than one behavior). Inside this behavior you can specify which data you want to appear, how it has to appear, some links if they exist... in other words, the format (actually, it's something more than a format: it describes how BibFormat has to behave for a given input, which is why we refer to it as a behavior). As we've already said, you can include links in a behavior specification; links are a special BibFormat feature that helps you reduce the maintenance of your formats, because the same link can be included in several formats or behaviors.

The picture below summarizes this explanation.

[Fig. 2]

To summarize, BibFormat can transform an input made up of bibliographic records into an HTML output (or any other text-based output) according to certain pre-configured specifications (behaviors) that you can entirely define using a certain language.

Currently BibFormat takes OAI MARC XML as the format for input records, but it could be adapted to other kinds of input (reading a database, function call, etc.) with a little development.

4.3. A first look at the web configuration interface

BibFormat can be configured through its configuration interface, which is accessible via the web. It's made up of a set of web pages that present the main configuration aspects of BibFormat and allow you to change them. In this section we are going to have a first look at this web interface, how it's structured and its correspondence with BibFormat features.

Before entering these web pages you'll be asked for your username and password. Only certain users are allowed to access the BibFormat web interface (WI): first you need a CDS account, which you can easily create using the standard CDS account manager; then you have to ask the BibFormat administrator to give you privileges to access the WI.

Once your password is accepted you'll access the configuration interface. You'll see that it is quite simple: it's structured in different sections, each of which corresponds to a BibFormat feature, and you can navigate through them by using a navigation bar that is always present on the left.

[Fig. 3]

Here is a list of the different sections the interface offers and their correspondence with BibFormat features:

Behaviors: This is the main section, the one you enter by default when you access the web interface. It contains definitions for the different pre-configured output types or behaviors that allow you to define how you want BibFormat to behave when each output type is selected. More information in chapter Defining output types: Behaviors of this manual.
OAI Extraction Rules: The input types and mapping rules for OAI MARC XML inputs are defined here. You'll find here the information about all the internal variables and their correspondence with the input XML tags. See chapter Mapping the input of this manual for more information.
Link Rules: Allows you to access the link rules repository for defining the way links are generated. See chapter Defining Links for a more detailed description about the BibFormat linking system.
UDFs: Presents a list of all the User Defined Functions (UDFs) that you can use inside the Evaluation Language (EL) statements that specify different configuration aspects. You'll also be able to modify or extend this list within this section. Everything about using UDFs and defining new ones is covered in chapter User Defined Functions (UDFs).
Formats: Another EL feature: you can define a certain piece of EL code under a name in order to re-use it whenever you want. See chapter Formats.
KBs: A complete management interface for Knowledge Bases (KBs); those KBs will also be available inside EL statements. See chapter Knowledge bases (KBs) for more specific information.
Execution Test: You'll be able to execute BibFormat from this section and view the results and some debug info in a web page. You have to specify an input data file (through a URL).
User management: Allows you to define which CDS users can access the BibFormat web interface.

Each section has its own particularities, but the way of dealing with them follows a common line throughout the interface. The common aspects and particular characteristics of each section are treated in the following chapters of this manual.

4.4. Mapping the input (OAI Extraction Rules)

We have already spoken a bit about BibFormat internal variables. These are key to understanding the way BibFormat works. As you know, BibFormat takes some bibliographic records as input and, according to some pre-configured behavior, formats them into HTML, for example. The problem is that these input records can come in several formats: different XML conventions, database records, etc. For now, at CDS we only consider input in OAI MARC XML, but in the near future we may have to extend it to accept other input formats.

That's the reason why internal variables exist: they provide a common way to refer to input data without relying on any concrete format. In other words, we define BibFormat links and behaviors in terms of these internal variables, and we have some rules that define how to map an input format to them, so any behavior defined in BibFormat can be used with any input that can be mapped to internal variables.

[Fig. 4]

You shouldn't worry too much about this because it belongs more to the development/administration side, but it's important to know where internal variables come from and what they refer to. Besides, for CDS we only consider incoming data in OAI MARC XML format, so we'll talk only about this case.

An internal variable is quite a simple concept: it's just a label that represents some values from the input. In addition, a variable can have fields, which are also labels that represent values from the input but that are related to one another under the variable (e.g. you can have a variable that maps authors and another that maps authors' home institutes independently; but if you want to represent an author together with his home institute, you need to relate these two values in some way). Variables and their fields also support multiple values.

Focusing on OAI MARC XML, the concept of variable and field is already present in the input structure:

Each occurrence of an OAI MARC XML varfield element corresponds to a different variable value.
Each occurrence of an OAI MARC XML subfield inside a certain varfield element corresponds to a different field value of the variable that maps the varfield.

So what we have in BibFormat is a set of rules that tell which varfield element each variable name corresponds to and which subfield element each variable field name maps. Through the web interface you'll be able to add or delete variables or their fields, and you'll even be able to modify the mapping tags of variables (this way you can keep your formats independent of changes in the meaning of MARC tags).

In the web interface, all this is located in the OAI Ext. Rules section, as you can see in the following figure:

[Fig. 5]

Let's illustrate how BibFormat maps a certain input to variables and fields with an example:

We have this variable & field definition in BibFormat:

Var. label	Mapping tag	Mult. values	Fields
100	<varfield id="100" i1="" i2="">	Yes
			Field label	Mapping tag
			a	<subfield label="a">
			e	<subfield label="e">
909C0	<varfield id="909" i1="C" i2="0">	No
			Field label	Mapping tag
			b	<subfield label="b">

And then a record like the following arrives as input:

 <oai_marc>
   <varfield id="037" i1="" i2="">
     <subfield label="a">SCAN-0009119</subfield>
   </varfield>
   <varfield id="100" i1="" i2="">
     <subfield label="a">Racah, Giulio</subfield>
   </varfield>
   <varfield id="100" i1="" i2="">
     <subfield label="a">Guignard, G</subfield>
     <subfield label="e">editor</subfield>
   </varfield>
   <varfield id="909" i1="C" i2="0">
     <subfield label="b">11</subfield>
   </varfield>
   <varfield id="909" i1="C" i2="0">
     <subfield label="b">12</subfield>
   </varfield>
 </oai_marc>

The result of the mapping would be like this:

Variable "100"
Value# 0	Field "a" value	Racah, Giulio
Value# 1	Field "a" value	Guignard, G
		Field "e" value	editor

Variable "909C0"
Value# 0	Field "b" value	12

Notice how varfield 037 is not considered because there is no entry for it in the BibFormat configuration. Also notice how the values are created: if "allow multiple values" is set to "Yes", each occurrence of a varfield element determines a new value (variable "100"); otherwise, the last occurrence is taken as the single value of the variable (variable "909C0").
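
The mapping just described can be reproduced with a short Python sketch. This is only an illustration under stated assumptions, not the actual BibFormat implementation: the RULES table layout and the function name map_record are hypothetical, and the record is the sample input shown above.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table mirroring the example: variable label ->
# (varfield attributes, multiple values allowed?, mapped subfield labels).
RULES = {
    "100": ({"id": "100", "i1": "", "i2": ""}, True, ["a", "e"]),
    "909C0": ({"id": "909", "i1": "C", "i2": "0"}, False, ["b"]),
}

RECORD = """<oai_marc>
  <varfield id="037" i1="" i2=""><subfield label="a">SCAN-0009119</subfield></varfield>
  <varfield id="100" i1="" i2=""><subfield label="a">Racah, Giulio</subfield></varfield>
  <varfield id="100" i1="" i2=""><subfield label="a">Guignard, G</subfield><subfield label="e">editor</subfield></varfield>
  <varfield id="909" i1="C" i2="0"><subfield label="b">11</subfield></varfield>
  <varfield id="909" i1="C" i2="0"><subfield label="b">12</subfield></varfield>
</oai_marc>"""


def map_record(xml_text, rules):
    """Map varfield/subfield elements to internal variables and fields."""
    variables = {}
    for varfield in ET.fromstring(xml_text).iter("varfield"):
        for label, (attrs, multiple, fields) in rules.items():
            # Skip rules whose mapping tag does not match this varfield.
            if any(varfield.get(k, "") != v for k, v in attrs.items()):
                continue
            value = {
                sub.get("label"): sub.text or ""
                for sub in varfield.iter("subfield")
                if sub.get("label") in fields
            }
            if multiple:
                variables.setdefault(label, []).append(value)
            else:
                variables[label] = [value]  # the last occurrence wins
    return variables


mapped = map_record(RECORD, RULES)
print(mapped)
```

Varfield 037 has no rule, so it never appears in the result, and variable "909C0" keeps only the value from its last occurrence.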

4.5. Defining output types: Behaviors

Now that we already know how internal variables are structured and what they represent in the input, it's time to have a look at how to configure BibFormat to transform that input data mapped into variables into HTML results (although any text-based output could be generated).

When BibFormat is asked to format a set of bibliographic records, it is also necessary to specify which output type it has to use. This output type is a string that identifies a pre-configured set of conditions and actions that tells BibFormat how to behave with the given input data (that's why the terms output type and behavior are used interchangeably throughout this document).

BibFormat can have several pre-configured behaviors each one identified by a different label. There are two different types of behaviors (you can choose the behavior type when you define it):

Normal: A behavior that outputs exactly the result of its evaluation.
Input Enrich (only for XML inputs): It echoes each XML record from the input, inserting the behavior result just before the closing XML element of the record.

Each behavior contains an ordered list of conditions; a condition can contain zero or more associated actions (actions are ordered inside a condition). A condition is a behavior item described by an Evaluation Language expression that gives as result "TRUE" or "FALSE". An action is an Evaluation Language (EL) statement that produces any output.

When BibFormat is called to format a set of input records with a given behavior label, it looks for the behavior's conditions. It evaluates their EL in order and, when one of them produces "TRUE" as result, it looks for the actions associated with that condition. BibFormat then evaluates the actions in the specified order and concatenates their results.

By using different conditions you can specify alternative formats inside a behavior (imagine that you want to format a record differently depending on its base number). It's true that you could also achieve this with EL IF statements, but conditions are clearer, more efficient and more re-usable: you can change one condition without touching the rest, or you can give it priority over the others (that is, the chance to be evaluated before them) by changing its apply order.

Actions are used for specifying the format itself, or whatever else you want to happen when the condition is satisfied.

Through the web interface you can define new output types or modify the ones that already exist. It is quite easy to use: next to each item, just select the link for the operation you want to perform on it.

[Fig. 6]

Let's have a look at a simple example to illustrate how to define a behavior that fits our needs:

Imagine a typical case where you want to format bibliographic records differently depending on their base number. Whenever a record from base 27 (standards) arrives, we want to show only its title and the standard numbers; otherwise a default format is applied in which the title and authors are shown. We'll assume CDS variable notation and that the input rules are defined properly.

We are going to define a new NORMAL behavior for this situation; let's call it SIMPLE. In it we'll need two conditions: one for applying the default format and another for the special base-27 one. The base number comes in variable 909C0.b, so the conditions are based on the content of this variable.

The resulting behavior should be defined like this:

SIMPLE(NORMAL)
10	$909C0.b="27"
	"<b>"$245.a"</b>" forall($0248.a){ rep_prefix(" - ") $0248.a separator("; ") }
50	""=""
	"<b>"$245.a"</b>" forall($100.a){ rep_prefix(" - Authors:") $100.a separator("; ") }

Some explanations on this example are needed:

As you can see, we have defined two conditions: one for the base-27 format and another for the default format. The important point is the order in which we put the conditions: for each record in the input, the special one is evaluated first (because it has a lower evaluation number, 10) and, if the condition is true, its format is applied; if the base is not 27, the default condition is evaluated and, because its condition EL code is always true, the default format is used for the record.
Don't worry too much about the action code because it's quite simple. There are some "strange" things like the use of the functions rep_prefix and separator. These are special UDFs that have a special behavior inside a FORALL statement:
- rep_prefix: Prints the string argument only in the first iteration of a FORALL. In other words, it puts a prefix in front of the string generated by the FORALL statement.
- separator: Prints the string argument in every FORALL iteration except the last one.
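
The evaluation of the SIMPLE behavior can be imitated in Python as follows. This is a sketch under assumptions rather than the real engine: a record is modelled as a plain dict from "variable.field" names to lists of values, the helper names forall and apply_behavior are hypothetical, and returning an empty string when no condition matches is an assumption.

```python
def forall(values, prefix="", separator=""):
    """Join values the way rep_prefix/separator act inside a FORALL."""
    if not values:
        return ""  # no iteration at all, so not even the prefix is printed
    return prefix + separator.join(values)


# Each entry: (evaluation order, condition, action). The order in the list
# does not matter; only the evaluation order number does.
SIMPLE = [
    (50, lambda r: True,
     lambda r: "<b>" + r["245.a"][0] + "</b>"
     + forall(r.get("100.a", []), " - Authors:", "; ")),
    (10, lambda r: r.get("909C0.b") == ["27"],
     lambda r: "<b>" + r["245.a"][0] + "</b>"
     + forall(r.get("0248.a", []), " - ", "; ")),
]


def apply_behavior(behavior, record):
    """Run the action of the first condition (lowest number) that is true."""
    for _, condition, action in sorted(behavior, key=lambda item: item[0]):
        if condition(record):
            return action(record)
    return ""


standard = {"909C0.b": ["27"], "245.a": ["Safety code"],
            "0248.a": ["ISO 1", "ISO 2"]}
paper = {"909C0.b": ["11"], "245.a": ["A paper"],
         "100.a": ["Racah, Giulio", "Guignard, G"]}
print(apply_behavior(SIMPLE, standard))
print(apply_behavior(SIMPLE, paper))
```

The record from base 27 is caught by condition 10, while any other record falls through to the always-true condition 50.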

4.6. Formats

Formats are a special construction that the BibFormat Evaluation Language (EL) offers. They allow you to group some EL code under an identifier and then call it from any EL statement.

You can manage these formats using the web interface. It is quite easy to do so: when you access the Formats section it presents a list of all the format identifiers that are already defined, together with a short description of what each format is for. From there you can see the whole EL code by using the link [Code]. You can add a new format by using the set of input boxes that you'll find at the end of the page. Delete and modify operations are also possible for already defined formats.

[Fig. 7]

Note: When defining formats, you have to be careful not to use "recursive" format calls (either direct or indirect); this can lead to execution problems. For example, imagine that we have a format called "ex_1" that calls itself:

Format "ex_1"

 "hello world" 
 format("ex_1")

This is a "direct" recursive call; you should never end up with this kind of call, because the web interface should warn you when it finds one. However, "indirect" calls are not detected by the web interface, so you have to take care of them yourself. One example of "indirect" recursion:

Format "ex_1"

 "hello world" 
 format("ex_2")

The recursion is indirect when format "ex_2" in turn contains a call to format("ex_1").
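
Since indirect recursion is not detected by the web interface, a small check can be run by hand. The following Python sketch is purely illustrative: the FORMATS repository, the content of "ex_2" (assumed here to call "ex_1" back) and the function find_recursion are all hypothetical, and the regular expression only recognises calls written exactly as format("name").

```python
import re

# Hypothetical format repository: identifier -> EL code.
FORMATS = {
    "ex_1": '"hello world" format("ex_2")',
    "ex_2": '"hello again" format("ex_1")',
    "plain": '"no calls here"',
}

CALL = re.compile(r'format\("([^"]+)"\)')


def find_recursion(name, formats, path=()):
    """Return the chain of calls that leads back to a format, or None."""
    if name in path:
        # We came back to a format that is already being expanded.
        return path[path.index(name):] + (name,)
    for callee in CALL.findall(formats.get(name, "")):
        cycle = find_recursion(callee, formats, path + (name,))
        if cycle:
            return cycle
    return None


print(find_recursion("ex_1", FORMATS))
print(find_recursion("plain", FORMATS))
```

A non-empty result lists the formats involved in the loop, which is the information needed to break it.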

4.7. Knowledge bases (KBs)

This is yet another special feature provided by the BibFormat Evaluation Language. In a few words, it allows you to map one string value to another according to a pre-stored set of key values that map to other values (the knowledge base). Every knowledge base is identified by a label that has to be unique among KB identifiers; remember that identifiers are not case-sensitive.

These sets of values used to live in files, but with this new development an easy KB management integrated in BibFormat was needed. For this reason, you can manage KBs from the BibFormat configuration interface, in the KBs section.

When you access the KBs section, the list of all defined KB identifiers is displayed. Below it you'll find a set of controls to add new KBs. These controls work as elsewhere in the interface, with one special point: normally you shouldn't fill in the input box that asks for the knowledge base table name. All the knowledge base data is handled by a database in which each KB corresponds to a DB table, and this input box takes the internal table name for that KB; the KB manager will normally generate it for you, so you shouldn't need to use it.

[Fig. 8]

Each KB has a link for accessing the list of values that it contains. If you click on it, a new window will show you the list of current values (key and mapped ones) and a very easy interface to add new values or to delete existing ones (KB values are case sensitive).

[Fig. 9]
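
Conceptually, a knowledge base behaves like a lookup table. The Python sketch below illustrates the two case rules mentioned in this section (identifiers are not case-sensitive, values are); the class KnowledgeBases, its methods, the sample data and the choice of returning an empty string for an unknown key are all assumptions made for the illustration.

```python
class KnowledgeBases:
    """Toy registry of knowledge bases keyed by identifier."""

    def __init__(self):
        self._kbs = {}

    def add(self, identifier, mapping):
        key = identifier.lower()  # identifiers are not case-sensitive
        if key in self._kbs:
            raise ValueError("knowledge base already defined: " + identifier)
        self._kbs[key] = dict(mapping)

    def lookup(self, identifier, value, default=""):
        # Values are compared exactly: KB values are case sensitive.
        return self._kbs[identifier.lower()].get(value, default)


kbs = KnowledgeBases()
kbs.add("JOURNALS", {"Phys. Lett. B": "PLB"})
print(kbs.lookup("journals", "Phys. Lett. B"))
```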

4.8. User Defined Functions (UDFs)

The use of User Defined Functions (UDFs) is one of the most powerful features of the BibFormat Evaluation Language (EL). The idea is that inside EL you can use operations or functions over strings. A large number of different string transformations are normally needed for formatting, but we cannot hope to implement all these operations inside EL because their number is constantly growing and new needs appear all the time. To deal with this problem, BibFormat defines a mechanism that allows you to define as many functions (UDFs) as you want and use them inside any EL statement.

These functions are identified by a unique name and receive the data on which they operate as parameters. They are defined in a programming language (PHP), and therefore a good knowledge of this language is needed.

BibFormat offers complete UDF management through the UDFs section of the web interface. There you'll see a complete list of all defined UDFs with their identifier, parameters and a short description of what the UDF does. You can also add, delete or modify UDFs, or even have a look at the PHP code of an already defined function (there you'll be able to launch small tests on the defined functions).

[Fig. 10]

The definition of these functions should be reserved to administrators and some particularities have to be taken into account when defining UDFs:

When you want to add or modify a UDF you are asked for the parameter list; you have to enter the parameter names separated by commas. For example, suppose you want to define a new function for prefixing a given string with another, so you need two parameters (one for the string which is going to be prefixed, let's name it str, and another one for the prefix itself, let's name it prefix); you should enter them in the parameter input box like this: prefix, str
The order in which you specify the parameters when defining a function is the order in which they have to be passed to the UDF from an EL statement.
When defining the PHP code of a function, there are some important things to consider:
- The result of a function has to be a string.
- The parameters are available inside the PHP code as variables with the parameter name.
- The result of the function has to be defined by a PHP result clause giving the resulting string.
- Make sure the PHP code is correct (BibFormat has no way of checking the code and won't tell you if it is wrong).
- There are some special variables available inside the PHP definition:
  - $FIRST_ITERATION: Equal to "1" in the first iteration of an EL FORALL statement and "0" otherwise. If the call is made outside a FORALL, it is set to "1".
  - $LAST_ITERATION: The counterpart for the end of the loop: equal to "1" in the last iteration of an EL FORALL statement and "0" otherwise.
  With these two variables you can define FORALL special functions like a function to print a separator.
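
To show how these two flags are enough to build FORALL-aware functions, here is a Python sketch (the real UDFs are written in PHP). The explicit first_iteration and last_iteration arguments stand in for the special variables $FIRST_ITERATION and $LAST_ITERATION, and the forall driver is a hypothetical stand-in for the EL FORALL statement.

```python
def rep_prefix(text, first_iteration, last_iteration):
    """Print text only in the first iteration."""
    return text if first_iteration == "1" else ""


def separator(text, first_iteration, last_iteration):
    """Print text in every iteration except the last one."""
    return "" if last_iteration == "1" else text


def forall(values, body):
    """Call body once per value, passing the two iteration flags."""
    pieces = []
    for index, value in enumerate(values):
        first = "1" if index == 0 else "0"
        last = "1" if index == len(values) - 1 else "0"
        pieces.append(body(value, first, last))
    return "".join(pieces)


authors = forall(
    ["Racah, Giulio", "Guignard, G"],
    lambda value, first, last: rep_prefix(" - Authors:", first, last)
    + value
    + separator("; ", first, last),
)
print(authors)
```

With a single value both flags are "1" in the same iteration, so the prefix is printed and the separator is not.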

4.9. Defining links

As we've already said, BibFormat is not only a formatter: it also provides a link manager. But what do we mean by "link manager"? The idea is to have a set of rules that describe how to generate a link using certain data. If the link can be generated from those rules, the link manager can then check different things (e.g. whether the link is valid or, if it's a link to a file, whether the file exists and in which formats) and finally return the solved link. In other words, if you have a set of bibliographic records that can contain a certain link, and that link can be coded in the link manager rules, you don't need to store the link in each bibliographic record; you just use the link manager to generate the links dynamically. This way you only have to maintain a small set of rules and not thousands of static links in records.

BibFormat allows you to configure different link definitions, each identified by a unique name. Each link definition has some associated parameters, which are the information passed to the rules defined for it. When you call the link manager to solve a link (from an EL statement, for example), you have to specify the identifier of the link definition you want to use and the value of each of its parameters (always string values). The link manager retrieves the rules associated with the specified link definition and interprets them using the given parameter values, telling you whether the link was generated correctly and, if so, the result (the solved link).

Through the web interface you can access the rule repository to see which link definitions are available, define new link rules or maintain already defined ones. When adding or modifying a link definition you'll have to specify its parameters; please remember to separate them with commas.

[Fig. 10]

Link definitions are structurally quite similar to behaviors: although there can be different types of them (as we'll see later), a link definition is made up of one or more conditions, and each of these conditions can have one or more actions that tell how the link has to be built if the condition is satisfied. In general, link rules (both conditions and actions) have a particular structure and are described in Evaluation Language (EL) with one restriction: the EL LINK statement cannot be used. Each group of conditions and actions of a link definition can be of a different solving type. (When you create a new link definition, you are asked for its solving type; this is only because all conditions created for that link definition will have the selected solving type by default, but you can change it afterwards and so obtain a "mixed" link definition.) The structure of the rules and the way the link manager interprets them depend on their solving type. Currently, you can define link conditions of two different solving types: EXTERNAL or INTERNAL. A more detailed explanation of each type is given later.

As we've said, a link definition is made up of various link conditions. When the solving of a particular link definition is requested, the link manager retrieves all the link conditions associated with it. It then takes the first of them (following the evaluation order: the lower the evaluation order number, the earlier the condition is considered) and evaluates its EL code with the parameter values passed; if the result is "TRUE", the associated actions are executed, the link is returned and the solving process finishes. If a condition fails, the link manager moves on to the next one. If all the conditions fail, the link manager reports that the link couldn't be solved. This is the general behavior of the link manager, but the way of determining whether a link has been solved, and the way the link is built, depend on the condition solving type.

4.9.1. EXTERNAL link conditions

This is the simplest way of solving links. It's intended to be used when you want to generate a link that points to an external resource (normally a web page). In this case the link condition consists of only one action, which is evaluated if the associated condition is "TRUE". When a condition of this type evaluates to "TRUE" and the action is executed, the result of the action is given as the solved link and the link manager finishes.
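
The solving process for EXTERNAL conditions can be summarised with the following Python sketch. It is only a model of the description above: the function solve_external, the JOURNAL link definition, its "issn" parameter and the example.org URLs are hypothetical, and None stands for "the link couldn't be solved".

```python
def solve_external(conditions, params):
    """Return the first link whose condition holds, or None if none does."""
    # Conditions are tried by increasing evaluation order number.
    for _, condition, action in sorted(conditions, key=lambda c: c[0]):
        if condition(params):
            # EXTERNAL conditions have a single action: its result is the link.
            return action(params)
    return None


# Hypothetical link definition with one parameter, "issn".
JOURNAL = [
    (20, lambda p: True,
     lambda p: "http://example.org/search?issn=" + p["issn"]),
    (10, lambda p: p["issn"].startswith("0370"),
     lambda p: "http://example.org/plb/" + p["issn"]),
]

print(solve_external(JOURNAL, {"issn": "0370-2693"}))
print(solve_external(JOURNAL, {"issn": "1234-5678"}))
```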

[Fig. 11]

4.9.2. INTERNAL link conditions

This condition solving type is intended to be used when you want to link to a document which is a file (inside or outside your file system) and that can be in different file formats.

This case is a bit more complex than the previous one, so we'll go step-by-step explaining differences and special features:

An INTERNAL condition has an associated base file path and base URL. The base file path is the string used as a prefix when looking for a file generated by the actions associated with that condition. The base URL, on the other hand, is a string to which the link string (resulting from the actions) is appended (e.g. if the base file path of a condition is /tmp/docs and the base URL is http://doc.cern.ch/, and the condition is true and the result of the actions is test.pdf, then the file path the link manager has to check is /tmp/docs/test.pdf and, if the file exists, the generated link is http://doc.cern.ch/test.pdf).
Any condition of this type can have several associated file formats. This is a new concept that is only used for INTERNAL condition solving. A file format is simply a set of file extensions that are grouped under an identifier. You can then associate a file format identifier with a link condition. When the condition is true, the link manager combines each result from the condition actions with the associated file formats to check the existence of a file of any format; this means that when an action is evaluated, the link manager takes the file extensions of each associated file format identifier and checks whether the file base path + resulting action string + file extension exists in the file system.
One condition of this type can have more than one associated action. Each of its actions describes an alternative way of building the file path. When a condition of this type evaluates to "TRUE", the link manager retrieves its actions (following the actions' apply order) and evaluates the first one; with the action result it builds the file path in this way: file base path + resulting action string, and then combines this string with each of the file extensions. If any of the combinations exists in the file system, the link is generated (if more than one file format combination exists, the link variable will have multiple values containing the different links); if not, it starts the same process with the next action. If none of the actions leads to an existing file, the link is not generated.
When calling the link manager from an EL statement (see chapter Evaluation Language Reference), if the link is solved we'll be able to access a special internal variable that contains the resulting link as its value. For INTERNAL condition links, we have said that this variable can contain multiple values if the link manager finds different file formats. In this case there is a further extension: some special variable fields hold additional information for each value in the LINK variable, and you can access them when the link is solved. Here's a table detailing the different LINK variable fields that are defined when an INTERNAL condition link is solved:

Field name	Value that contains
url	The same value as the LINK variable: The solved URL.
file	Contains the local full path to the file the solved URL points to.
format_id	Contains the file format id string
format_desc	Contains the file format description string (this is defined for each file format)
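
Putting the INTERNAL rules together, the solving of one condition can be sketched in Python as below. This is a model built from the description only, not the link manager's code: solve_internal and its arguments are hypothetical, file formats are represented as identifier -> (description, list of extensions), and it is assumed that extensions include the leading dot and that the path and URL are joined with a single slash.

```python
import os


def solve_internal(base_path, base_url, actions, file_formats, params):
    """Try each action in order; return the LINK values of the first hit."""
    for action in actions:
        relative = action(params)  # resulting action string
        links = []
        for format_id, (description, extensions) in file_formats.items():
            for extension in extensions:
                # file base path + resulting action string + file extension
                candidate = os.path.join(base_path, relative + extension)
                if os.path.isfile(candidate):
                    links.append({
                        "url": base_url.rstrip("/") + "/" + relative + extension,
                        "file": candidate,
                        "format_id": format_id,
                        "format_desc": description,
                    })
        if links:
            return links  # one value per existing file format
    return []  # no action led to an existing file: link not generated
```

Each dictionary in the returned list corresponds to one value of the LINK variable, with the four fields listed in the table above.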

4.9.3. Example

As link generation is quite a complex topic (especially when talking about INTERNAL linking), we'll try to illustrate it with a simple example.

Let's imagine we want to create a new link definition for generating full-text access to the documents that are archived on a document server (a file system which contains the electronic versions of documents). These documents are organized systematically according to three characteristics that are included in the bibliographic records: BASE, CATEGORY and ID. When the base corresponds to "CERNREP", the files are archived below directory /pub/www/home/cernrep/ and can be stored following two different criteria that depend on the CATEGORY and ID values; the documents are all HTML. However, if the base is "PREPRINT" and the CATEGORY is either "HEP-TH" or "HEP-PH", they are stored under directory /archive/electronic|/pub/www/home/ following a certain criterion; in this case the documents can be in several file formats: PDF, Postscript, MS Word.

Of course, we want only the link to be created if the files corresponding to the bibliographic records exist.

So we start by creating a new link definition that we'll call FULLTEXT. It will receive three parameters, which are the information we need for generating this kind of link: BASE, CATEGORY and ID. We select INTERNAL as the default solving type and then fill in the base file path and URL with some default values (these values are not important; they will be copied by default to the conditions we are going to create afterwards).

[Fig. 12]

Then we create a condition for the first possibility: when BASE is "CERNREP". We select INTERNAL as the link condition because we want to link to a file and check that it exists, and we fill in the base file path and URL with the corresponding values. Then we assign the file format types and enter the file archiving criteria as different actions.

[Fig. 13]

For the other possibility we proceed in the same way by adapting the definition to the requirements; we'll have something like this as result:

[Fig. 14]

Once we have finished the link definition, we can insert links of this type from a BibFormat behavior, for example. Let's imagine we have included a piece of EL code like this in a behavior because we want to insert a link to the full-text documents of any record:

link("FULLTEXT",$base, $category, $id) { "Fulltext: " forall($link){ "<a href=\"" $link.url "\">" $link.format_desc "</a>" separator " - " } }

This EL statement will output the string "Fulltext: " followed by one link for each document found for the values of the internal variables $base, $category and $id, with the links separated by " - ".
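To make the result more concrete, here is a minimal Python sketch (not part of BibFormat) of what this statement produces. It assumes the link solver has already returned the solved links, modelled here as hypothetical dictionaries carrying the url and format_desc fields described above; the function name render_fulltext and the example URLs are ours.

```python
def render_fulltext(links):
    """Mimic the EL statement: a prefix, then one anchor per solved link."""
    if not links:
        # The link could not be solved, so the EL body is skipped entirely.
        return ""
    anchors = [
        '<a href="%s">%s</a>' % (link["url"], link["format_desc"])
        for link in links
    ]
    # forall(...) { ... separator " - " } joins the anchors with " - ".
    return "Fulltext: " + " - ".join(anchors)


links = [
    {"url": "http://example.org/doc.pdf", "format_desc": "PDF"},
    {"url": "http://example.org/doc.ps", "format_desc": "Postscript"},
]
print(render_fulltext(links))
```

Like the EL statement, the sketch writes the values as they are and does no HTML escaping.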

4.10. User management

The BibFormat web interface (WI) comes with a security mechanism that lets you define which users can access it. BibFormat has no user management of its own; instead it uses the CDS user schema (as it is part of CDS). So if you are not registered as a CDS user and you want access to the BibFormat WI, the first thing to do is to register in CDS through the standard procedure (for example, you can reach the CDS account management system via the CDS Search interface).

The BibFormat WI access policy is rather simple: it keeps a list of CDS users that can access the WI. If someone tries to access any part of the WI, the system asks the user to identify themselves as a CDS user. If the CDS login is successful and the user is in BibFormat's access list, the user gains access to the WI.

A section of the WI lets you define which CDS users have access to it. Its use is rather simple: you can add CDS users to the access list by specifying either their CDS user id or their CDS login, and you can delete a CDS user from the access list by selecting the "delete" link for the corresponding user.

[Fig. 15]

When you install BibFormat for the first time and access the WI, you'll see that no login or password is requested. The security mechanism is not activated until at least one user is added to BibFormat's access list. So if you don't want to limit access to the BibFormat WI, keep the access list empty.
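The access policy described in this section can be summarized with a small Python sketch. This is only an illustration under our own assumptions: the function name can_access_wi and the representation of the access list as a set of CDS logins are hypothetical and do not reflect how BibFormat stores the list.

```python
def can_access_wi(access_list, user_login=None):
    """Return True when the user may open the BibFormat web interface."""
    # An empty access list means the security mechanism is not active yet.
    if not access_list:
        return True
    # Otherwise the user must have logged in and must be present in the list.
    return user_login is not None and user_login in access_list


print(can_access_wi(set()))                 # nobody registered: open access
print(can_access_wi({"jdoe"}, "jdoe"))      # listed user
print(can_access_wi({"jdoe"}, "visitor"))   # logged in but not listed
```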

4.11. Evaluation Language Reference

In this section we present a more or less formal definition of the Evaluation Language (EL). Although we use some formal methods to describe it, we also give a quick explanation of the elements that make up the language and of how to combine them to achieve the desired results.

Just below you can find the EL definition, expressed in EBNF (Extended Backus-Naur Form) notation. We use capital letters for non-terminal elements and lower-case/bold characters for terminal ones. One remark: whenever you find the mark [REX] after a definition, it means that the definition just before it is written as a regular expression.

SENTENCE ::= TERM {&& TERM | || TERM}
TERM ::= FACTOR {= FACTOR | != FACTOR | FACTOR}
FACTOR ::= [!] BASIC
BASIC ::= VARIABLE | LITERAL | FUNCTION | ( SENTENCE ) | FORALL | IF | FORMAT | LINK | COUNT | KB
VARIABLE ::= $ STRING [. STRING]
LITERAL ::= "([^"] | \")*" [REX]
FUNCTION ::= STRING ( [ SENTENCE {, SENTENCE} ] )
FORALL ::= forall ( VARIABLE [, LITERAL] ) { SENTENCE }
IF ::= if( SENTENCE ) { SENTENCE } [else { SENTENCE }]
FORMAT ::= format( SENTENCE )
LINK ::= link( SENTENCE , [SENTENCE {, SENTENCE}] ) { SENTENCE } [else { SENTENCE }]
COUNT ::= count( VARIABLE )
KB ::= kb( SENTENCE )
STRING ::= [a-zA-Z0-9_] [REX]

This is just a formal way of describing the language, but don't worry if you don't understand it very well because just below these lines we'll try to describe it in a more informal way.

To begin with, you should know that EL is a language designed to work with strings (a string is a sequence of characters), but it also has some logic and comparison operations. One important thing to be aware of is that in EL blank spaces, tabs and carriage returns only serve as separators between elements of the language; that means that between two basic elements you can have as many spaces or carriage returns as you want.

One of the basic elements of the language is what we call LITERALS. They represent constant string values and are delimited by a pair of double quote (") symbols surrounding the string you want to express. Everything you put inside the double quotes is taken as it is, so inside a literal spaces and carriage returns do have meaning (it's the only such case). If you want to express a double quote symbol inside a literal, you have to escape it using \.

Some examples of literals:

If you want to represent the string hello, inside the EL you'll have to use "hello".
For the string hello "big" man, the representation in EL is "hello \"big\" man" (notice the escape characters and that spaces have meaning).
The string Let's see \"" has to be expressed as "Let's see \\\"\"" (note that the backslash itself is escaped as well).
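The escaping rule shown by these examples can be sketched in a few lines of Python. The helper name to_el_literal is ours, and escaping the backslash itself is inferred from the third example rather than from the grammar, so take it as an assumption.

```python
def to_el_literal(text):
    """Wrap a plain string in double quotes, escaping it for use in EL."""
    # Escape backslashes first, so the backslashes added for the quotes
    # are not doubled afterwards.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


print(to_el_literal("hello"))
print(to_el_literal('hello "big" man'))
print(to_el_literal("Let's see " + "\\" + '""'))
```

The three calls correspond to the three examples of the list above.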

Another important basic element of the language is VARIABLES. These elements represent string data from the input that you can refer to inside the language (a variable is also treated as a string). Variables are defined in advance by the administrator (or even by users), so you have to know which of them you have access to. Additionally, variables can contain FIELDS, which are simply other input values grouped under a variable because they are related to one another (for example, you could have a variable for the information about the author with fields like name, birthplace, etc.). If you want to know more about variables and their correspondence with the input, you can look at the Mapping the Input section. A variable is expressed in EL by a dollar symbol followed by any combination of letters, numbers or underscores; variables are case-insensitive. To refer to a field of a variable, you simply add a dot followed by the field name (which is also made up of letters, numbers or underscores).

Some examples about variables and fields:

Imagine you have a variable called author that contains the author information; to refer to it in EL you would write $author. Wherever $author appears, BibFormat will substitute the value defined for it from the current input record.
If you know that the field name of the variable author contains the author's full name and you want to refer to it inside an EL statement, you'd write $author.name.
In the CDS configuration, variable and field names correspond to MARC 21 tag & indicator names; so to refer to the main title of a bibliographic record we use variable 245, field b, which in EL terms is $245.b.
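As a rough mental model, variable and field references can be pictured as lookups in a nested dictionary. The Python sketch below is only that, a mental model: the record layout, the resolve function and the sample values are hypothetical, and lower-casing field names (not just variable names) is our assumption. It returns the first value when several exist and an empty string when nothing is mapped.

```python
def resolve(record, reference):
    """Return the first value for an EL reference like "$100.a", or ""."""
    # Variables are case-insensitive, so everything is lower-cased here.
    name, _, field = reference.lstrip("$").lower().partition(".")
    values = record.get(name, {}).get(field, [])
    return values[0] if values else ""


# Hypothetical record: variable -> field -> list of values.
# The empty field name "" holds the value of the variable itself.
record = {
    "author": {"": ["Ellis, J"], "name": ["John Ellis"]},
    "245": {"b": ["A sample title"]},
    "100": {"a": ["Ellis, J", "Smith, A"]},
}

print(resolve(record, "$author"))
print(resolve(record, "$AUTHOR.name"))
print(resolve(record, "$245.b"))
```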

Now that we know the basic elements of the language, we can start thinking about how to combine them. The most important (and only) string operation is concatenation: joining strings. This operation is implicit in the language, so we just write language elements one after another, and the result is the values of those elements one after another.

Some samples:

To represent the constant string Author: followed by the name of the author of the input record you should write "Author: " $100.a (this assumes the CDS configuration, in which MARC 21 notation is used and authors correspond to variable 100, field a).
To output the title in bold (in HTML terms) followed by the author in normal characters, separated from the title by the character /, write: "<b>" $245.b "</b>/" $100.a

These two, literals and variables, are the only basic elements of the EL. You can combine them using concatenation to get new strings. But, of course, there are more operations you can apply to strings: UDFs (User Defined Functions). We'll also refer to these elements simply as functions, because that is what they are: operations applied to strings, where a string can be a basic element or the result of applying any other operation. A UDF has a name that identifies it uniquely and takes some information that we call parameters. A UDF returns another string that depends on the parameter values (always strings). So to represent a function in EL you need its name followed by an opening parenthesis, the parameter values separated by commas, and a closing parenthesis. There's a list of UDFs you can look at through the interface, and this list can be extended to fit your needs (see the UDFs section of this manual).

Some examples:

You want to ensure that the title of a bibliographic record always appears in capital letters. There's a function called upper that takes one parameter and returns it converted to capital letters. You write the call like this: upper($245.b).
You want only the first 3 characters of an author name to appear in capital letters. We've seen there's a function for uppercasing a string; there's another one, called copy, that extracts a substring from the string passed as first parameter, starting at the character position given by the second parameter and with the length given by the third one: copy( upper($100.a), "0", "3").
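For readers who think more easily in a general-purpose language, the two functions used above can be approximated in Python as follows. These are our own stand-ins written from the descriptions in the examples, not the actual UDF implementations; in particular, treating position "0" as the first character follows the copy example.

```python
def upper(text):
    """Stand-in for the EL function upper."""
    return text.upper()


def copy(text, start, length):
    """Stand-in for the EL function copy: substring of `length` chars from `start`."""
    # EL passes every parameter as a string, so convert explicitly.
    begin = int(start)
    return text[begin:begin + int(length)]


# Equivalent of copy( upper($100.a), "0", "3") for a sample author name.
print(copy(upper("Ellis, John"), "0", "3"))  # prints ELL
```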

As you can see, these UDFs are very powerful, because you can concatenate their result with another element (literal, variable or even function) and the parameters can be basic elements or expressions. More generally, any element or expression of the EL that yields a string value can be combined with other EL expressions or elements.

Another very useful feature of EL is the possibility of using KNOWLEDGE BASES (KBs). A KB is just a set of key values that map (one-to-one) to another set of values; maybe "knowledge base" isn't a very appropriate name, because they are more like translation tables. BibFormat offers tools to create and maintain KBs that can be used in the EL afterwards (see the chapter KBs management in this manual). You can see a KB invocation as a special function (the syntax for calling it is the same) named kb that takes two parameters: one indicating the KB name (BibFormat can handle several KBs) and another one for the key value to translate. The result is the mapped KB value, or an empty string if the key doesn't exist in the specified KB. A typical example is when you have months as numbers and you want to translate them into month names; you could have a KB that maps all the month numbers to month names and then call it like this: kb("MONTH", $m).
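Seen as a translation table, a KB lookup behaves much like a dictionary lookup with an empty-string default. The Python sketch below illustrates this; the KNOWLEDGE_BASES structure and its contents are made up for the example, and returning an empty string for an unknown KB name is our assumption (the text only specifies the behavior for a missing key).

```python
# Hypothetical in-memory knowledge bases: KB name -> {key: mapped value}.
KNOWLEDGE_BASES = {
    "MONTH": {"1": "January", "2": "February", "12": "December"},
}


def kb(kb_name, key):
    """Return the value mapped to `key` in the named KB, or "" if absent."""
    return KNOWLEDGE_BASES.get(kb_name, {}).get(key, "")


print(kb("MONTH", "2"))          # prints February
print(repr(kb("MONTH", "13")))   # prints '' because the key is not in the KB
```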

Now let's move on to FORMATS. A format is a piece of EL code grouped under a label (a name) that can be used in any other EL statement. BibFormat allows users to define as many formats as they want and to identify each of them with a simple name. In short, formats allow you to reuse EL code; within a format you can put any EL code (even other format calls) and all the variable values are fully available. Again, a format call in EL follows the same convention as functions: the word format followed by the format name (a string) between parentheses. Calling a format works as if the EL code defined inside that format were pasted, as it is, at the place where you make the call.

Example: Imagine you have to write the title of a bibliographic record with a certain formatting, let's say in bold and red, and you are going to use this formatted title in several places. You can take advantage of EL formats and define a format called TITLE that contains the code "<font color=\"red\"><b>" $245.b "</b></font>". Once this is done, you can use it to format records by printing their title in that way with their author after it: format("TITLE") "/" $100.a. The good thing is that if some day you decide to change the title formatting, you only need to modify the TITLE format definition and not all the places where you show the title.

At this point, you have seen the basic elements and operations of EL. You may think they are powerful enough to express your formatting work, but there are more complex situations that you'll have to face. We have tried to design the EL to be easy to use, but the advanced structures described next can sometimes become a bit complex.

The basic elements and operations are fine, but sometimes you want to compare expressions and decide what to do depending on the result of the comparison. For this purpose, EL has an IF statement and a few comparison and logic operators built in (don't forget that any other functionality you need can be achieved by defining new UDFs; EL only provides the basic operations that make this possible). Let's go step by step. First, the set of operators that can be used in a comparison:

Comparison operators: equal and not equal (=, !=). They take two operands, which have to be strings, and produce a logical (true or false) value.
Logical operators: AND, OR and NOT (&&, ||, !). All of them operate on logical values; AND and OR take two operands, and NOT takes one.

All of them are right-associative (except NOT, which is unary and left-associative) and their precedence, from highest to lowest, is: NOT, (EQUAL, NON-EQUAL), (AND, OR). These operators cannot be used just anywhere, only inside statements that expect a logical value as a result, in other words, inside condition statements.

The IF structure is quite easy to learn: first we write the word IF followed by a condition statement surrounded by parentheses; then an EL statement enclosed in braces, which will be executed only if the condition is true; optionally, we can add the word ELSE followed by another EL statement in braces, which will be executed only if the IF condition is not true.

Let's have a look at some examples:

I want the title of a record to appear, followed by the constant Author: and then its author. But it would be nice if the constant string appeared only when the record has an author:
```
 format("TITLE") if($100.a!="") { "Author: " $100.a }
 
```

BibFormat is not only an EL processor. Among other things, it contains a link solver that has its own rule repository in order to solve links automatically (see the chapter Link solver of this manual). EL has one special structure for asking the link solver for links and including them in the formatted version of the bibliographic record. This way links are easy to maintain (you modify the rules independently of where the link is being used) and as reusable as formats or UDFs. Links are identified by a label and need some information to be passed as parameters. An EL statement then has to be specified, which takes effect only if the link is solved and inside which you have access to a special variable, named LINK, that contains the solved link among other information (see the chapter Link solver for more information about which values are accessible). Additionally, an else statement can be added (following the same syntax as in the IF construction), which takes effect only if the link can't be solved by the link solver.

Example:

Let's take our usual example of the simple format that contains the title and the author, but now we want the author to be linked to the search. Supposing that this kind of link is already defined under the label "AUTHOR_SEARCH", we proceed like this:
```
 format("TITLE") "/"
 link("AUTHOR_SEARCH", $100.a)
     { "<a href=\""$link "\">"$100.a"</a>"}
 
```

The next step when talking about EL components is to deal with multiple values. Life is not so easy and, of course, a bibliographic record can have more than one author, or can have a related document that exists in more than one format and has to be linked. In other words, BibFormat supports variables and fields with multiple values (see the chapter Mapping input), so a way of applying an EL statement to all the values of a variable or a field is quite useful. FORALL is the construction for this. It allows you to specify a variable or a field followed by an EL statement (between braces) that is applied to every value of the variable or field; any reference to the iteration variable inside the FORALL statement refers to the value of the current iteration (if you refer to a variable that has multiple values outside a FORALL, the first value is used). One limitation is that you shouldn't nest FORALL statements; in other words, never put a FORALL inside another one. This construction also lets you limit the number of iterations over a variable or field by adding a literal with the number of iterations.

Some examples:

Let's continue refining our simple format; now we have to consider that there can be more than one author for one bibliographic record, so we want to show all of them with the link included, of course.
```
 
 format("TITLE") "/"
 forall($100.a)
 {
   link("AUTHOR_SEARCH", $100.a)
     { "<a href=\""$link "\">"$100.a"</a>"}
 }
 
```
Although this FORALL construction may not seem very useful, it's used a lot when defining formats or behaviors. Quite often you will want some piece of EL code to take effect only if a certain variable or field exists; FORALL can also be used in that situation, and it is the most convenient way of doing it. Imagine you want the title, then the constant string "Author: " followed by the authors of a bibliographic record, but you don't want the constant "Author: " to appear if there's no author at all. You could use something like this:
```
 
 format("TITLE") " - "
 forall($100.a)
 {
   rep_prefix("Author: ") $100.a " "
 }
 
```
As you can see, we are using a new function: rep_prefix. This is a UDF that prints the string passed as parameter only once, at the beginning, inside a FORALL statement. But the interesting thing here is the use of FORALL.
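The behavior of this last statement can be mirrored with an ordinary loop. The following Python sketch is an approximation written from the description above, with a hypothetical function name and plain lists standing in for the record; it is not how BibFormat evaluates EL.

```python
def format_title_and_authors(title, authors):
    """Mirror: format("TITLE") " - " forall($100.a) { rep_prefix("Author: ") $100.a " " }"""
    parts = [title, " - "]
    for index, author in enumerate(authors):
        if index == 0:
            # rep_prefix: printed only once, on the first iteration.
            parts.append("Author: ")
        parts.append(author + " ")
    # With no authors the loop body never runs, so "Author: " never appears.
    return "".join(parts)


print(format_title_and_authors("A sample title", ["Ellis, J", "Smith, A"]))
print(format_title_and_authors("A sample title", []))
```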

Finally, there's still one special EL function: COUNT. Because of certain special situations or strange input data in the variables, it is sometimes useful to know how many values a variable or a field contains. This function simply takes a variable or field as argument and returns a string with the number of values it contains; if the value returned is 0, the variable contains no value, which means that either the variable doesn't exist or no values were mapped to it from the input.

Examples:

As this is the last example, let's make it a bit more complicated. Continuing with our well-known simple format, we want all the authors of the record to appear if there are no more than 10; otherwise we want only the first one to appear, followed by the string "et al.". We'll also use a function called GT, which returns a non-empty string if the first parameter is greater than the second one.
```
 
 format("TITLE") "/" 
 if(gt(count($100.a), "10")!="")
 { $100.a " et al." }
 else
 {
   forall($100.a)
   {
     link("AUTHOR_SEARCH", $100.a) 
      { "<a href=\""$link "\">"$100.a"</a>"}
   }
 }
 
```
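The decision logic of this example (without the title part) can be written as a short Python sketch. It is an illustration under our own assumptions: format_author_list and the link_for callback are hypothetical names, the callback stands in for the AUTHOR_SEARCH link and returns None when the link cannot be solved, and the sketch puts a space before "et al.".

```python
def format_author_list(authors, link_for, threshold=10):
    """Show all authors as links, or only the first one when there are too many."""
    # Equivalent of if(gt(count($100.a), "10") != "").
    if len(authors) > threshold:
        return authors[0] + " et al."
    parts = []
    for author in authors:
        url = link_for(author)
        if url:
            # A link statement without an else branch prints nothing if unsolved.
            parts.append('<a href="%s">%s</a>' % (url, author))
    return "".join(parts)


def search_url(author):
    # Made-up URL pattern, used only for this example.
    return "/search?author=" + author


print(format_author_list(["Ellis", "Smith"], search_url))
print(format_author_list(["Author%d" % i for i in range(11)], search_url))
```

With exactly 10 authors the first branch is not taken, so all of them are listed.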

diff --git a/modules/bibformat/lib/bibformat.py b/modules/bibformat/lib/bibformat.py
index 6aaed7067..8da67ae70 100644
--- a/modules/bibformat/lib/bibformat.py
+++ b/modules/bibformat/lib/bibformat.py
@@ -1,278 +1,285 @@
 # -*- coding: utf-8 -*-
 ## $Id$
 ## BibFormat. Format records using specified format.

 ## This file is part of CDS Invenio.
 ## Copyright (C) 2002, 2003, 2004, 2005 CERN.
 ##
 ## The CDSware is free software; you can redistribute it and/or
 ## modify it under the terms of the GNU General Public License as
 ## published by the Free Software Foundation; either version 2 of the
 ## License, or (at your option) any later version.
 ##
 ## The CDSware is distributed in the hope that it will be useful, but
 ## WITHOUT ANY WARRANTY; without even the implied warranty of
 ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 ## General Public License for more details.
 ##
 ## You should have received a copy of the GNU General Public License
 ## along with CDSware; if not, write to the Free Software Foundation, Inc.,
 ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.

 """
 Format records using specified format.

 API functions: format_record, format_records, create_excel, get_output_format_content_type

 Used to wrap the BibFormat engine and associated functions. This is also where
 special formatting of multiple records (that the engine does not handle, as it works
 on a single record basis) should be put, with name create_*.

 SEE: bibformat_utils.py

 FIXME: currently copies record_exists() code from search engine.  Refactor later.
 """

 import zlib

 from invenio import bibformat_dblayer
 from invenio import bibformat_engine
 from invenio import bibformat_utils
 from invenio.config import cdslang, weburl, php
 from invenio.bibformat_config import use_old_bibformat

 try:
     import invenio.template
     websearch_templates = invenio.template.load('websearch')
 except:
     pass

 # Functions to format a single record
 ##

 def format_record(recID, of, ln=cdslang, verbose=0, search_pattern=[], xml_record=None, uid=None):
     """
     Formats a record given output format.

     Returns a formatted version of the record in the specified language, search pattern,
     and with the specified output format.
     The function will define which format template must be applied.

     The record to be formatted can be specified with its ID (with 'recID' parameter) or given
     as XML representation(with 'xml_record' parameter). If both are specified 'recID' is ignored.

     'uid' allows to grant access to some functionalities on a page depending
     on the user's priviledges. Typically use webuser.getUid(req). This uid has sense
     only in the case of on-the-fly formatting.

     @param recID the ID of record to format
     @param of an output format code (or short identifier for the output format)
     @param ln the language to use to format the record
     @param verbose the level of verbosity from 0 to 9 (O: silent,
                                                        5: errors,
                                                        7: errors and warnings, stop if error in format elements
                                                        9: errors and warnings, stop if error (debug mode ))
     @param search_pattern list of strings representing the user request in web interface
     @param xml_record an xml string represention of the record to format
     @param uid the user id of the person who will view the formatted page (if applicable)
     @return formatted record
     """
     ############### FIXME: REMOVE WHEN MIGRATION IS DONE ###############
     if use_old_bibformat and php:
         return bibformat_engine.call_old_bibformat(recID, format=of)
     ############################# END ##################################
+    return bibformat_engine.format_record(recID=recID,
+                                          of=of,
+                                          ln=ln,
+                                          verbose=verbose,
+                                          search_pattern=search_pattern,
+                                          xml_record=xml_record,
+                                          uid=uid)
     try:
         return bibformat_engine.format_record(recID=recID,
                                               of=of,
                                               ln=ln,
                                               verbose=verbose,
                                               search_pattern=search_pattern,
                                               xml_record=xml_record,
                                               uid=uid)
     except:
         #Failsafe execution mode
         if of == 'hd':
             return websearch_templates.tmpl_print_record_detailed(
                 ln = ln,
                 recID = recID,
                 weburl = weburl,
                 )
         return websearch_templates.tmpl_print_record_brief(ln = ln,
                                                            recID = recID,
                                                            weburl = weburl,
                                                            )

 def record_get_xml(recID, format='xm', decompress=zlib.decompress):
     """
     Returns an XML string of the record given by recID.

     The function builds the XML directly from the database,
     without using the standard formatting process.

     'format' allows to define the flavour of XML:
         - 'xm' for standard XML
         - 'marcxml' for MARC XML
         - 'oai_dc' for OAI Dublin Core
         - 'xd' for XML Dublin Core

     If record does not exist, returns empty string.

     @param recID the id of the record to retrieve
     @return the xml string of the record
     """
     return bibformat_utils.record_get_xml(recID=recID, format=format)

 # Helper functions to do complex formatting of multiple records
 #
 # You should not modify format_records when adding a complex
 # formatting of multiple records, but add a create_* method
 # that relies on format_records to do the formatting.
 ##

 def format_records(recIDs, of, ln=cdslang, verbose=0, search_pattern=None, xml_records=None, uid=None,
                    prefix=None, separator=None, suffix=None, req=None):
     """
     Returns a list of formatted records given by a list of record IDs or a list of records as xml.
     Adds a prefix before each record, a suffix after each record, plus a separator between records.

     You can either specify a list of record IDs to format, or a list of xml records,
     but not both (if both are specified recIDs is ignored).

     'separator' is a function that returns a string as separator between records.
     The function must take an integer as unique parameter, which is the index
     in recIDs (or xml_records) of the record that has just been formatted. For example
     separator(i) must return the separator between recID[i] and recID[i+1].
     Alternatively separator can be a single string, which will be used to separate
     all formatted records.

     'req' is an optional parameter on which the result of the function
     are printed lively (prints records after records) if it is given.

     This function takes the same parameters as 'format_record' except for:
     @param recIDs a list of record IDs
     @param xml_records a list of xml string representions of the records to format
     @param header a string printed before all formatted records
     @param separator either a string or a function that returns string to separate formatted records
     @param req an optional request object where to print records
     """
     formatted_records = ''

     #Fill one of the lists with Nones
     if xml_records != None:
         recIDs = map(lambda x:None, xml_records)
     else:
         xml_records = map(lambda x:None, recIDs)

     total_rec = len(recIDs)
     last_iteration = False
     for i in range(total_rec):
         if i == total_rec - 1:
             last_iteration = True

         #Print prefix
         if prefix != None:
             if isinstance(prefix, str):
                 formatted_records += prefix
                 if req != None:
                     req.write(prefix)
             else:
                 string_prefix = prefix(i)
                 formatted_records += string_prefix
                 if req != None:
                     req.write(string_prefix)

         #Print formatted record
         formatted_record = format_record(recIDs[i], of, ln, verbose, search_pattern, xml_records[i], uid)
         formatted_records += formatted_record
         if req != None:
             req.write(formatted_record)

         #Print suffix
         if suffix != None:
             if isinstance(suffix, str):
                 formatted_records += suffix
                 if req != None:
                     req.write(suffix)
             else:
                 string_suffix = suffix(i)
                 formatted_records += string_suffix
                 if req != None:
                     req.write(string_suffix)

         #Print separator if needed
         if separator != None and not last_iteration:
             if isinstance(separator, str):
                 formatted_records += separator
                 if req != None:
                     req.write(separator)
             else:
                 string_separator = separator(i)
                 formatted_records += string_separator
                 if req != None:
                     req.write(string_separator)

     return formatted_records

 def create_excel(recIDs, req=None, ln=cdslang):
     """
     Returns an Excel readable format containing the given recIDs.
     If 'req' is given, also prints the output in 'req' while individual
     records are being formatted.

     This method shows how to create a custom formatting of multiple records.
The excel format is a basic HTML table that most spreadsheets applications can parse. @param recIDs a list of record IDs @return a string in Excel format """ # Prepare the column headers to display in the Excel file column_headers_list = ['Title', 'Authors', 'Addresses', 'Affiliation', 'Date', 'Publisher', 'Place', 'Abstract', 'Keywords', 'Notes'] # Prepare Content column_headers = ''.join(column_headers_list) + '' column_headers = '\n'+ '' footer = '
' + column_headers + '
' #Apply content_type and print column headers if req != None: req.content_type = get_output_format_content_type('excel') req.headers_out["Content-Disposition"] = "inline; filename=%s" % 'results.xls' req.send_http_header() req.write(column_headers) #Format the records excel_formatted_records = format_records(recIDs, 'excel', ln=cdslang, separator='\n', req=req) if req != None: req.write(footer) return column_headers + excel_formatted_records + footer # Utility functions ## def get_output_format_content_type(of): """ Returns the content type (eg. 'text/html' or 'application/ms-excel') \ of the given output format. @param of the code of output format for which we want to get the content type """ content_type = bibformat_dblayer.get_output_format_content_type(of) if content_type == '': content_type = 'text/html' return content_type diff --git a/modules/bibformat/lib/bibformat_engine.py b/modules/bibformat/lib/bibformat_engine.py index d1a08436e..b866b1d8c 100644 --- a/modules/bibformat/lib/bibformat_engine.py +++ b/modules/bibformat/lib/bibformat_engine.py @@ -1,1603 +1,1606 @@ # -*- coding: utf-8 -*- ## $Id$ ## Bibformt engine. Format XML Marc record using specified format. ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005 CERN. ## ## The CDSware is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## The CDSware is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDSware; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
""" Formats a single XML Marc record using specified format. There is no API for the engine. Instead use bibformat.py. SEE: bibformat.py, bibformat_utils.py """ import re import sys import os import inspect import traceback import zlib from invenio.errorlib import register_errors, get_msgs_for_code_list from invenio.config import * from invenio.bibrecord import create_record, record_get_field_instances, record_get_field_value, record_get_field_values from invenio.dbquery import run_sql from invenio.messages import language_list_long, wash_language from invenio import bibformat_dblayer from invenio.bibformat_config import format_template_extension, format_output_extension, templates_path, elements_path, outputs_path, elements_import_path from bibformat_utils import record_get_xml +from xml.dom import minidom #Remove when call_old_bibformat is removed + __lastupdated__ = """$Date$""" #Cache for data we have allready read and parsed format_templates_cache = {} format_elements_cache = {} format_outputs_cache = {} kb_mappings_cache = {} cdslangs = language_list_long() #Regular expression for finding ... tag in format templates pattern_lang = re.compile(r''' #closing start tag (?P.*?) #anything but the next group (greedy) () #end tag ''', re.IGNORECASE | re.DOTALL | re.VERBOSE) #Builds regular expression for finding each known language in tags ln_pattern_text = r"<(" for lang in cdslangs: ln_pattern_text += lang[0] +r"|" ln_pattern_text = ln_pattern_text.rstrip(r"|") ln_pattern_text += r")>(.*?)" ln_pattern = re.compile(ln_pattern_text) #Regular expression for finding tag in format templates pattern_format_template_name = re.compile(r''' #closing start tag (?P.*?) #name value. any char that is not end tag ()(\n)? #end tag ''', re.IGNORECASE | re.DOTALL | re.VERBOSE) #Regular expression for finding tag in format templates pattern_format_template_desc = re.compile(r''' #closing start tag (?P.*?) #description value. any char that is not end tag (\n)? 
#end tag ''', re.IGNORECASE | re.DOTALL | re.VERBOSE) #Regular expression for finding tags in format templates pattern_tag = re.compile(r''' [^/\s]+) #any char but a space or slash \s* #any number of spaces (?P(\s* #params here (?P([^=\s])*)\s* #param name: any chars that is not a white space or equality. Followed by space(s) =\s* #equality: = followed by any number of spaces (?P[\'"]) #one of the separators (?P.*?) #param value: any chars that is not a separator like previous one (?P=sep) #same separator as starting one )*) #many params \s* #any number of spaces (/)?> #end of the tag ''', re.IGNORECASE | re.DOTALL | re.VERBOSE) #Regular expression for finding params inside tags in format templates pattern_function_params = re.compile(''' (?P([^=\s])*)\s* # Param name: any chars that is not a white space or equality. Followed by space(s) =\s* # Equality: = followed by any number of spaces (?P[\'"]) # One of the separators (?P.*?) # Param value: any chars that is not a separator like previous one (?P=sep) # Same separator as starting one ''', re.VERBOSE | re.DOTALL ) #Regular expression for finding format elements "params" attributes (defined by @param) pattern_format_element_params = re.compile(''' @param\s* # Begins with @param keyword followed by space(s) (?P[^\s=]*)\s* # A single keyword, and then space(s) #(=\s*(?P[\'"]) # Equality, space(s) and then one of the separators #(?P.*?) # Default value: any chars that is not a separator like previous one #(?P=sep) # Same separator as starting one #)?\s* # Default value for param is optional. 
Followed by space(s) (?P<desc>.*) # Any text that is not end of line (thanks to MULTILINE parameter) ''', re.VERBOSE | re.MULTILINE) #Regular expression for finding format elements "see also" attribute (defined by @see) pattern_format_element_seealso = re.compile('''@see\s*(?P<see>.*)''', re.VERBOSE | re.MULTILINE) #Regular expression for finding 2 expressions in quotes, separated by comma (as in template("1st","2nd") ) #Used when parsing output formats ## pattern_parse_tuple_in_quotes = re.compile(''' ## (?P[\'"]) ## (?P.*) ## (?P=sep1) ## \s*,\s* ## (?P[\'"]) ## (?P.*) ## (?P=sep2) ## ''', re.VERBOSE | re.MULTILINE) def call_old_bibformat(recID, format="HD"): """ FIXME: REMOVE FUNCTION WHEN MIGRATION IS DONE Calls BibFormat for the record RECID in the desired output format FORMAT. Note: this function always tries to return HTML, so when bibformat returns XML with embedded HTML format inside the tag FMT $g, as is suitable for prestoring output formats, we perform un-XML-izing here in order to return HTML body only. 
""" # look for formatted notice existence: - query = "SELECT value FROM bibfmt WHERE id_bibrec='%s' AND format='%s'" % (recID, of) + query = "SELECT value FROM bibfmt WHERE id_bibrec='%s' AND format='%s'" % (recID, format) res = run_sql(query, None, 1) if res: # record 'recID' is formatted in 'format', so print it decompress = zlib.decompress return "%s" % decompress(res[0][0]) else: # record 'recID' is not formatted in 'format', so try to call BibFormat on the fly or use default format: out = "" pipe_input, pipe_output, pipe_error = os.popen3(["%s/bibformat" % bindir, "otype=%s" % format], 'rw') #pipe_input.write(print_record(recID, "xm")) pipe_input.write(record_get_xml(recID, "xm")) pipe_input.close() bibformat_output = pipe_output.read() pipe_output.close() pipe_error.close() if bibformat_output.startswith(""): dom = minidom.parseString(bibformat_output) for e in dom.getElementsByTagName('subfield'): if e.getAttribute('code') == 'g': for t in e.childNodes: out += t.data.encode('utf-8') else: out = bibformat_output return out def format_record(recID, of, ln=cdslang, verbose=0, search_pattern=[], xml_record=None, uid=None): """ Formats a record given output format. Main entry function of bibformat engine. Returns a formatted version of the record in the specified language, search pattern, and with the specified output format. The function will define which format template must be applied. You can either specify an record ID to format, or give its xml representation. if 'xml_record' != None, then use it instead of recID. 'uid' allows to grant access to some functionalities on a page depending on the user's priviledges. 
@param recID the ID of record to format @param of an output format code (or short identifier for the output format) @param ln the language to use to format the record @param verbose the level of verbosity from 0 to 9 (O: silent, 5: errors, 7: errors and warnings, stop if error in format elements 9: errors and warnings, stop if error (debug mode )) @param search_pattern list of strings representing the user request in web interface @param xml_record an xml string representing the record to format @param uid the user id of the person who will view the formatted page @return formatted record """ errors_ = [] # Temporary workflow (during migration of formats): # Call new BibFormat # But if format not found for new BibFormat, then call old BibFormat #Create a BibFormat Object to pass that contain record and context bfo = BibFormatObject(recID, ln, search_pattern, xml_record, uid) #Find out which format template to use based on record and output format. template = decide_format_template(bfo, of) - if template == None: - ############### FIXME: REMOVE WHEN MIGRATION IS DONE ############### + ############### FIXME: REMOVE WHEN MIGRATION IS DONE ############### + path = "%s%s%s" % (templates_path, os.sep, template) + if template == None or not os.access(path, os.R_OK): # template not found in new BibFormat. 
Call old one if php: return call_old_bibformat(recID, format=of) - ############################# END ################################## + ############################# END ################################## error = get_msgs_for_code_list([("ERR_BIBFORMAT_NO_TEMPLATE_FOUND", of)], file='error', ln=cdslang) errors_.append(error) if verbose == 0: register_errors(error, 'error') elif verbose > 5: return error[0][1] return "" #Format with template (out, errors) = format_with_format_template(template, bfo, verbose) errors_.extend(errors) return out def decide_format_template(bfo, of): """ Returns the format template name that should be used for formatting given output format and BibFormatObject. Looks at the rules of output format 'of' and takes the first matching one. If no rule matches, returns the default template of the output format, or None if no default is defined. Matching ignores letter case and leading and trailing spaces in both the rule value and the record value. @param bfo a BibFormatObject @param of the code of the output format to use """ output_format = get_output_format(of) for rule in output_format['rules']: value = bfo.field(rule['field']).strip()#Remove spaces pattern = rule['value'].strip() #Remove spaces if re.match(pattern, value, re.IGNORECASE) != None: return rule['template'] template = output_format['default'] if template != '': return template else: return None def format_with_format_template(format_template_filename, bfo, verbose=0, format_template_code=None): """ Format a record given a format template. Also returns errors. Returns a formatted version of the record represented by bfo, in the language specified in bfo, and with the specified format template. Parameter format_template_filename will be ignored if format_template_code is provided. 
This allows previewing format code without having to save the file on disk. @param format_template_filename the filename of a format template @param bfo the object containing parameters for the current formatting @param format_template_code if not empty, use code as template instead of reading format_template_filename (used for previews) @param verbose the level of verbosity from 0 to 9 (0: silent, 5: errors, 7: errors and warnings, 9: errors and warnings, stop if error (debug mode )) @return tuple (formatted text, errors) """ errors_ = [] if format_template_code != None: format_content = str(format_template_code) else: format_content = get_format_template(format_template_filename)['code'] localized_format = filter_languages(format_content, bfo.lang) (evaluated_format, errors) = eval_format_template_elements(localized_format, bfo, verbose) errors_ = errors return (evaluated_format, errors) def eval_format_template_elements(format_template, bfo, verbose=0): """ Evaluates the format elements of the given template and replaces each element with its value. Also returns errors. Prepare the format template content so that we can directly replace the marc code by their value. This implies: 1) Look for special tags 2) replace special tags by their evaluation @param format_template the format template code @param bfo the object containing parameters for the current formatting @param verbose the level of verbosity from 0 to 9 (0: silent, 5: errors, 7: errors and warnings, 9: errors and warnings, stop if error (debug mode )) @return tuple (result, errors) """ errors_ = [] #First define insert_element_code(match), used in re.sub() function def insert_element_code(match): """ Analyses 'match', interprets the corresponding code, and returns the result of the evaluation. 
Called by substitution in 'eval_format_template_elements(...)' @param match a match object corresponding to the special tag that must be interpreted """ function_name = match.group("function_name") format_element = get_format_element(function_name, verbose) params = {} #look for function parameters given in format template code all_params = match.group('params') if all_params != None: function_params_iterator = pattern_function_params.finditer(all_params) for param_match in function_params_iterator: name = param_match.group('param') value = param_match.group('value') params[name] = value #Evaluate element with params and return (Do not return errors) (result, errors) = eval_format_element(format_element, bfo, params, verbose) #Collect errors in the list of the enclosing function (a plain assignment would only create a local variable) errors_.extend(errors) return result #Substitute special tags in the format by our own text. #Special tags have the form format = pattern_tag.sub(insert_element_code, format_template) return (format, errors_) def eval_format_element(format_element, bfo, parameters={}, verbose=0): """ Returns the result of the evaluation of the given format element name, with given BibFormatObject and parameters. Also returns the errors of the evaluation. @param format_element a format element structure as returned by get_format_element @param bfo a BibFormatObject used for formatting @param parameters a dict of parameters to be used for formatting. Key is parameter and value is value of parameter @param verbose the level of verbosity from 0 to 9 (0: silent, 5: errors, 7: errors and warnings, 9: errors and warnings, stop if error (debug mode )) @return tuple (result, errors) """ errors = [] #Load special values given as parameters prefix = parameters.get('prefix', "") suffix = parameters.get('suffix', "") default_value = parameters.get('default', "") #3 possible cases: #a) format element file is found: we execute it #b) format element file is not found, but exists in tag table (e.g. bfe_isbn) #c) format element is totally unknown. 
Do nothing or report error if format_element != None and format_element['type'] == "python": #a) #We found an element with the tag name, of type "python" #Prepare a dict 'params' to pass as parameter to 'format' function of element params = {} #look for parameters defined in format element #fill them with specified default values and values #given as parameters for param in format_element['attrs']['params']: name = param['name'] default = param['default'] params[name] = parameters.get(name, default) #Add BibFormatObject params['bfo'] = bfo #execute function with given parameters and return result. output_text = "" function = format_element['code'] try: output_text = apply(function, (), params) except Exception, e: output_text = "" name = format_element['attrs']['name'] error = ("ERR_BIBFORMAT_EVALUATING_ELEMENT", name, str(params)) errors.append(error) if verbose == 0: register_errors(errors, 'error') elif verbose >=5: tb = sys.exc_info()[2] error_string = get_msgs_for_code_list(error, file='error', ln=cdslang) stack = traceback.format_exception(Exception, e, tb, limit=None) output_text = ''+error_string[0][1] + "".join(stack) +' ' if output_text == None: output_text = "" else: output_text = str(output_text) #Add prefix and suffix if they have been given as parameters and if #the evaluation of element is not empty if output_text.strip() != "": output_text = prefix + output_text + suffix #Add the default value if output_text is empty if output_text == "": output_text = default_value return (output_text, errors) elif format_element != None and format_element['type'] =="field": #b) #We have not found an element in files that has the tag name. 
Then look for it #in the table "tag" # # # #Load special values given as parameters separator = parameters.get('separator', "") nbMax = parameters.get('nbMax', "") #Get the fields tags that have to be printed tags = format_element['attrs']['tags'] output_text = [] #Get values corresponding to tags for tag in tags: values = bfo.fields(tag)#Retrieve record values for tag if len(values)>0 and isinstance(values[0], dict):#flatten dict to its values only values_list = map(lambda x: x.values(), values) #output_text.extend(values) for values in values_list: output_text.extend(values) else: output_text.extend(values) if nbMax != "": try: nbMax = int(nbMax) output_text = output_text[:nbMax] except: name = format_element['attrs']['name'] error = ("ERR_BIBFORMAT_NBMAX_NOT_INT", name) errors.append(error) if verbose < 5: register_errors(error, 'error') elif verbose >=5: error_string = get_msgs_for_code_list(error, file='error', ln=cdslang) output_text.append(error_string[0][1]) #Add prefix and suffix if they have been given as parameters and if #the evaluation of element is not empty. #If evaluation is empty string, return default value if it exists. Else return empty string if ("".join(output_text)).strip() != "": return (prefix + separator.join(output_text) + suffix, errors) else: #Return default value return (default_value, errors) else: #c) Element is unknown error = get_msgs_for_code_list([("ERR_BIBFORMAT_CANNOT_RESOLVE_ELEMENT_NAME", format_element)], file='error', ln=cdslang) errors.append(error) if verbose < 5: register_errors(error, 'error') return ("", errors) elif verbose >=5: if verbose >= 9: sys.exit(error[0][1]) return (''+error[0][1]+'', errors) def filter_languages(format_template, ln='en'): """ Filters the language tags that do not correspond to the specified language. 
@param format_template the format template code @param ln the language that is NOT filtered out from the template @return the format template with unnecessary languages filtered out """ #First define search_lang_tag(match) and clean_language_tag(match), used #in re.sub() function def search_lang_tag(match): """ Searches for the <lang>...</lang> tag and removes the inner localized tags that are not current_lang. If current_lang cannot be found inside <lang>...</lang>, tries to use 'cdslang' @param match a match object corresponding to the special tag that must be interpreted """ current_lang = ln def clean_language_tag(match): """ Return tag text content if tag language of match is output language. Called by substitution in 'filter_languages(...)' @param match a match object corresponding to the special tag that must be interpreted """ if match.group(1) == current_lang: return match.group(2) else: return "" #End of clean_language_tag lang_tag_content = match.group("langs") #Try to find tag with current lang. If it does not exist, then current_lang #becomes cdslang until the end of this replace pattern_current_lang = re.compile(r"<"+current_lang+"\s*>(.*?)") if re.search(pattern_current_lang, lang_tag_content) == None: current_lang = cdslang cleaned_lang_tag = ln_pattern.sub(clean_language_tag, lang_tag_content) return cleaned_lang_tag #End of search_lang_tag filtered_format_template = pattern_lang.sub(search_lang_tag, format_template) return filtered_format_template def parse_tag(tag): """ Parses a MARC code and decomposes it into a list with: 0-tag 1-indicator1 2-indicator2 3-subfield The first 3 chars always correspond to tag. The indicators are optional. However they must both be indicated, or both omitted. If indicators are omitted or indicated with underscore '_', they mean "No indicator". The subfield is optional. It can optionally be preceded by a dot '.' 
or '$$' or '$'. Any of the chars can be replaced by wildcard %. THE FUNCTION DOES NOT CHECK THE WELL-FORMEDNESS OF 'tag'. Space characters are ignored. For example: >> parse_tag('245COc') = ['245', 'C', 'O', 'c'] >> parse_tag('245C_c') = ['245', 'C', '', 'c'] >> parse_tag('245__c') = ['245', '', '', 'c'] >> parse_tag('245__$$c') = ['245', '', '', 'c'] >> parse_tag('245__$c') = ['245', '', '', 'c'] >> parse_tag('245 $c') = ['245', '', '', 'c'] >> parse_tag('245 $$c') = ['245', '', '', 'c'] >> parse_tag('245__.c') = ['245', '', '', 'c'] >> parse_tag('245 .c') = ['245', '', '', 'c'] >> parse_tag('245C_$c') = ['245', 'C', '', 'c'] >> parse_tag('245CO$$c') = ['245', 'C', 'O', 'c'] >> parse_tag('245C_.c') = ['245', 'C', '', 'c'] >> parse_tag('245$c') = ['245', '', '', 'c'] >> parse_tag('245.c') = ['245', '', '', 'c'] >> parse_tag('245$$c') = ['245', '', '', 'c'] >> parse_tag('245__%') = ['245', '', '', ''] >> parse_tag('245__$$%') = ['245', '', '', ''] >> parse_tag('245__$%') = ['245', '', '', ''] >> parse_tag('245 $%') = ['245', '', '', ''] >> parse_tag('245 $$%') = ['245', '', '', ''] >> parse_tag('245$%') = ['245', '', '', ''] >> parse_tag('245.%') = ['245', '', '', ''] >> parse_tag('245$$%') = ['245', '', '', ''] >> parse_tag('2%5$$a') = ['2%5', '', '', 'a'] """ p_tag = ['', '', '', ''] tag = tag.replace(" ", "") #Remove empty characters tag = tag.replace("$", "") #Remove $ characters tag = tag.replace(".", "") #Remove . 
characters #tag = tag.replace("_", "") #Remove _ characters p_tag[0] = tag[0:3] #tag if len(tag) == 4: p_tag[3] = tag[3] #subfield elif len(tag) == 5: ind1 = tag[3]#indicator 1 if ind1 != "_": p_tag[1] = ind1 ind2 = tag[4]#indicator 2 if ind2 != "_": p_tag[2] = ind2 elif len(tag) == 6: p_tag[3] = tag[5]#subfield ind1 = tag[3]#indicator 1 if ind1 != "_": p_tag[1] = ind1 ind2 = tag[4]#indicator 2 if ind2 != "_": p_tag[2] = ind2 return p_tag def get_format_template(filename, with_attributes=False): """ Returns the structured content of the given format template. If 'with_attributes' is True, returns the name and description. Else 'attrs' is not returned as key in dictionary (it might, if it has already been loaded previously) {'code':"Some template code" 'attrs': {'name': "a name", 'description': "a description"} } @param filename the filename of a format template @param with_attributes if True, fetch the attributes (names and description) for format @return structured content of format template """ #Get from cache whenever possible global format_templates_cache if not filename.endswith("."+format_template_extension): return None if format_templates_cache.has_key(filename): #If we must return with attributes and template exists in cache with attributes #then return cache. 
Else reload with attributes if with_attributes == True and format_templates_cache[filename].has_key('attrs'): return format_templates_cache[filename] format_template = {'code':""} try: path = "%s%s%s" % (templates_path, os.sep, filename) format_file = open(path) format_content = format_file.read() format_file.close() #Load format template code #Remove name and description code_and_description = pattern_format_template_name.sub("", format_content) code = pattern_format_template_desc.sub("", code_and_description) # Escape % chars in code (because we will use python formatting capabilities) format_template['code'] = code except Exception, e: errors = get_msgs_for_code_list([("ERR_BIBFORMAT_CANNOT_READ_TEMPLATE_FILE", filename, str(e))], file='error', ln=cdslang) register_errors(errors, 'error') #Save attributes if necessary if with_attributes: format_template['attrs'] = get_format_template_attrs(filename) #cache and return format_templates_cache[filename] = format_template return format_template def get_format_templates(with_attributes=False): """ Returns the list of all format templates if 'with_attributes' is True, returns the name and description. Else 'attrs' is not returned as key in each dictionary (it might, if it has already been loaded previously) [{'code':"Some template code" 'attrs': {'name': "a name", 'description': "a description"} }, ... } @param with_attributes if True, fetch the attributes (names and description) for formats """ format_templates = {} files = os.listdir(templates_path) for filename in files: if filename.endswith("."+format_template_extension): format_templates[filename] = get_format_template(filename, with_attributes) return format_templates def get_format_template_attrs(filename): """ Returns the attributes of the format template with given filename The attributes are {'name', 'description'} Caution: the function does not check that path exists or that the format element is valid. 
@param the path to a format element """ attrs = {} attrs['name'] = "" attrs['description'] = "" try: template_file = open("%s%s%s"%(templates_path, os.sep, filename)) code = template_file.read() template_file.close() match = pattern_format_template_name.search(code) if match != None: attrs['name'] = match.group('name') else: attrs['name'] = filename match = pattern_format_template_desc.search(code) if match != None: attrs['description'] = match.group('desc').rstrip('.') except Exception, e: errors = get_msgs_for_code_list([("ERR_BIBFORMAT_CANNOT_READ_TEMPLATE_FILE", filename, str(e))], file='error', ln=cdslang) register_errors(errors, 'error') attrs['name'] = filename return attrs def get_format_element(element_name, verbose=0, with_built_in_params=False): """ Returns the format element structured content. Return None if element cannot be loaded (file not found, not readable or invalid) The returned structure is {'attrs': {some attributes in dict. See get_format_element_attrs_from_*} 'code': the_function_code, 'type':"field" or "python" depending if element is defined in file or table} @param element_name the name of the format element to load @param verbose the level of verbosity from 0 to 9 (O: silent, 5: errors, 7: errors and warnings, 9: errors and warnings, stop if error (debug mode )) @param with_built_in_params if True, load the parameters built in all elements @return a dictionary with format element attributes """ #Get from cache whenever possible global format_elements_cache #Resolve filename and prepare 'name' as key for the cache filename = resolve_format_element_filename(element_name) if filename != None: name = filename.upper() else: name = element_name.upper() if format_elements_cache.has_key(name): element = format_elements_cache[name] if with_built_in_params == False or (with_built_in_params == True and element['attrs'].has_key('builtin_params') ): return element if filename == None: #element is maybe in tag table if 
bibformat_dblayer.tag_exists_for_name(element_name): format_element = {'attrs': get_format_element_attrs_from_table(element_name, with_built_in_params), 'code':None, 'type':"field"} #Cache and returns format_elements_cache[name] = format_element return format_element else: errors = get_msgs_for_code_list([("ERR_BIBFORMAT_FORMAT_ELEMENT_NOT_FOUND", element_name)], file='error', ln=cdslang) if verbose == 0: register_errors(errors, 'error') elif verbose >=5: sys.stderr.write(errors[0][1]) return None else: format_element = {} module_name = filename if module_name.endswith(".py"): module_name = module_name[:-3] try: module = __import__(elements_import_path+"."+module_name) #Load last module in import path #For eg. load bibformat_elements in invenio.elements.bibformat_element #Used to keep flexibility regarding where elements directory is (for eg. test cases) components = elements_import_path.split(".") for comp in components[1:]: module = getattr(module, comp) function_format = module.__dict__[module_name].format format_element['code'] = function_format format_element['attrs'] = get_format_element_attrs_from_function(function_format, element_name, with_built_in_params) format_element['type'] = "python" #cache and return format_elements_cache[name] = format_element return format_element except Exception, e: errors = get_msgs_for_code_list([("ERR_BIBFORMAT_FORMAT_ELEMENT_NOT_FOUND", element_name)], file='error', ln=cdslang) if verbose == 0: register_errors(errors, 'error') elif verbose >= 5: sys.stderr.write(str(e)) sys.stderr.write(errors[0][1]) if verbose >= 7: raise e return None def get_format_elements(with_built_in_params=False): """ Returns the list of format elements attributes as dictionary structure Elements declared in files have priority over element declared in 'tag' table The returned object has this format: {element_name1: {'attrs': {'description':..., 'seealso':... 'params':[{'name':..., 'default':..., 'description':...}, ...] 
'builtin_params':[{'name':..., 'default':..., 'description':...}, ...] }, 'code': code_of_the_element }, element_name2: {...}, ...} Returns only elements that could be loaded (not error in code) @return a dict of format elements with name as key, and a dict as attributes @param with_built_in_params if True, load the parameters built in all elements """ format_elements = {} mappings = bibformat_dblayer.get_all_name_tag_mappings() for name in mappings: format_elements[name.upper().replace(" ", "_").strip()] = get_format_element(name, with_built_in_params=with_built_in_params) files = os.listdir(elements_path) for filename in files: filename_test = filename.upper().replace(" ", "_") if filename_test.endswith(".PY") and filename != "__INIT__.PY": if filename_test.startswith("BFE_"): filename_test = filename_test[4:] element_name = filename_test[:-3] element = get_format_element(element_name, with_built_in_params=with_built_in_params) if element != None: format_elements[element_name] = element return format_elements def get_format_element_attrs_from_function(function, element_name, with_built_in_params=False): """ Returns the attributes of the function given as parameter. It looks for standard parameters of the function, default values and comments in the docstring. The attributes are {'description', 'seealso':['element.py', ...], 'params':{name:{'name', 'default', 'description'}, ...], name2:{}} The attributes are {'name' : "name of element" #basically the name of 'name' parameter 'description': "a string description of the element", 'seealso' : ["element_1.py", "element_2.py", ...] 
#a list of related elements 'params': [{'name':"param_name", #a list of parameters for this element (except 'bfo') 'default':"default value", 'description': "a description"}, ...], 'builtin_params': {name: {'name':"param_name",#the parameters builtin for all elem of this kind 'default':"default value", 'description': "a description"}, ...}, } @param function the formatting function of a format element @param element_name the name of the element @param with_built_in_params if True, load the parameters built in all elements """ attrs = {} attrs['description'] = "" attrs['name'] = element_name.replace(" ", "_").upper() attrs['seealso'] = [] docstring = function.__doc__ if isinstance(docstring, str): #Look for function description in docstring #match = pattern_format_element_desc.search(docstring) description = docstring.split("@param")[0] description = description.split("@see")[0] attrs['description'] = description.strip().rstrip('.') #Look for @see in docstring match = pattern_format_element_seealso.search(docstring) if match != None: elements = match.group('see').rstrip('.').split(",") for element in elements: attrs['seealso'].append(element.strip()) params = {} #Look for parameters in function definition (args, varargs, varkw, defaults) = inspect.getargspec(function) #Prepare args and defaults_list such that we can have a mapping from args to defaults args.reverse() if defaults != None: defaults_list = list(defaults) defaults_list.reverse() else: defaults_list = [] for arg, default in map(None, args, defaults_list): if arg == "bfo": continue #Don't keep this as parameter. 
It is hidden to users, and exists in all elements of this kind param = {} param['name'] = arg if default == None: param['default'] = "" #In case no check is made inside element, we prefer to print "" (nothing) than None in output else: param['default'] = default param['description'] = "(no description provided)" params[arg] = param if isinstance(docstring, str): #Look for @param descriptions in docstring. #Add description to existing parameters in params dict params_iterator = pattern_format_element_params.finditer(docstring) for match in params_iterator: name = match.group('name') if params.has_key(name): params[name]['description'] = match.group('desc').rstrip('.') attrs['params'] = params.values() #Load built-in parameters if necessary if with_built_in_params == True: builtin_params = [] #Add 'prefix' parameter param_prefix = {} param_prefix['name'] = "prefix" param_prefix['default'] = "" param_prefix['description'] = "A prefix printed only if the record has a value for this element" builtin_params.append(param_prefix) #Add 'suffix' parameter param_suffix = {} param_suffix['name'] = "suffix" param_suffix['default'] = "" param_suffix['description'] = "A suffix printed only if the record has a value for this element" builtin_params.append(param_suffix) #Add 'default' parameter param_default = {} param_default['name'] = "default" param_default['default'] = "" param_default['description'] = "A default value printed if the record has no value for this element" builtin_params.append(param_default) attrs['builtin_params'] = builtin_params return attrs def get_format_element_attrs_from_table(element_name, with_built_in_params=False): """ Returns the attributes of the format element with given name in 'tag' table. Returns None if element_name does not exist in tag table. The attributes are {'name' : "name of element" #basically the name of 'element_name' parameter 'description': "a string description of the element", 'seealso' : [] #a list of related elements. 
Always empty in this case 'params': [], #a list of parameters for this element. Always empty in this case 'builtin_params': [{'name':"param_name", #the parameters builtin for all elem of this kind 'default':"default value", 'description': "a description"}, ...], 'tags':["950.1", 203.a] #the list of tags printed by this element } @param element_name an element name in database @param element_name the name of the element @param with_built_in_params if True, load the parameters built in all elements """ attrs = {} tags = bibformat_dblayer.get_tags_from_name(element_name) field_label = "field" if len(tags)>1: field_label = "fields" attrs['description'] = "Prints %s %s of the record" % (field_label, ", ".join(tags)) attrs['name'] = element_name.replace(" ", "_").upper() attrs['seealso'] = [] attrs['params'] = [] attrs['tags'] = tags #Load built-in parameters if necessary if with_built_in_params == True: builtin_params = [] #Add 'prefix' parameter param_prefix = {} param_prefix['name'] = "prefix" param_prefix['default'] = "" param_prefix['description'] = "A prefix printed only if the record has a value for this element" builtin_params.append(param_prefix) #Add 'suffix' parameter param_suffix = {} param_suffix['name'] = "suffix" param_suffix['default'] = "" param_suffix['description'] = "A suffix printed only if the record has a value for this element" builtin_params.append(param_suffix) #Add 'separator' parameter param_separator = {} param_separator['name'] = "separator" param_separator['default'] = " " param_separator['description'] = "A separator between elements of the field" builtin_params.append(param_separator) #Add 'nbMax' parameter param_nbMax = {} param_nbMax['name'] = "nbMax" param_nbMax['default'] = "" param_nbMax['description'] = "The maximum number of values to print for this element. 
No limit if not specified" builtin_params.append(param_nbMax) #Add 'default' parameter param_default = {} param_default['name'] = "default" param_default['default'] = "" param_default['description'] = "A default value printed if the record has no value for this element" builtin_params.append(param_default) attrs['builtin_params'] = builtin_params return attrs def get_output_format(code, with_attributes=False, verbose=0): """ Returns the structured content of the given output format If 'with_attributes' is True, also returns the names and description of the output formats, else 'attrs' is not returned in dict (it might, if it has already been loaded previously). if output format corresponding to 'code' is not found return an empty structure. See get_output_format_attrs() to learn more on the attributes {'rules': [ {'field': "980__a", 'value': "PREPRINT", 'template': "filename_a.bft", }, {...} ], 'attrs': {'names': {'generic':"a name", 'sn':{'en': "a name", 'fr':"un nom"}, 'ln':{'en':"a long name"}} 'description': "a description" 'code': "fnm1", 'content_type': "application/ms-excel" } 'default':"filename_b.bft" } @param code the code of an output_format @param with_attributes if True, fetch the attributes (names and description) for format @param verbose the level of verbosity from 0 to 9 (O: silent, 5: errors, 7: errors and warnings, 9: errors and warnings, stop if error (debug mode )) @return strucured content of output format """ output_format = {'rules':[], 'default':""} filename = resolve_output_format_filename(code, verbose) if filename == None: errors = get_msgs_for_code_list([("ERR_BIBFORMAT_OUTPUT_FORMAT_CODE_UNKNOWN", code)], file='error', ln=cdslang) register_errors(errors, 'error') if with_attributes == True: #Create empty attrs if asked for attributes output_format['attrs'] = get_output_format_attrs(code, verbose) return output_format #Get from cache whenever possible global format_outputs_cache if format_outputs_cache.has_key(filename): #If was must 
return with attributes but cache has not attributes, then load attributes if with_attributes == True and not format_outputs_cache[filename].has_key('attrs'): format_outputs_cache[filename]['attrs'] = get_output_format_attrs(code, verbose) return format_outputs_cache[filename] try: if with_attributes == True: output_format['attrs'] = get_output_format_attrs(code, verbose) path = "%s%s%s" % (outputs_path, os.sep, filename ) format_file = open(path) current_tag = '' for line in format_file: line = line.strip() if line == "": #ignore blank lines continue if line.endswith(":"): #retrieve tag clean_line = line.rstrip(": \n\r") #remove : spaces and eol at the end of line current_tag = "".join(clean_line.split()[1:]).strip() #the tag starts at second position elif line.find('---') != -1: words = line.split('---') template = words[-1].strip() condition = ''.join(words[:-1]) value = "" output_format['rules'].append({'field': current_tag, 'value': condition, 'template': template, }) elif line.find(':') != -1: #Default case default = line.split(':')[1].strip() output_format['default'] = default except Exception, e: errors = get_msgs_for_code_list([("ERR_BIBFORMAT_CANNOT_READ_OUTPUT_FILE", filename, str(e))], file='error', ln=cdslang) register_errors(errors, 'error') #cache and return format_outputs_cache[filename] = output_format return output_format def get_output_format_attrs(code, verbose=0): """ Returns the attributes of an output format. The attributes contain 'code', which is the short identifier of the output format (to be given as parameter in format_record function to specify the output format), 'description', a description of the output format, and 'names', the localized names of the output format. If 'content_type' is specified then the search_engine will send a file with this content type and with result of formatting as content to the user. The 'names' dict always contais 'generic', 'ln' (for long name) and 'sn' (for short names) keys. 
'generic' is the default name for output format. 'ln' and 'sn' contain long and short localized names of the output format. Only the languages for which a localization exist are used. {'names': {'generic':"a name", 'sn':{'en': "a name", 'fr':"un nom"}, 'ln':{'en':"a long name"}} 'description': "a description" 'code': "fnm1", 'content_type': "application/ms-excel" } @param code the short identifier of the format @param verbose the level of verbosity from 0 to 9 (O: silent, 5: errors, 7: errors and warnings, 9: errors and warnings, stop if error (debug mode )) @return strucured content of output format attributes """ if code.endswith("."+format_output_extension): code = code[:-(len(format_output_extension) + 1)] attrs = {'names':{'generic':"", 'ln':{}, 'sn':{}}, 'description':'', 'code':code.upper(), 'content_type':""} filename = resolve_output_format_filename(code, verbose) if filename == None: return attrs attrs['names'] = bibformat_dblayer.get_output_format_names(code) attrs['description'] = bibformat_dblayer.get_output_format_description(code) attrs['content_type'] = bibformat_dblayer.get_output_format_content_type(code) return attrs def get_output_formats(with_attributes=False): """ Returns the list of all output format, as a dictionary with their filename as key If 'with_attributes' is True, also returns the names and description of the output formats, else 'attrs' is not returned in dicts (it might, if it has already been loaded previously). See get_output_format_attrs() to learn more on the attributes {'filename_1.bfo': {'rules': [ {'field': "980__a", 'value': "PREPRINT", 'template': "filename_a.bft", }, {...} ], 'attrs': {'names': {'generic':"a name", 'sn':{'en': "a name", 'fr':"un nom"}, 'ln':{'en':"a long name"}} 'description': "a description" 'code': "fnm1" } 'default':"filename_b.bft" }, 'filename_2.bfo': {...}, ... 
} @return the list of output formats """ output_formats = {} files = os.listdir(outputs_path) for filename in files: if filename.endswith("."+format_output_extension): code = "".join(filename.split(".")[:-1]) output_formats[filename] = get_output_format(code, with_attributes) return output_formats def get_kb_mapping(kb, string, default=""): """ Returns the value of the string' in the knowledge base 'kb'. If kb does not exist or string does not exist in kb, returns 'default' string value. @param kb a knowledge base name @param string a key in a knowledge base @param default a default value if 'string' is not in 'kb' @return the value corresponding to the given string in given kb """ global kb_mappings_cache if kb_mappings_cache.has_key(kb): kb_cache = kb_mappings_cache[kb] if kb_cache.has_key(string): value = kb_mappings_cache[kb][string] if value == None: return default else: return value else: #Precreate for caching this kb kb_mappings_cache[kb] = {} value = bibformat_dblayer.get_kb_mapping_value(kb, string) kb_mappings_cache[kb][str(string)] = value if value == None: return default else: return value def resolve_format_element_filename(string): """ Returns the filename of element corresponding to string This is necessary since format templates code call elements by ignoring case, for eg. is the same as . It is also recommended that format elements filenames are prefixed with bfe_ . We need to look for these too. The name of the element has to start with "BFE_". 
@param name a name for a format element @return the corresponding filename, with right case """ if not string.endswith(".py"): name = string.replace(" ", "_").upper() +".PY" else: name = string.replace(" ", "_").upper() files = os.listdir(elements_path) for filename in files: test_filename = filename.replace(" ", "_").upper() if test_filename == name or \ test_filename == "BFE_" + name or \ "BFE_" + test_filename == name: return filename #No element with that name found #Do not log error, as it might be a normal execution case: #element can be in database return None def resolve_output_format_filename(code, verbose=0): """ Returns the filename of output corresponding to code This is necessary since output formats names are not case sensitive but most file systems are. @param code the code for an output format @param verbose the level of verbosity from 0 to 9 (O: silent, 5: errors, 7: errors and warnings, 9: errors and warnings, stop if error (debug mode )) @return the corresponding filename, with right case, or None if not found """ code = re.sub(r"[^.0-9a-zA-Z]", "", code) #Remove non alphanumeric chars (except .) if not code.endswith("."+format_output_extension): code = re.sub(r"\W", "", code) code += "."+format_output_extension files = os.listdir(outputs_path) for filename in files: if filename.upper() == code.upper(): return filename #No output format with that name found errors = get_msgs_for_code_list([("ERR_BIBFORMAT_CANNOT_RESOLVE_OUTPUT_NAME", code)], file='error', ln=cdslang) if verbose == 0: register_errors(errors, 'error') elif verbose >= 5: sys.stderr.write(errors[0][1]) if verbose >= 9: sys.exit(errors[0][1]) return None def get_fresh_format_template_filename(name): """ Returns a new filename and name for template with given name. 
Used when writing a new template to a file, so that the name has no space, is unique in template directory Returns (unique_filename, modified_name) @param a name for a format template @return the corresponding filename, and modified name if necessary """ #name = re.sub(r"\W", "", name) #Remove non alphanumeric chars name = name.replace(" ", "_") filename = name filename = re.sub(r"[^.0-9a-zA-Z]", "", filename) #Remove non alphanumeric chars (except .) path = templates_path + os.sep + filename + "." + format_template_extension index = 1 while os.path.exists(path): index += 1 filename = name + str(index) path = templates_path + os.sep + filename + "." + format_template_extension if index > 1: returned_name = (name + str(index)).replace("_", " ") else: returned_name = name.replace("_", " ") return (filename + "." + format_template_extension, returned_name) #filename.replace("_", " ")) def get_fresh_output_format_filename(code): """ Returns a new filename for output format with given code. Used when writing a new output format to a file, so that the code has no space, is unique in output format directory. The filename also need to be at most 6 chars long, as the convention is that filename == output format code (+ .extension) We return an uppercase code Returns (unique_filename, modified_code) @param code the code of an output format @return the corresponding filename, and modified code if necessary """ #code = re.sub(r"\W", "", code) #Remove non alphanumeric chars code = code.upper().replace(" ", "_") code = re.sub(r"[^.0-9a-zA-Z]", "", code) #Remove non alphanumeric chars (except .) if len(code) > 6: code = code[:6] filename = code path = outputs_path + os.sep + filename + "." + format_output_extension index = 2 while os.path.exists(path): filename = code + str(index) if len(filename) > 6: filename = code[:-(len(str(index)))]+str(index) index += 1 path = outputs_path + os.sep + filename + "." + format_output_extension #We should not try more than 99999... 
Well I don't see how we could get there.. Sanity check. if index >= 99999: errors = get_msgs_for_code_list([("ERR_BIBFORMAT_NB_OUTPUTS_LIMIT_REACHED", code)], file='error', ln=cdslang) register_errors(errors, 'error') sys.exit("Output format cannot be named as %s"%code) return (filename + "." + format_output_extension, filename) def clear_caches(): """ Clear the caches (Output Format, Format Templates and Format Elements) """ global format_templates_cache, format_elements_cache , format_outputs_cache, kb_mappings_cache format_templates_cache = {} format_elements_cache = {} format_outputs_cache = {} kb_mappings_cache = {} class BibFormatObject: """ An object that encapsulates a record and associated methods, and that is given as parameter to all format elements 'format' function. The object is made specifically for a given formatting, i.e. it includes for example the language for the formatting. The object provides basic accessors to the record. For full access, one can get the record with get_record() and then use BibRecord methods on the returned object. """ #The record record = None #The language in which the formatting has to be done lang = cdslang #A list of string describing the context in which the record has to be formatted. #It represents the words of the user request in web interface search search_pattern = [] #The id of the record recID = 0 #The user id of the person who will view the formatted page (if applicable) #This allows for example to print a "edit record" link for people #who have right to edit a record. uid = None def __init__(self, recID, ln=cdslang, search_pattern=[], xml_record=None, uid=None): """ Creates a new bibformat object, with given record. You can either specify an record ID to format, or give its xml representation. if 'xml_record' != None, use 'xml_record' instead of recID for the record. 'uid' allows to grant access to some functionalities on a page depending on the user's priviledges. 
@param recID the id of a record @param ln the language in which the record has to be formatted @param search_pattern list of string representing the request used by the user in web interface @param xml_record a xml string of the record to format @param uid the user id of the person who will view the formatted page """ if xml_record != None: #If record is given as parameter self.record = create_record(xml_record)[0] recID = record_get_field_value(self.record,"001") self.lang = wash_language(ln) self.search_pattern = search_pattern self.recID = recID self.uid = uid def get_record(self): """ Returns the record of this BibFormatObject instance @return the record structure as returned by BibRecord """ #Create record if necessary if self.record == None: record = create_record(record_get_xml(self.recID, 'xm')) self.record = record[0] return self.record def control_field(self, tag): """ Returns the value of control field given by tag in record @param record the record to retrieve values from @param tag the marc code of a field @return value of field tag in record """ if self.get_record() == None: #Case where BibRecord could not parse object return '' p_tag = parse_tag(tag) return record_get_field_value(self.get_record(), p_tag[0], p_tag[1], p_tag[2], p_tag[3]) def field(self, tag): """ Returns the value of the field corresponding to tag in the current record. if the value does not exist, return empty string @param record the record to retrieve values from @param tag the marc code of a field @return value of field tag in record """ list_of_fields = self.fields(tag) if len(list_of_fields) > 0: return list_of_fields[0] else: return "" def fields(self, tag): """ Returns the list of values corresonding to "tag". If tag has an undefined subcode (such as 999C5), the function returns a list of dictionaries, whoose keys are the subcodes and the values are the values of tag.subcode. If the tag has a subcode, simply returns list of values corresponding to tag. 
@param record the record to retrieve values from @param tag the marc code of a field @return values of field tag in record """ if self.get_record() == None: #Case where BibRecord could not parse object return [] p_tag = parse_tag(tag) if p_tag[3] != "": #Subcode has been defined. Simply returns list of values return record_get_field_values(self.get_record(), p_tag[0], p_tag[1], p_tag[2], p_tag[3]) else: #Subcode is undefined. Returns list of dicts. #However it might be the case of a control field. list_of_dicts = [] instances = record_get_field_instances(self.get_record(), p_tag[0], p_tag[1], p_tag[2]) for instance in instances: instance_dict = dict(instance[0]) list_of_dicts.append(instance_dict) return list_of_dicts def kb(self, kb, string, default=""): """ Returns the value of the "string" in the knowledge base "kb". If kb does not exist or string does not exist in kb, returns 'default' string or empty string if not specified. @param kb a knowledge base name @param string the string we want to translate @param default a default value returned if 'string' not found in 'kb' """ if string == None: return default val = get_kb_mapping(kb, string, default) if val == None: return default else: return val def bf_profile(): """ Runs a benchmark """ for i in range(50): format_record(i, "HD", ln=cdslang, verbose=9, search_pattern=[]) return if __name__ == "__main__": import profile import pstats bf_profile() profile.run('bf_profile()', "bibformat_profile") p = pstats.Stats("bibformat_profile") p.strip_dirs().sort_stats("cumulative").print_stats() diff --git a/modules/bibformat/lib/bibformat_templates.py b/modules/bibformat/lib/bibformat_templates.py index 5301ebb10..0be65bc43 100644 --- a/modules/bibformat/lib/bibformat_templates.py +++ b/modules/bibformat/lib/bibformat_templates.py @@ -1,2044 +1,2063 @@ # -*- coding: utf-8 -*- ## $Id$ ## Administration of BibFormat config files ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005 CERN. 
## ## The CDSware is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## The CDSware is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDSware; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """HTML Templates for BibFormat administration""" __lastupdated__ = """$Date$""" # non Invenio imports import cgi # Invenio imports from invenio.messages import gettext_set_language from invenio.textutils import indent_text from invenio.config import weburl, sweburl from invenio.messages import language_list_long from invenio.config import php class Template: """Templating class, refer to bibformat.py for examples of call""" def tmpl_admin_index(self, ln, warnings, is_admin): """ Returns the main BibFormat admin page. @param ln language @param warnings a list of warnings to display at top of page. None if no warning @param is_admin indicate if user is authorized to use BibFormat @return main BibFormat admin page """ _ = gettext_set_language(ln) # load the right message language out = '' if warnings: out += '''

%(warnings)s

''' % {'warnings': '
'.join(warnings)} if php: # If PHP enabled, old bibformat can still run comment_on_php_admin_interface = '''
For some time the old BibFormat will still run alongside the new one, so that you can transition smoothly (see the old Admin Interface further below).
''' out += '''

BibFormat has changed!
You will need to migrate your old formats if you are not a new user. You can read the documentation to learn how to write formats, or use the migration assistant.
%(comment_on_php_admin_interface)s

''' % {'weburl':weburl, 'comment_on_php_admin_interface':comment_on_php_admin_interface} out += '''
This is where you can edit the formatting styles available for the records. ''' if not is_admin: out += '''You need to log in to enter. ''' % {'weburl':weburl} out += '''

Manage Format Templates

Define how to format a record.

Manage Output Formats

Define which template is applied to which record for a given output.

Manage Knowledge Bases

Define mappings of values, for standardizing records or declaring often used values.

Format Elements Documentation

Documentation of the format elements to be used inside format templates.

BibFormat Admin Guide

Documentation about BibFormat administration

'''% {'weburl':weburl, 'ln':ln} if php: #Show PHP admin only if PHP is enabled out += '''

Old BibFormat admin interface (in gray box)

The BibFormat admin interface enables you to specify how the bibliographic data is presented to the end user in the search interface and search results pages. For example, you may specify that titles should be printed in bold font, the abstract in small italic, etc. Moreover, BibFormat is not only a simple bibliographic data output formatter, but also an automated link constructor. For example, from the information on journal name and pages, it may automatically create links to the publisher's site based on some configuration rules.
Configuring BibFormat

By default, a simple HTML format based on the most common fields (title, author, abstract, keywords, fulltext link, etc) is defined. You will certainly want to define your own output formats if you have a specific metadata structure.
Here is a short guide to what you can configure:

Behaviours
Define one or more output BibFormat behaviours. These are then passed as parameters to the BibFormat modules while executing formatting.
Example: You can tell BibFormat that it has to enrich the incoming metadata file with the created format, or that it only has to print the format out.
Extraction Rules
Define how the metadata tags from input are mapped into internal BibFormat variable names. The variable names can afterwards be used in formatting and linking rules.
Example: You can specify that the 100 $a field should be mapped into the $100.a internal variable, which you can use later.
Link Rules
Define rules for automated creation of URI links from mapped internal variables.
Example: You can tell a rule how to create a link to the People database out of the $100.a internal variable representing the author's name. (The $100.a variable was mapped in the previous step, see the Extraction Rules.)
File Formats
Define file format types based on file extensions. This will be used when proposing various fulltext services.
Example: You can tell that *.pdf files will be treated as PDF files.
User Defined Functions (UDFs)
Define your own functions that you can reuse when creating your own output formats. This enables you to do complex formatting without ever touching the BibFormat core code.
Example: You can define a function that matches and extracts email addresses from a text file.
Formats
Define the output formats, i.e. how to create the output out of internal BibFormat variables that were extracted in a previous step. This is the functionality you would want to configure most of the time. It may reuse formats, user defined functions, knowledge bases, etc.
Example: You can tell that authors should be printed in italic, that if there are more than 10 authors only the first three should be printed, etc.
Knowledge Bases (KBs)
Define one or more knowledge bases that enable you to transform various forms of input data values into a unique standard form on output.
Example: You can tell that Phys Rev D and Physical Review D are both the same journal and that these names should be standardized to Phys Rev : D.
Execution Test
Enables you to test your formats on your sample data file. Useful when debugging newly created formats.

To learn more on BibFormat configuration, you can consult the BibFormat Admin Guide.
Running BibFormat

From the Web interface

Run the Reformat Records tool. This tool lets you update stored formats for bibliographic records.
It should normally be used after configuring BibFormat's Behaviours and Formats. When these are ready, you can choose to rebuild formats for selected collections or you can manually enter a search query and the web interface will accomplish all necessary formatting steps.
Example: You can request Photo collections to have their HTML brief formats rebuilt, or you can reformat all the records written by Ellis.
From the command-line interface

Consider having an XML MARC data file that is to be uploaded into CDS Invenio. (For example, it might have been harvested from other sources and processed via BibConvert.) Having configured BibFormat and its default output type behaviour, you would then run this file through BibFormat as follows:

$ bibformat < /tmp/sample.xml > /tmp/sample_with_fmt.xml

This would create the default HTML formats and would "enrich" the input XML data file with them. (You would then continue the upload procedure by calling BibUpload and BibWords in succession.)
Now consider a different situation. You would like to add two new formats, say "HTML portfolio" and "HTML captions", in order to nicely format multiple photographs in one page. Let us suppose that these two formats are called hp and hc and are already loaded in the collection_format table. (TODO: describe how this is done via WebAdmin.) You would then proceed as follows: first, you would prepare the corresponding output behaviours called HP and HC (TODO: note the uppercase!) that would not enrich the input file but would produce an XML file with only 001 and FMT tags. (This is in order to update only the formats, not the bibliographic information.) You would also prepare the corresponding formats at the same time. Second, you would launch the formatting as follows:

$ bibformat otype=HP,HC < /tmp/sample.xml > /tmp/sample_fmts_only.xml

This should give you an XML file containing only 001 and FMT tags. Finally, you would upload the formats:

$ bibupload < /tmp/sample_fmts_only.xml

and that's it. The new formats should now appear in WebSearch.
''' % {'weburl':weburl, 'ln':ln} return indent_text(out) def tmpl_admin_format_template_show_attributes(self, ln, name, description, filename, editable): """ Returns a page to change format template name and description @param ln language @param name the name of the format @param description the description of the format @param filename the filename of the template @param editable True if we let user edit, else False @return editor for 'format' """ _ = gettext_set_language(ln) # load the right message language out = "" out += '''

%(menu)s

0. %(close_editor)s 1. %(template_editor)s 2. %(modify_template_attributes)s 3. %(check_dependencies)s

''' % {'ln':ln, 'menu':_("Menu"), 'filename':filename, 'close_editor': _("Close Editor"), 'modify_template_attributes': _("Modify Template Attributes"), 'template_editor': _("Template Editor"), 'check_dependencies': _("Check Dependencies") } disabled = "" readonly = "" if not editable: disabled = 'disabled="disabled"' readonly = 'readonly="readonly"' out += '''

%(name)s attributes [?]

%(name_label)s:

%(description_label)s: %(description)s

''' % {"name": name, "description": description, 'ln':ln, 'filename':filename, 'disabled':disabled, 'readonly':readonly, 'description_label': _("Description"), 'name_label': _("Name"), 'update_format_attributes': _("Update Format Attributes"), 'weburl':weburl } return out def tmpl_admin_format_template_show_dependencies(self, ln, name, filename, output_formats, format_elements, tags): """ Shows the dependencies (on elements) of the given format. @param name the name of the template @param filename the filename of the template @param format_elements the elements (and list of tags in each element) this template depends on @param output_formats the output format that depend on this template @param tags the tags that are called by format elements this template depends on. """ _ = gettext_set_language(ln) # load the right message language out = '''

%(menu)s

0. %(close_editor)s 1. %(template_editor)s 2. %(modify_template_attributes)s 3. %(check_dependencies)s

Output Formats that use %(name)s Format Elements used by %(name)s* All Tags Called*

''' % {'ln':ln, 'filename':filename, 'menu': _("Menu"), 'close_editor': _("Close Editor"), 'modify_template_attributes': _("Modify Template Attributes"), 'template_editor': _("Template Editor"), 'check_dependencies': _("Check Dependencies"), 'name': name } #Print output formats if len(output_formats) == 0: out += '
No output format uses this format template.
' for output_format in output_formats: name = output_format['names']['generic'] filename = output_format['filename'] out += ''' %(name)s''' % {'filename':filename, 'name':name, 'ln':ln} if len(output_format['tags']) > 0: out += "("+", ".join(output_format['tags'])+")" out += "
" #Print format elements (and tags) out += '
' if len(format_elements) == 0: out += '
This format template uses no format element.
' for format_element in format_elements: name = format_element['name'] out += ''' %(name)s''' % {'name':"bfe_"+name.lower(), 'anchor':name.upper(), 'ln':ln} if len(format_element['tags']) > 0: out += "("+", ".join(format_element['tags'])+")" out += "
" #Print tags out += '
' if len(tags) == 0: out += '
This format template uses no tag.
' for tag in tags: out += '''%(tag)s
''' % { 'tag':tag} out += '''

*Note: Some tags linked with this format template might not be shown. Check manually. ''' return out def tmpl_admin_format_template_show(self, ln, name, description, code, filename, ln_for_preview, pattern_for_preview, editable, content_type_for_preview, content_types): """ Returns the editor for format templates. Edit 'format' @param ln language @param format the format to edit @param filename the filename of the template @param ln_for_preview the language for the preview (for bfo) @param pattern_for_preview the search pattern to be used for the preview (for bfo) @param editable True if we let user edit, else False @return editor for 'format' """ _ = gettext_set_language(ln) # load the right message language out = "" out += '''

%(menu)s

0. %(close_editor)s 1. %(template_editor)s 2. %(modify_template_attributes)s 3. %(check_dependencies)s

''' % {'ln': ln, 'filename': filename, 'menu': _("Menu"), 'label_show_doc': _("Show Documentation"), 'label_hide_doc': _("Hide Documentation"), 'close_editor': _("Close Editor"), 'modify_template_attributes': _("Modify Template Attributes"), 'template_editor': _("Template Editor"), 'check_dependencies': _("Check Dependencies"), 'weburl': sweburl or weburl } disabled = "" readonly = "" toolbar = """""" % (weburl, ln) if not editable: disabled = 'disabled="disabled"' readonly = 'readonly="readonly"' toolbar = '' #First column: template code and preview out += ''' ''' % {'code':code, 'ln':ln, 'weburl':weburl, 'filename':filename, 'ln_for_preview':ln_for_preview, 'pattern_for_preview':pattern_for_preview } #Second column Print documentation out += '''

Format template code

%(label_hide_doc)s

%(toolbar)s %(code)s

Preview

Content-type (MIME): Language: Search Pattern:

Elements Documentation

Search for:

''' % {'weburl':weburl, 'ln':ln} return out def tmpl_admin_format_template_show_short_doc(self, ln, format_elements): """ Prints the format element documentation in a condensed way to display inside format template editor. This page is different from others: it is displayed inside a tag in template tmpl_admin_format_template_show. @param ln language @param format_elements a list of format elements structures as returned by get_format_elements """ out = ''' <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <title>BibFormat Short Documentation of Format Elements</title> <link rel="stylesheet" href="%(weburl)s/img/cds.css"> <script src="%(weburl)s/admin/bibformat/js_quicktags.js" type="text/javascript"></script> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> </head> <body> <script type="text/javascript"> function toggle_visibility(element, show, r,g,b){ var children = element.childNodes var child for(x=0; x<children.length; x++){ if (children[x].id == 'params'){ child = children[x] } } if (show=='show'){ element.style.background='rgb(201, 218, 255)' element.style.cursor='pointer' child.style.display='' } else { element.style.background="rgb("+r+","+g+","+b+")" child.style.display='none' } } ///// FROM JS QuickTags /////// // Copyright (c) 2002-2005 Alex King // http://www.alexking.org/ // // Licensed under the LGPL license // http://www.gnu.org/copyleft/lesser.html function insertAtCursor(myField, myValue) { //IE support if (document.selection) { myField.focus(); sel = document.selection.createRange(); sel.text = myValue; } //MOZILLA/NETSCAPE support else if (myField.selectionStart || myField.selectionStart == '0') { var startPos = myField.selectionStart; var endPos = myField.selectionEnd; myField.value = myField.value.substring(0, startPos) + myValue + myField.value.substring(endPos, myField.value.length); } else { myField.value += myValue; } } ///// END FROM JS QuickTags ///// function 
insert_my_code_into_container(code){ var codeArea = parent.document.getElementById("code"); if (codeArea.readOnly == false){ //var clean_code = code.replace(=#,'="'); //clean_code = clean_code.replace(# ,'" '); insertAtCursor(codeArea, code); } } </script> ''' % {'weburl': sweburl or weburl} if len(format_elements) == 0: out += ''' <em>No format elements found</em> ''' else: line = 0 #Print elements doc for format_element in format_elements: format_attributes = format_element['attrs'] row_content = "" name = format_attributes['name'] description = format_attributes['description'] params = [x['name'] + '=\u0022'+x['default']+'\u0022' for x in format_attributes['params']] builtin_params = [x['name'] + '=\u0022'+x['default']+'\u0022' for x in format_attributes['builtin_params']] code = "<BFE_" + name + ' ' + ' '.join(builtin_params)+ ' ' + ' '.join(params) +"/>" if line % 2: row_content += '''<div onmouseover="toggle_visibility(this, 'show', 235, 247, 255);" onmouseout="toggle_visibility(this, 'hide', 235, 247, 255);" style="background-color: rgb(235, 247, 255);" onclick="insert_my_code_into_container('%s')" ><hr/>''' % code else: row_content += '''<div onmouseover="toggle_visibility(this, 'show', 255, 255, 255);" onmouseout="toggle_visibility(this, 'hide', 255, 255, 255);" onclick="insert_my_code_into_container('%s')" >''' % code params_names = "" for param in format_attributes['params']: params_names += "<b>"+param['name'] +'</b> ' row_content += ''' <code> <b><BFE_%(name)s/></b><br/></code> <small>%(description)s.</small> <div id="params" style="display:none;"> <ul> ''' % {'params_names':params_names, 'name':name, 'description':description} for param in format_attributes['params']: row_content += ''' <li><small><b>%(name)s</b>: %(description)s</small></li> ''' % {'name':param['name'], 'description':param['description']} for param in format_attributes['builtin_params']: row_content += ''' <li><small><b>%(name)s</b>: %(description)s</small></li> ''' % 
{'name':param['name'], 'description':param['description']} row_content += '</ul></div>' if line % 2: row_content += '''<hr/></div>''' else: row_content += '</div>' line += 1 out += row_content out += '''</body></html>''' return indent_text(out) def tmpl_admin_format_templates_management(self, ln, formats): """ Returns the management console for formats. Includes list of formats and associated administration tools. @param ln language @param formats a list of dictionaries with formats attributes @return format management console as html """ _ = gettext_set_language(ln) # load the right message language #top of the page and table header out = ''' <table class="admin_wvar" cellspacing="0"> <tr><th colspan="4" class="adminheaderleft">%(menu)s</th></tr> <tr> <td>0. <small>%(manage_format_templates)s</small> </td> <td>1. <small><a href="output_formats_manage?ln=%(ln)s">%(manage_output_formats)s</a></small> </td> <td>2. <small><a href="format_elements_doc?ln=%(ln)s">%(format_elements_documentation)s</a></small> </td> <td>3. <small><a href="kb_manage?ln=%(ln)s">%(manage_knowledge_bases)s</a></small> </td> </tr> </table> <p>From here you can create, edit or delete format templates. 
Have a look at the <a href="format_elements_doc?ln=%(ln)s">format elements documentation</a> to learn which elements you can use in your templates.</p> <table class="admin_wvar" width="95%%" cellspacing="0"> <tr> <th class="adminheaderleft" > </th> <th class="adminheaderleft" >%(name)s</th> <th class="adminheaderleft" >%(description)s</th> <th class="adminheaderleft" >%(status)s</th> <th class="adminheaderleft" >%(last_modification_date)s</th> <th class="adminheadercenter" >%(action)s   [<a href="%(weburl)s/admin/bibformat/guide.html#formatTemplates">?</a>]</th> </tr> ''' % {'name':_("Name"), 'description':_("Description"), 'menu': _("Menu"), 'status':_("Status"), 'last_modification_date':_("Last Modification Date"), 'action':_("Action"), 'ln':ln, 'manage_output_formats':_("Manage Output Formats"), 'manage_format_templates':_("Manage Format Templates"), 'format_elements_documentation':_("Format Elements Documentation"), 'manage_knowledge_bases':_("Manage Knowledge Bases"), 'weburl':weburl} #table content: formats names, description and buttons if len(formats) == 0: out += '''<tr> <td colspan="6" class="admintd" align="center"><em>No format</em></td> </tr>''' else: line = 0 for attrs in formats: filename = attrs['filename'] if filename == "": filename = " " name = attrs['name'] if name == "": name = " " description = attrs['description'] if description == "": description = " " last_mod_date = attrs['last_mod_date'] status = attrs['status'] disabled = "" if attrs['editable'] == False: disabled = 'disabled="disabled"' style = 'style="vertical-align: middle;' if line % 2: style = 'style="vertical-align: middle;background-color: rgb(235, 247, 255);' line += 1 row_content = '''<tr> <td class="admintdright" %(style)s"> </td> <td class="admintdleft" %(style)s white-space: nowrap;"><a href="format_template_show?bft=%(filename)s&ln=%(ln)s">%(name)s</a></td> <td class="admintdleft" %(style)s" >%(description)s</td> <td class="admintdleft" %(style)s white-space: nowrap;" 
>%(status)s</td> <td class="admintdleft" %(style)s white-space: nowrap;" >%(last_mod_date)s</td> <td class="admintd" %(style)s white-space: nowrap;"> <form method="post" action="format_template_delete?ln=%(ln)s&bft=%(filename)s"> <input class="adminbutton" type="submit" value="%(delete)s" %(disabled)s/> </form> </td> </tr> ''' % {'filename':filename, 'name':name, 'description':description, 'ln':ln, 'style':style, 'disabled':disabled, 'last_mod_date':last_mod_date, 'status':status, 'delete':_("Delete") } out += row_content #table footer, buttons and bottom of the page out += ''' <tr> <td align="left" colspan="3"> <form action="format_templates_manage?ln=%(ln)s"> <input type="hidden" name="checking" value="1"></input> <input class="adminbutton" type="submit" value="%(extensive_checking)s"/> </form> </td> <td align="right" colspan="3"> <form action="format_template_add?ln=%(ln)s"> <input class="adminbutton" type="submit" value="%(add_format_template)s"/> </form> </td> </tr> </table> ''' % {'ln':ln, 'add_format_template':_("Add New Format Template"), 'extensive_checking':_("Check Format Templates Extensively")} return indent_text(out) def tmpl_admin_output_formats_management(self, ln, output_formats): """ Returns the main management console for formats. Includes list of formats and associated administration tools. @param ln language @param output_formats a list of output formats @return main management console as html """ _ = gettext_set_language(ln) # load the right message language #top of the page and table header out = ''' <table class="admin_wvar" cellspacing="0"> <tr><th colspan="4" class="adminheaderleft">%(menu)s</th></tr> <tr> <td>0. <small><a href="format_templates_manage?ln=%(ln)s">%(manage_format_templates)s</a></small> </td> <td>1. <small>%(manage_output_formats)s</small> </td> <td>2. <small><a href="format_elements_doc?ln=%(ln)s">%(format_elements_documentation)s</a></small> </td> <td>3. 
<small><a href="kb_manage?ln=%(ln)s">%(manage_knowledge_bases)s</a></small> </td> </tr> </table> <p>From here you can add, edit or delete output formats available for collections. Output formats define which template to use. <br/>To edit templates go to the <a href="format_templates_manage?ln=%(ln)s">template administration page</a>.</p> <table class="admin_wvar" width="95%%" cellspacing="0"> <tr> <th class="adminheaderleft" > </th> <th class="adminheaderleft" ><a href="output_formats_manage?ln=%(ln)s&sortby=code">%(code)s</a></th> <th class="adminheaderleft" ><a href="output_formats_manage?ln=%(ln)s&sortby=name">%(name)s</a></th> <th class="adminheaderleft" >%(description)s</th> <th class="adminheaderleft" >%(status)s</th> <th class="adminheaderleft" >%(last_modification_date)s</th> <th class="adminheadercenter" >%(action)s   [<a href="%(weburl)s/admin/bibformat/guide.html#outputFormats">?</a>]</th> </tr> ''' % {'code':_("Code"), 'name':_("Name"), 'description':_("Description"), 'status':_("Status"), 'last_modification_date':_("Last Modification Date"), 'action':_("Action"), 'ln':ln, 'manage_output_formats':_("Manage Output Formats"), 'manage_format_templates':_("Manage Format Templates"), 'format_elements_documentation':_("Format Elements Documentation"), 'manage_knowledge_bases':_("Manage Knowledge Bases"), 'menu': _("Menu"), 'weburl':weburl} #table content: formats names, description and buttons if len(output_formats) == 0: out += '''<tr> <td colspan="5" class="admintd" align="center"><em>No format</em></td> </tr>''' else: line = 0 for output_format in output_formats: format_attributes = output_format['attrs'] name = format_attributes['names']['generic'] if name == "": name = " " description = format_attributes['description'] if description == "": description = " " code = format_attributes['code'] if code == "": code = " " last_mod_date = output_format['last_mod_date'] status = output_format['status'] disabled = "" if output_format['editable'] == False: 
disabled = 'disabled="disabled"' style = "vertical-align: middle;" if line % 2: style = 'vertical-align: middle; background-color: rgb(235, 247, 255);' line += 1 row_content = '''<tr> <td class="admintdright" style="%(style)s"> </td> <td class="admintdleft" style="white-space: nowrap; %(style)s"> <a href="output_format_show?bfo=%(code)s">%(code)s</a> </td> <td class="admintdleft" style="white-space: nowrap; %(style)s"> <a href="output_format_show?bfo=%(code)s">%(name)s</a> </td> <td class="admintdleft"style="%(style)s" > %(description)s </td> <td class="admintd" style="white-space: nowrap; %(style)s" >%(status)s</td> <td class="admintdleft" style="white-space: nowrap;%(style)s" >%(last_mod_date)s</td> <td class="admintd" style="white-space: nowrap; %(style)s"> <form method="POST" action="output_format_delete?ln=%(ln)s&bfo=%(code)s"> <input class="adminbutton" type="submit" value="Delete" %(disabled)s /> </form> </td> </tr> ''' % {'style':style, 'code':code, 'description':description, 'name':name, 'ln':ln, 'disabled':disabled, 'last_mod_date':last_mod_date, 'status':status} out += row_content #table footer, buttons and bottom of the page out += ''' <tr> <td align="right" colspan="7"> <form method="GET" action="output_format_add?ln=%(ln)s"> <input class="adminbutton" type="submit" value="%(add_output_format)s"/> </form> </td> </tr> </table> ''' % {'ln':ln, 'add_output_format':_("Add New Output Format")} return indent_text(out) def tmpl_admin_output_format_show(self, ln, code, name, rules, default, format_templates, editable): """ Returns the content of an output format rules is an ordered list of dict (sorted by evaluation order), with keys 'field', 'value' and 'template' IMPORTANT: we display rules evaluation index starting at 1 in interface, but we start internally at 0 @param ln language @param code the code of the output to show @param name the name of this output format @param rules the list of rules for this output format @param default the default format 
template of the output format @param format_templates the list of format_templates @param editable True if we let user edit, else False @return the management console for this output format """ _ = gettext_set_language(ln) out = ''' <table class="admin_wvar" cellspacing="0"> <tr><th colspan="4" class="adminheaderleft">%(menu)s</th></tr> <tr> <td>0. <small><a href="output_formats_manage?ln=%(ln)s">%(close_output_format)s</a></small> </td> <td>1. <small>%(rules)s</small> </td> <td>2. <small><a href="output_format_show_attributes?ln=%(ln)s&bfo=%(code)s">%(modify_output_format_attributes)s</a></small> </td> <td>3. <small><a href="output_format_show_dependencies?ln=%(ln)s&bfo=%(code)s">%(check_dependencies)s</a></small> </td> </tr> </table> <p>Define here the rules that specify which template to use for a given record.</p> ''' % {'code':code, 'ln':ln, 'menu':_("Menu"), 'close_output_format':_("Close Output Format"), 'rules':_("Rules"), 'modify_output_format_attributes':_("Modify Output Format Attributes"), 'check_dependencies':_("Check Dependencies") } out += ''' <form name="rules" action="output_format_show?ln=%(ln)s&bfo=%(code)s" method="post"> <table> <tr> <td> ''' % {'ln': ln, 'code':code} disabled = "" readonly = "" if not editable: disabled = 'disabled="disabled"' readonly = 'readonly="readonly"' if len(rules) == 0: out += '''<p align="center"><em>No special rule</em></p>''' line = 1 for rule in rules: out += ''' <table align="center" class="admin_wvar" cellspacing="0"> <tr> ''' out += ''' <td rowspan="2" class="adminheader" style="vertical-align: middle;">''' if line > 1: out += ''' <input type="image" src="%(weburl)s/img/smallup.gif" alt="Increase priority of rule %(row)s" name="+ %(row)s" value="+ %(row)s" %(disabled)s/> ''' % {'weburl':weburl, 'row':line, 'disabled':disabled} out += '''<div>%(row)s</div>''' % { 'row':line} if line < len(rules): out += ''' <input type="image" src="%(weburl)s/img/smalldown.gif" alt="Decrease priority of rule %(row)s" 
name="- %(row)s" value="- %(row)s" %(disabled)s/> ''' % {'weburl':weburl, 'row':line, 'disabled':disabled} out += '''</td> <td class="adminheaderleft"> </td> ''' out += ''' <td class="adminheaderleft" style="white-space: nowrap;"> Use template <select name="r_tpl" %(disabled)s>''' % {'disabled':disabled} for template in format_templates: attrs = format_templates[template]['attrs'] attrs['template'] = template if template != rule['template']: out += '''<option value="%(template)s">%(name)s</option>''' % attrs else: out += '''<option value="%(template)s" selected="selected">%(name)s</option>''' % attrs if not format_templates.has_key(rule['template']) and rule['template'] != "": - #case where a non existing format tempate is use in output format + #case where a non existing format template is use in output format #we need to add it as option out += '''<option value="%s" selected="selected">%s</option>''' % (rule['template'], rule['template']) + ################ FIXME remove when migration is done #################### + #Let the user choose a non existing template, that is a placeholder + #meaning that the template has not been migrated + selected = '' + if rule['template'] == 'migration_in_progress': + selected = 'selected="selected"' + out += '''<option value="migration_in_progress" %s>defined in old BibFormat</option>''' % selected + ################ END FIXME #################### + out += '''</select> if field  <input type="text" name="r_fld" value="%(field)s" size="10" %(readonly)s/> is equal to <input type="text" value="%(value)s" name="r_val" %(readonly)s/> </td> <td class="adminheaderright" style="vertical-align: middle;">  [<a href="%(weburl)s/admin/bibformat/guide.html#rulesOutputFormat">?</a>] </td> </tr> ''' % {'weburl':weburl, 'field': rule['field'], 'value':rule['value'], 'readonly':readonly} out += ''' <tr> <td colspan ="3" class="adminheaderright" style="vertical-align: middle; white-space: nowrap;"> <input type="submit" class="adminbutton" 
name="r_upd" value="%(remove_rule_label)s %(row)s" %(disabled)s/>  </td> </tr> </table> ''' % {'remove_rule_label': _("Remove Rule"), 'row':line, 'disabled':disabled} line += 1 out += ''' <table width="100%" align="center" class="admin_wvar" cellspacing="0"> <tr> ''' out += ''' <td width="30" class="adminheaderleft"> </td> <td class="adminheaderleft">By default use <select id="default" name="default" %(disabled)s>''' % {'disabled':disabled} for template in format_templates: attrs = format_templates[template]['attrs'] attrs['template'] = template if template != default: out += '''<option value="%(template)s">%(name)s</option>''' % attrs else: out += '''<option value="%(template)s" selected="selected">%(name)s</option>''' % attrs if not format_templates.has_key(default) and default!= "": #case where a non existing format tempate is use in output format #we need to add it as option (only if it is not empty string) out += '''<option value="%s" selected="selected">%s</option>''' % (default,default) + + ################ FIXME remove when migration is done #################### + #Let the user choose a non existing template, that is a placeholder + #meaning that the template has not been migrated + selected = '' + if default == 'migration_in_progress': + selected = 'selected="selected"' + out += '''<option value="migration_in_progress" %s>defined in old BibFormat</option>''' % selected + ################ END FIXME #################### + out += '''</select></td> </tr> </table> <div align="right"> <input tabindex="6" class="adminbutton" type="submit" name="r_upd" value="%(add_new_rule_label)s" %(disabled)s/> <input tabindex="7" class="adminbutton" type="submit" name="r_upd" value="%(save_changes_label)s" %(disabled)s/> </div> </td> </tr> </table> </form> ''' % {'add_new_rule_label':_("Add New Rule"), 'save_changes_label':_("Save Changes"), 'disabled':disabled } return indent_text(out) def tmpl_admin_output_format_show_attributes(self, ln, name, description, content_type, 
code, names_trans, editable): """ Returns a page to change output format name and description names_trans is an ordered list of dicts with keys 'lang' and 'trans' @param ln language @param name the name of the format @param description the description of the format @param code the code of the format @param content_type the (MIME) content type of the output format @param names_trans the translations in the same order as the languages from get_languages() @param editable True if we let user edit, else False @return editor for output format attributes """ _ = gettext_set_language(ln) # load the right message language out = "" out += ''' <table class="admin_wvar" cellspacing="0"> <tr><th colspan="4" class="adminheaderleft">%(menu)s</th></tr> <tr> <td>0. <small><a href="output_formats_manage?ln=%(ln)s">%(close_output_format)s</a></small> </td> <td>1. <small><a href="output_format_show?ln=%(ln)s&bfo=%(code)s">%(rules)s</a></small> </td> <td>2. <small>%(modify_output_format_attributes)s</small> </td> <td>3. 
<small><a href="output_format_show_dependencies?ln=%(ln)s&bfo=%(code)s">%(check_dependencies)s</a></small> </td> </tr> </table><br/> ''' % {'ln':ln, 'code':code, 'close_output_format':_("Close Output Format"), 'rules':_("Rules"), 'modify_output_format_attributes':_("Modify Output Format Attributes"), 'check_dependencies':_("Check Dependencies"), 'menu':_("Menu") } disabled = "" readonly = "" if not editable: disabled = 'disabled="disabled"' readonly = 'readonly="readonly"' out += ''' <form action="output_format_update_attributes?ln=%(ln)s&bfo=%(code)s" method="POST"> <table class="admin_wvar" cellspacing="0"> <tr> <th colspan="2" class="adminheaderleft"> Output Format Attributes [<a href="%(weburl)s/admin/bibformat/guide.html#attrsOutputFormat">?</a>]</th> </tr> <tr> <td class="admintdright"><label for="outputFormatCode">Code</label>: </td> <td><input tabindex="1" name="code" type="text" id="outputFormatCode" maxlength="6" size="6" value="%(code)s" %(readonly)s/></td> </tr> <td class="admintdright"><label for="outputFormatContentType">Content type</label>: </td> <td><input tabindex="2" name="content_type" type="text" id="outputFormatContentType" size="25" value="%(content_type)s" %(readonly)s/> <small>Mime content-type. 
Specifies how the browser should handle this output.</small></td> <tr> <td class="admintdright"><label for="outputFormatName">Name</label>: </td> <td><input tabindex="3" name="name" type="text" id="outputFormatName" size="25" value="%(name)s" %(readonly)s/></td> </tr> ''' % {'name': name, 'ln':ln, 'code':code, 'content_type':content_type, 'readonly':readonly, 'weburl':weburl} #Add translated names i = 3 for name_trans in names_trans: i += 1 out += ''' <tr> <td class="admintdright"><label for="outputFormatName%(i)s">%(lang)s Name</label>: </td> <td><input tabindex="%(i)s" name="names_trans" type="text" id="outputFormatName%(i)s" size="25" value="%(name)s" %(readonly)s/></td> </tr>''' % {'name':name_trans['trans'], 'lang':name_trans['lang'], 'i':i, 'readonly':readonly} #Description and end of page out += ''' <tr> <td class="admintdright" valign="top"><label for="outputFormatDescription">Description</label>: </td> <td><textarea tabindex="%(tabindexdesc)s" name="description" id="outputFormatDescription" rows="4" cols="25" %(readonly)s>%(description)s</textarea> </td> </tr> <tr> <td colspan="2" align="right"><input tabindex="%(tabindexbutton)s" class="adminbutton" type="submit" value="Update Output Format Attributes" %(disabled)s/></td> </tr> </table> </form> ''' % {'description': description, 'tabindexdesc': i + 1, 'tabindexbutton': i + 2, 'readonly':readonly, 'disabled':disabled} return out def tmpl_admin_output_format_show_dependencies(self, ln, name, code, format_templates): """ Shows the dependencies of the given format. @param name the name of the output format @param code the code of the output format @param format_templates format templates that depend on this format (and also elements and tags) """ _ = gettext_set_language(ln) # load the right message language out = ''' <table class="admin_wvar"> <tr><th colspan="4" class="adminheaderleft" cellspacing="0">%(menu)s</th></tr> <tr> <td>0. 
<small><a href="output_formats_manage?ln=%(ln)s">%(close_output_format)s</a></small> </td> <td>1. <small><a href="output_format_show?ln=%(ln)s&bfo=%(code)s">%(rules)s</a></small> </td> <td>2. <small><a href="output_format_show_attributes?ln=%(ln)s&bfo=%(code)s">%(modify_output_format_attributes)s</a></small> </td> <td>3. <small>%(check_dependencies)s</small> </td> </tr> </table><br/> <table width="90%%" class="admin_wvar" cellspacing="0"><tr> <th class="adminheaderleft">Output Formats that use %(name)s</th> <th class="adminheaderleft">Format Elements used by %(name)s</th> <th class="adminheaderleft">Tags Called*</th> </tr> ''' % {'name': name, 'code': code, 'ln':ln, 'close_output_format':_("Close Output Format"), 'rules':_("Rules"), 'modify_output_format_attributes':_("Modify Output Format Attributes"), 'check_dependencies':_("Check Dependencies"), 'menu': _("Menu") } if len(format_templates) == 0: out += '''<tr><td colspan="3"><p align="center"> <i>This output format uses no format template.</i></p></td></tr>''' for format_template in format_templates: name = format_template['name'] filename = format_template['filename'] out += '''<tr><td><a href="format_template_show?bft=%(filename)s&ln=%(ln)s">%(name)s</a></td> <td>&nbsp</td><td>&nbsp</td></tr>''' % {'filename':filename, 'name':name, 'ln':ln} for format_element in format_template['elements']: name = format_element['name'] filename = format_element['filename'] out += '''<tr><td>&nbsp</td> <td><a href="format_elements_doc?ln=%(ln)s#%(anchor)s">%(name)s</a></td> <td>&nbsp</td></tr>''' % {'anchor':name.upper(), 'name':name, 'ln':ln} for tag in format_element['tags']: out += '''<tr><td>&nbsp</td><td>&nbsp</td> <td>%(tag)s</td></tr>''' % {'tag':tag} out += ''' </table> <b>*Note</b>: Some tags linked with this format template might not be shown. Check manually. ''' return out def tmpl_admin_format_elements_documentation(self, ln, format_elements): """ Returns the main management console for format elements. 
Includes list of format elements and associated administration tools. @param ln language @param format_elements a list of dictionaries with format elements attributes @return main management console as html """ _ = gettext_set_language(ln) # load the right message language #top of the page and table header out = ''' <table class="admin_wvar" cellspacing="0"> <tr><th colspan="4" class="adminheaderleft">%(menu)s</th></tr> <tr> <td>0. <small><a href="format_templates_manage?ln=%(ln)s">%(manage_format_templates)s</a></small> </td> <td>1. <small><a href="output_formats_manage?ln=%(ln)s">%(manage_output_formats)s</a></small> </td> <td>2. <small>%(format_elements_documentation)s</small> </td> <td>3. <small><a href="kb_manage?ln=%(ln)s">%(manage_knowledge_bases)s</a></small> </td> </tr> </table> <p>Here you can read the APIs of the format elements, the elementary bricks for formats.</p> ''' % {'ln':ln, 'menu': _("Menu"), 'manage_output_formats':_("Manage Output Formats"), 'manage_format_templates':_("Manage Format Templates"), 'format_elements_documentation':_("Format Elements Documentation"), 'manage_knowledge_bases':_("Manage Knowledge Bases") } #table content: formats names, description and actions if len(format_elements) == 0: out += ''' <em>No format elements found</em> ''' else: #Print summary of elements (name + description) out += '''<h2>Summary table of elements</h2>''' out += '''<table width="90%">''' for format_element in format_elements: format_attributes = format_element['attrs'] out += ''' <tr> <td> <code><a href="#%(name)s"><BFE_%(name)s/></a></code> </td> <td> %(description)s </td> </tr> ''' % format_attributes out += "</table>" #Print details of elements out += '''<h2>Details of elements</h2>''' for format_element in format_elements: format_attributes = format_element['attrs'] element_name = format_attributes['name'] out += self.tmpl_admin_print_format_element_documentation(ln, element_name, format_attributes) #table footer, buttons and bottom of the page out += 
''' <table align="center" width="95%"> </table>''' return indent_text(out) def tmpl_admin_print_format_element_documentation(self, ln, name, attributes, print_see_also=True): """ Prints the formatted documentation of a single element @param ln language @param name the name of the element @param attributes the attributes of the element, as returned by get_format_element_attrs_from_* @param print_see_also if True, prints links to other sections related to element """ params_names = "" for param in attributes['params']: params_names += "<b>"+param['name'] +'</b>="..." ' out = ''' <a name="%(name)s"></a><h3>%(name)s</h3> <b><BFE_%(name)s</b> %(params_names)s<b>/></b><br/><br/>       <em>%(description)s.</em><br/><br/>       <b>Parameters:</b><br/> ''' % {'params_names': params_names, 'name':name, 'description': attributes['description']} for param in attributes['params']: out += '''             <code>%(name)s</code> - %(description)s. ''' % param if param['default'] != "": default = cgi.escape(param['default']) if default.strip() == "": default = " " out += ''' Default value is «<code>%s</code>» ''' % default out += '<br/>' for param in attributes['builtin_params']: out += '''             <code>%(name)s</code> - %(description)s. 
''' % param if param['default'] != "": default = cgi.escape(param['default']) if default.strip() == "": default = " " out += ''' Default value is «<code>%s</code>» ''' % default out += '<br/>' if print_see_also: out += '''<br/>        <b>See also:</b><br/>''' for element in attributes['seealso']: element_name = element.split('.')[0].upper() out += '''             <a href="#%(name)s">Element <em>%(name)s</em></a><br/>''' % {'name':element_name} out += '''             <a href ="format_element_show_dependencies?ln=%(ln)s&bfe=%(bfe)s">Dependencies of this element</a><br/>             <a href ="validate_format?ln=%(ln)s&bfe=%(bfe)s">The correctness of this element</a><br/>             <a href ="format_element_test?ln=%(ln)s&bfe=%(bfe)s">Test this element</a><br/> ''' % {'ln':ln, 'bfe':name} return out def tmpl_admin_format_element_show_dependencies(self, ln, name, format_templates, tags): """ Shows the dependencies of the given format element @param name the name of the element @param format_templates format templates that depend on this element @param tags the tags that are called by this format element """ out = ''' <p>Go back to <a href="format_elements_doc?ln=%(ln)s#%(name)s">documentation</a></p> ''' % {'ln':ln, 'name':name.upper()} out += ''' <table width="90%" class="admin_wvar" cellspacing="0"><tr>''' out += ''' <th class="adminheaderleft">Format Templates that use %(name)s</th> <th class="adminheaderleft">Tags Called*</th> </tr> <tr> <td> <br/>''' % {"name": name} #Print format elements (and tags) if len(format_templates) == 0: out += '''<p align="center"> <i>This format element is not used in any format template.</i></p>''' for format_template in format_templates: name = format_template['name'] filename = format_template['filename'] out += '''<a href="format_template_show?ln=%(ln)s&bft=%(filename)s">%(name)s</a><br/>''' % {'filename':filename, 'name':name, 'ln':ln} #Print tags out += "</td><td> <br/>" if len(tags) == 0: out += '''<p align="center"> <i>This 
format element uses no tag.</i></p>''' for tag in tags: out += '''%(tag)s<br/>''' % {'tag':tag} out += ''' </td> </tr> </table> <b>*Note</b>: Some tags linked with this format template might not be shown. Check manually. ''' return out def tmpl_admin_format_element_test(self, ln, bfe, description, param_names, param_values, param_descriptions, result): """ Prints a page where the user can test the given format element with his own parameters. @param ln language @param bfe the format element name @param description a description of the element @param param_names a list of parameters names/labels @param param_values a list of values for parameters @param param_descriptions a list of description for parameters @param result the result of the evaluation """ out = ''' <p>Go back to <a href="format_elements_doc?ln=%(ln)s#%(name)s">documentation</a></p> ''' % {'ln':ln, 'name':bfe.upper()} out += ''' <h3><BFE_%(bfe)s /></h3> <p>%(description)s</p> <table width="100%%"><tr><td> <form method="post" action="format_element_test?ln=%(ln)s&bfe=%(bfe)s"> <table> ''' % {'bfe':bfe, 'ln':ln, 'description':description } for i in range(len(param_names)): out += ''' <tr> <td class="admintdright">%(name)s</td> <td class="admintdright"><input type="text" name="param_values" value="%(value)s"/></td> <td class="admintdleft">%(description)s </td> </tr> ''' % {'name':param_names[i], 'value':param_values[i], 'description':param_descriptions[i]} out += ''' <tr><td colspan="2" class="admintdright"><input type="submit" class="adminbutton" value="Test!"/></td> <td> </td> </tr> </table> </form> <fieldset style="display:inline;margin-left:auto;margin-right:auto;"> <legend>Result:</legend>%(result)s</fieldset> ''' % {'result':result} out += ''' </td></tr><tr><td> ''' #out += self.tmpl_admin_print_format_element_documentation(ln, bfe, attributes, False) out += '''</td></tr></table>''' return out def tmpl_admin_add_format_element(self, ln): """ Shows how to add a format element (mainly doc) @param ln 
language """ _ = gettext_set_language(ln) # load the right message language out = ''' <p>To add a new basic element (one that only fetches the value of a field, without special post-processing), go to the <a href="%(weburl)s/admin/bibindex/bibindexadmin.py/field">BibEdit "Manage Logical Fields"</a> page and add a name for a field. Make sure that the name is unique and corresponds well to the field. For example, to add an element that fetches the value of field 245__%%, add a new logical field with name "title" and field "245__%%". Then in your template, call BFE_TITLE to print the title.</p> <p>To add a new complex element (e.g. special formatting of the field, condition on the value, etc.) you must go to the lib/python/invenio/bibformat_elements directory of your Invenio installation, and add a new format element file. Read documentation for more information.</p> ''' % {'weburl':weburl} return out def tmpl_admin_kbs_management(self, ln, kbs): """ Returns the main management console for knowledge bases. @param ln language @param kbs a list of dictionaries with knowledge bases attributes @return main management console as html """ _ = gettext_set_language(ln) # load the right message language #top of the page and table header out = ''' <table class="admin_wvar" cellspacing="0"> <tr><th colspan="4" class="adminheaderleft">%(menu)s</th></tr> <tr> <td>0. <small><a href="format_templates_manage?ln=%(ln)s">%(manage_format_templates)s</a></small> </td> <td>1. <small><a href="output_formats_manage?ln=%(ln)s">%(manage_output_formats)s</a></small> </td> <td>2. <small><a href="format_elements_doc?ln=%(ln)s">%(format_elements_documentation)s</a></small> </td> <td>3. 
<small>%(manage_knowledge_bases)s</small> </td> </tr> </table> <table class="admin_wvar" width="95%%" cellspacing="0"> <tr> <th class="adminheaderleft" > </th> <th class="adminheaderleft" >Name</th> <th class="adminheaderleft" >Description</th> <th class="adminheadercenter" >Action   [<a href="%(weburl)s/admin/bibformat/guide.html#KBs">?</a>]</th> </tr>''' % {'ln':ln, 'menu':_("Menu"), 'manage_output_formats':_("Manage Output Formats"), 'manage_format_templates':_("Manage Format Templates"), 'format_elements_documentation':_("Format Elements Documentation"), 'manage_knowledge_bases':_("Manage Knowledge Bases"), 'weburl':weburl} #table content: kb names, description and actions if len(kbs) == 0: out += '''<tr> <td colspan="5" class="admintd" align="center"><em>No Knowledge Base</em></td> </tr>''' else: line = 0 for kb_attributes in kbs : kb_attributes['style'] = "" if line % 2: kb_attributes['style'] = 'background-color: rgb(235, 247, 255);' line += 1 kb_attributes['ln'] = ln kb_attributes['weburl'] = weburl row_content = '''<tr> <td class="admintdright" style="vertical-align: middle; %(style)s"> </td> <td class="admintdleft" style="vertical-align: middle; %(style)s white-space: nowrap;"><a href="kb_show?ln=%(ln)s&amp;kb=%(id)s">%(name)s</a></td> <td class="admintdleft"style="vertical-align: middle; %(style)s">%(description)s</td> <td class="admintd" style="vertical-align: middle; %(style)s white-space: nowrap;"> <form action="kb_delete?ln=%(ln)s" type="POST"> <input type="submit" class="adminbutton" value="Delete"> <input type="hidden" id="kb" name="kb" value="%(id)s"> </form> </td> </tr> ''' % kb_attributes out += row_content #table footer, buttons and bottom of the page out += ''' </table> <table align="center" width="95%"> <tr> <td align="left" valign="top"> </td> ''' out += ''' <td align="right"> <form action="kb_add?ln=%(ln)s"> <input class="adminbutton" type="submit" value="Add New Knowledge Base"/> </form> </td> </tr> </table>''' % {'ln': ln} return 
indent_text(out) def tmpl_admin_kb_show(self, ln, kb_id, kb_name, mappings, sortby): """ Returns the content of a knowledge base. @param ln language @param kb_id the id of the kb @param kb_name the name of the kb @param content a list of dictionaries with mappings @param sortby the sorting criteria ('from' or 'to') @return main management console as html """ _ = gettext_set_language(ln) # load the right message language #top of the page and main table that split screen in two parts out = ''' <table class="admin_wvar" cellspacing="0"> <tr><th colspan="4" class="adminheaderleft">%(menu)s</th></tr> <tr> <td>0. <small><a href="kb_manage?ln=%(ln)s&sortby=%(sortby)s">%(close)s</a></small> </td> <td>1. <small>%(mappings)s</small> </td> <td>2. <small><a href="kb_show_attributes?ln=%(ln)s&kb=%(kb_id)s&sortby=%(sortby)s">%(attributes)s</a></small> </td> <td>3. <small><a href="kb_show_dependencies?ln=%(ln)s&kb=%(kb_id)s&sortby=%(sortby)s">%(dependencies)s</a></small> </td> </tr> </table> ''' % {'ln':ln, 'kb_id':kb_id, 'sortby':sortby, 'close': _("Close Editor"), 'mappings': _("Knowledge Base Mappings"), 'attributes':_("Knowledge Base Attributes"), 'dependencies':_("Knowledge Base Dependencies"), 'menu': _("Menu")} out += ''' <p>Here you can add new mappings to this base and change the base attributes.</p> <table width="100%" align="center"> <tr> ''' #First column of table: add mapping form out += ''' <td width="300" valign="top"> <form name="addNewMapping" action="kb_add_mapping?ln=%(ln)s&kb=%(kb_id)s&sortby=%(sortby)s" method="post">''' % {'ln':ln, 'kb_id':kb_id, 'sortby':sortby} out += ''' <table class="admin_wvar" width="100%%" cellspacing="0"> <tr> <th colspan="2" class="adminheaderleft">Add New Mapping  [<a href="%(weburl)s/admin/bibformat/guide.html#addMappingKB">?</a>]</th> </tr> <tr> <td class="admintdright"><label for="mapFrom"><span style="white-space: nowrap;">Map From</span></label>: </td> <td><input tabindex="1" name="mapFrom" type="text" id="mapFrom" 
size="25"/></td> </tr> <tr> <td class="admintdright"><label for="mapTo">To</label>: </td> <td><input tabindex="2" name="mapTo" type="text" id="mapTo" size="25"/></td> </tr> <tr> <td colspan="2" align="right"><input tabindex="3" class="adminbutton" type="submit" value="Add new Mapping"/></td> </tr> </table> </form> </td> ''' % {'weburl':weburl} #Second column: mappings table #header and footer out += ''' <td valign="top"> <table class="admin_wvar"> <thead> <tr> <th class="adminheaderleft" width="25"> </th> <th class="adminheaderleft" width="34%%"><a href="kb_show?ln=%(ln)s&kb=%(kb_id)s&sortby=from">Map From</a></th> <th class="adminheaderleft"> </th> <th class="adminheaderleft" width="34%%"><a href="kb_show?ln=%(ln)s&kb=%(kb_id)s&sortby=to">To</a></th> <th class="adminheadercenter" width="25%%">Action   [<a href="%(weburl)s/admin/bibformat/guide.html#removeMappingKB">?</a>]</th> </tr> </thead> <tfoot> <tr> <td colspan="5"> </td> </tr> </tfoot> <tbody> ''' % {'ln':ln, 'kb_id':kb_id, 'weburl':weburl} #table content: key, value and actions if len(mappings) == 0: out += ''' <tr> <td colspan="5" class="admintd" align="center"><em>Knowledge base is empty</em></td> </tr></tbody>''' else: line = 0 tabindex_key = 6 tabindex_value = 7 tabindex_save_button = 8 for mapping in mappings: style = "vertical-align: middle;" if line % 2: style += 'background-color: rgb(235, 247, 255);' line += 1 tabindex_key += 3 tabindex_value += 3 tabindex_save_button += 3 row_content = ''' <tr> <td colspan="5"> <form action="kb_edit_mapping?ln=%(ln)s&kb=%(kb_id)s&sortby=%(sortby)s" name="%(key)s" method="post"> <table> <tr> <td class="admintdright" style="%(style)s" width="5">   <input type="hidden" name="key" value="%(key)s"/> </td> <td class="admintdleft" style="%(style)s"> <input type="text" name="mapFrom" size="30" maxlength="255" value="%(key)s" tabindex="%(tabindex_key)s"/> </td> <td class="admintdleft" style="%(style)s white-space: nowrap;" width="5">=></td> <td 
class="admintdleft"style="%(style)s"> <input type="text" name="mapTo" size="30" value="%(value)s" tabindex="%(tabindex_value)s"> </td> <td class="admintd" style="%(style)s white-space: nowrap;"> <input class="adminbutton" type="submit" name="update" value="Save" tabindex="%(tabindex_save_button)s"/> <input class="adminbutton" type="submit" name="delete"value="Delete"/></td> </tr></table></form></td></tr> ''' % {'key': mapping['key'], 'value':mapping['value'], 'ln':ln, 'style':style, 'tabindex_key': tabindex_key, 'tabindex_value': tabindex_value, 'tabindex_save_button': tabindex_save_button, 'kb_id':kb_id, 'sortby':sortby} out += row_content #End of tables out += '''</tbody></table> </td> ''' out+= ''' <td width="20%"> </td> </tr> </table> ''' #add script that will put focus on first field of "add mapping" form out += ''' <script type="text/javascript"> self.focus();document.addNewMapping.mapFrom.focus() </script> ''' return indent_text(out) def tmpl_admin_kb_show_attributes(self, ln, kb_id, kb_name, description, sortby): """ Returns the attributes of a knowledge base. @param ln language @param kb_id the id of the kb @param kb_name the name of the kb @param description the description of the kb @param sortby the sorting criteria ('from' or 'to') @return main management console as html """ _ = gettext_set_language(ln) # load the right message language out = ''' <table class="admin_wvar" cellspacing="0"> <tr><th colspan="4" class="adminheaderleft">%(menu)s</th></tr> <tr> <td>0. <small><a href="kb_manage?ln=%(ln)s&sortby=%(sortby)s">%(close)s</a></small> </td> <td>1. <small><a href="kb_show?ln=%(ln)s&kb=%(kb_id)s&sortby=%(sortby)s">%(mappings)s</a></small> </td> <td>2. <small>%(attributes)s</small> </td> <td>3. 
<small><a href="kb_show_dependencies?ln=%(ln)s&kb=%(kb_id)s&sortby=%(sortby)s">%(dependencies)s</a></small> </td> </tr> </table> ''' % {'ln':ln, 'kb_id':kb_id, 'sortby':sortby, 'close': _("Close Editor"), 'menu': _("Menu"), 'mappings': _("Knowledge Base Mappings"), 'attributes':_("Knowledge Base Attributes"), 'dependencies':_("Knowledge Base Dependencies")} out += ''' <form name="updateAttributes" action="kb_update_attributes?ln=%(ln)s&kb=%(kb_id)s&sortby=%(sortby)s" method="post"> <table class="admin_wvar" cellspacing="0"> <tr> ''' % {'ln':ln, 'kb_id':kb_id, 'sortby':sortby} out += ''' <th colspan="2" class="adminheaderleft">%(kb_name)s attributes [<a href="%(weburl)s/admin/bibformat/guide.html#attrsKB">?</a>]</th>''' % {'kb_name': kb_name, 'weburl': weburl} out += ''' </tr> <tr> <td class="admintdright"> <input type="hidden" name="key" value="%(kb_id)s"/> <label for="name">Name</label>: </td> <td><input tabindex="4" name="name" type="text" id="name" size="25" value="%(kb_name)s"/></td> </tr> <tr> <td class="admintdright" valign="top"><label for="description">Description</label>: </td> <td><textarea tabindex="5" name="description" id="description" rows="4" cols="25">%(kb_description)s</textarea> </td> </tr> <tr> <td> </td> <td align="right"><input tabindex="6" class="adminbutton" type="submit" value="Update Base Attributes"/></td> </tr> </table> </form></td>''' % {'kb_name': kb_name, 'kb_description': description, 'kb_id':kb_id} return indent_text(out) def tmpl_admin_kb_show_dependencies(self, ln, kb_id, kb_name, sortby, format_elements): """ Returns the dependencies of a knowledge base. @param ln language @param kb_id the id of the kb @param kb_name the name of the kb @param sortby the sorting criteria ('from' or 'to') @param format_elements the elements that use this kb """ _ = gettext_set_language(ln) # load the right message language out = ''' <table class="admin_wvar" cellspacing="0"> <tr><th colspan="4" class="adminheaderleft">%(menu)s</th></tr> <tr> <td>0. 
<small><a href="kb_manage?ln=%(ln)s&sortby=%(sortby)s">%(close)s</a></small> </td> <td>1. <small><a href="kb_show?ln=%(ln)s&kb=%(kb_id)s&sortby=%(sortby)s">%(mappings)s</a></small> </td> <td>2. <small><a href="kb_show_attributes?ln=%(ln)s&kb=%(kb_id)s&sortby=%(sortby)s">%(attributes)s</a></small> </td> <td>3. <small>%(dependencies)s</small> </td> </tr> </table> <br/>''' % {'ln':ln, 'kb_id':kb_id, 'sortby':sortby, 'close': _("Close Editor"), 'menu' : _("Menu"), 'mappings': _("Knowledge Base Mappings"), 'attributes':_("Knowledge Base Attributes"), 'dependencies':_("Knowledge Base Dependencies")} out += ''' <table width="90%" class="admin_wvar" cellspacing="0"><tr>''' out += ''' <th class="adminheaderleft">Format Elements that use %(name)s*</th> </tr> <tr> <td valign="top"> ''' % {"name": kb_name} if len(format_elements) == 0: out += '<p align="center"><i>This knowledge base is not used in any format elements.</i></p>' for format_element in format_elements: name = format_element['name'] out += '''<a href="format_elements_doc?ln=%(ln)s#%(anchor)s">%(name)s</a><br/>''' % {'name':"bfe_"+name.lower(), 'anchor':name.upper(), 'ln':ln} out += ''' </td> </tr> </table> <b>*Note</b>: Some knowledge base usages might not be shown. Check manually. 
''' return indent_text(out) def tmpl_admin_validate_format(self, ln, errors): """ Prints the errors of the validation of a format (might be any kind of format) @param ln language @param errors a list of tuples (error code, string error message) """ _ = gettext_set_language(ln) # load the right message language out = "" if len(errors) == 0: out += '''<span style="color: rgb(0, 255, 0);" >%s.</span>''' % _('No problem found with format') elif len(errors) == 1: out += '''<span style="color: rgb(255, 0, 0);" >%s:</span><br/>''' % _('An error has been found') else: out += '''<span style="color: rgb(255, 0, 0);" >%s:</span><br/>''' % _('The following errors have been found') for error in errors: out += error + "<br/>" return indent_text(out) def tmpl_admin_dialog_box(self, url, ln, title, message, options): """ Prints a dialog box with given title, message and options @param url the url of the page that must process the result of the dialog box @param ln language @param title the title of the dialog box @param message a formatted message to display inside dialog box @param options a list of string options to display as button to the user """ out = "" out += ''' <div style="text-align:center;"> <fieldset style="display:inline;margin-left:auto;margin-right:auto;"> <legend>%(title)s:</legend> <p>%(message)s</p> <form method="post" action="%(url)s"> ''' % {'title':title, 'message':message, 'url':url} for option in options: out += '''<input type="submit" class="adminbutton" name="chosen_option" value="%(value)s" /> ''' % {'value':option} out += '''</form></fieldset></div>''' return indent_text(out) diff --git a/modules/bibformat/lib/bibformatadminlib.py b/modules/bibformat/lib/bibformatadminlib.py index 5d27e1424..a35876f50 100644 --- a/modules/bibformat/lib/bibformatadminlib.py +++ b/modules/bibformat/lib/bibformatadminlib.py @@ -1,1474 +1,1476 @@ # -*- coding: utf-8 -*- ## $Id$ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005 CERN. 
## ## The CDSware is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## The CDSware is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDSware; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """ Handle requests from the web interface to configure BibFormat. """ __lastupdated__ = """$Date$""" import os import re import stat import time from invenio.config import cdslang, weburl, etcdir from invenio.bibformat_config import templates_path, outputs_path, elements_path, format_template_extension from invenio.urlutils import wash_url_argument from invenio.errorlib import get_msgs_for_code_list from invenio.messages import gettext_set_language, wash_language, language_list_long from invenio.search_engine import perform_request_search, encode_for_xml from invenio import bibformat_dblayer from invenio import bibformat_engine import invenio.template bibformat_templates = invenio.template.load('bibformat') def getnavtrail(previous = '', ln=cdslang): """Get the navtrail""" previous = wash_url_argument(previous, 'str') ln = wash_language(ln) _ = gettext_set_language(ln) navtrail = '''<a class=navtrail href="%s/admin/index.%s.html">%s</a> > <a class=navtrail href="%s/admin/bibformat/bibformatadmin.py/?ln=%s">%s</a> ''' % (weburl, ln, _("Admin Area"), weburl, ln, _("BibFormat Admin")) navtrail = navtrail + previous return navtrail def perform_request_index(ln=cdslang, warnings=None, is_admin=False): """ Returns the main BibFormat admin page. 
This is the only page where the code needs to be cleaned when the migration kit will be removed. #TODO: remove when removing migration_kit @param ln language @param warnings a list of messages to display at top of the page, that prevents writability in etc @param is_admin indicate if user is authorized to use BibFormat @return the main admin page """ if warnings != None and len(warnings) > 0: warnings = get_msgs_for_code_list(warnings, 'warning', ln) warnings = [x[1] for x in warnings] # Get only message, not code return bibformat_templates.tmpl_admin_index(ln, warnings, is_admin) def perform_request_format_templates_management(ln=cdslang, checking=0): """ Returns the main management console for format templates @param ln language @param checking the level of checking (0: basic, 1:extensive (time consuming) ) @return the main page for format templates management """ #Reload in case a format was changed bibformat_engine.clear_caches() #get formats lists of attributes formats = bibformat_engine.get_format_templates(with_attributes=True) formats_attrs = [] for filename in formats: attrs = formats[filename]['attrs'] attrs['filename'] = filename attrs['editable'] = can_write_format_template(filename) path = templates_path + os.sep + filename attrs['last_mod_date'] = time.ctime(os.stat(path)[stat.ST_MTIME]) status = check_format_template(filename, checking) if len(status) > 1 or (len(status)==1 and status[0][0] != 'ERR_BIBFORMAT_CANNOT_READ_TEMPLATE_FILE'): status = ''' <a style="color: rgb(255, 0, 0);" href="%(weburl)s/admin/bibformat/bibformatadmin.py/validate_format?ln=%(ln)s&bft=%(bft)s">Not OK</a> ''' % {'weburl':weburl, 'ln':ln, 'bft':filename} else: status = '<span style="color: rgb(0, 255, 0);">OK</span>' attrs['status'] = status formats_attrs.append(attrs) def sort_by_attr(seq): intermed = [ (x['name'], i, x) for i, x in enumerate(seq)] intermed.sort() return [x[-1] for x in intermed] sorted_format_templates = sort_by_attr(formats_attrs) return 
bibformat_templates.tmpl_admin_format_templates_management(ln, sorted_format_templates) def perform_request_format_template_show(bft, ln=cdslang, code=None, ln_for_preview=cdslang, pattern_for_preview="", content_type_for_preview="text/html"): """ Returns the editor for format templates. @param ln language @param bft the template to edit @param code, the code being edited @param ln_for_preview the language for the preview (for bfo) @param pattern_for_preview the search pattern to be used for the preview (for bfo) @return the main page for formats management """ format_template = bibformat_engine.get_format_template(filename=bft, with_attributes=True) #Either use code being edited, or the original code inside template if code == None: code = format_template['code']#.replace('%%','%') #.replace("<","<").replace(">","/>").replace("&","&") #Build a default pattern if it is empty if pattern_for_preview == "": recIDs = perform_request_search() if len(recIDs) > 0: recID = recIDs[0] pattern_for_preview = "recid:%s" % recID editable = can_write_format_template(bft) #Look for all existing content_types content_types = bibformat_dblayer.get_existing_content_types() return bibformat_templates.tmpl_admin_format_template_show(ln, format_template['attrs']['name'], format_template['attrs']['description'], code, bft, ln_for_preview=ln_for_preview, pattern_for_preview=pattern_for_preview, editable=editable, content_type_for_preview=content_type_for_preview, content_types=content_types) def perform_request_format_template_show_dependencies(bft, ln=cdslang): """ Show the dependencies (on elements) of the given format. 
@param ln language @param bft the filename of the template to show """ format_template = bibformat_engine.get_format_template(filename=bft, with_attributes=True) name = format_template['attrs']['name'] output_formats = get_outputs_that_use_template(bft) format_elements = get_elements_used_by_template(bft) tags = [] for output_format in output_formats: for tag in output_format['tags']: tags.append(tag) for format_element in format_elements: for tag in format_element['tags']: tags.append(tag) tags.sort() return bibformat_templates.tmpl_admin_format_template_show_dependencies(ln, name, bft, output_formats, format_elements, tags) def perform_request_format_template_show_attributes(bft, ln=cdslang): """ Page for template name and description attributes edition. @param ln language @param bft the template to edit @return the main page for format templates attributes edition """ format_template = bibformat_engine.get_format_template(filename=bft, with_attributes=True) name = format_template['attrs']['name'] description = format_template['attrs']['description'] editable = can_write_format_template(bft) return bibformat_templates.tmpl_admin_format_template_show_attributes(ln, name, description, bft, editable) def perform_request_format_template_show_short_doc(ln=cdslang, search_doc_pattern=""): """ Returns the format elements documentation to be included inside format template editor. 
Keep only elements that have 'search_doc_pattern' text inside description, if the pattern is not empty @param ln language @param search_doc_pattern a search pattern that specifies which elements to display @return a brief version of the format element documentation """ #get format elements lists of attributes elements = bibformat_engine.get_format_elements(with_built_in_params=True) keys = elements.keys() keys.sort() elements = map(elements.get, keys) def filter_elem(element): """Keep element if its string representation contains all keywords of search_doc_pattern, and if its name does not start with a number (to remove 'garbage' from elements in tags table)""" if element['type'] != 'python' and \ element['attrs']['name'][0] in ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']: return False text = str(element).upper() #Basic text representation if search_doc_pattern != "": for word in search_doc_pattern.split(): if word.upper() != "AND" and text.find(word.upper()) == -1: return False return True elements = filter(filter_elem, elements) return bibformat_templates.tmpl_admin_format_template_show_short_doc(ln, elements) def perform_request_format_elements_documentation(ln=cdslang): """ Returns the main management console for format elements. Includes list of format elements and associated administration tools. 
@param ln language @return the main page for format elements management """ #get format elements lists of attributes elements = bibformat_engine.get_format_elements(with_built_in_params=True) keys = elements.keys() keys.sort() elements = map(elements.get, keys) #Remove all elements found in table and that begin with a number (to remove 'garbage') filtered_elements = [element for element in elements if element['type'] == 'python' or \ element['attrs']['name'][0] not in ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']] return bibformat_templates.tmpl_admin_format_elements_documentation(ln, filtered_elements) def perform_request_format_element_show_dependencies(bfe, ln=cdslang): """ Show the dependencies of the given format element. @param ln language @param bfe the filename of the format element to show """ format_templates = get_templates_that_use_element(bfe) tags = get_tags_used_by_element(bfe) return bibformat_templates.tmpl_admin_format_element_show_dependencies(ln, bfe, format_templates, tags) def perform_request_format_element_test(bfe, ln=cdslang, param_values=None, uid=None): """ Test the given format element and show the result. 'param_values' is the list of values to pass to 'format' function of the element as parameters, in the order ... If param_values is None, this means that they have not been defined by the user yet. 
@param ln language @param bfe the name of the format element to show @param param_values the list of parameters to pass to element format function @param uid the user id for this request """ _ = gettext_set_language(ln) format_element = bibformat_engine.get_format_element(bfe, with_built_in_params=True) #Load parameter names and description ## param_names = [] param_descriptions = [] #First value is a search pattern to choose the record param_names.append(_("Test with record:")) # Caution: keep in sync with same text below param_descriptions.append(_("Enter a search query here.")) #Parameters defined in this element for param in format_element['attrs']['params']: param_names.append(param['name']) param_descriptions.append(param['description']) #Parameters common to all elements of a kind for param in format_element['attrs']['builtin_params']: param_names.append(param['name']) param_descriptions.append(param['description']) #Load parameters values ## if param_values == None: #First time the page is loaded param_values = [] #Propose an existing record id by default recIDs = perform_request_search() if len(recIDs) > 0: recID = recIDs[0] param_values.append("recid:%s" % recID) else: #No record available: keep values aligned with param_names param_values.append("") #Default values defined in this element for param in format_element['attrs']['params']: param_values.append(param['default']) #Parameters common to all elements of a kind for param in format_element['attrs']['builtin_params']: param_values.append(param['default']) #Execute element with parameters ## params = dict(zip(param_names, param_values)) #Find a record corresponding to search pattern search_pattern = params[_("Test with record:")] # Caution keep in sync with same text above and below recIDs = perform_request_search(p=search_pattern) del params[_("Test with record:")] # Caution keep in sync with same text above if len(recIDs) > 0: bfo = bibformat_engine.BibFormatObject(recIDs[0], ln, search_pattern, None, uid) (result, errors) = bibformat_engine.eval_format_element(format_element, bfo, params) 
else: result = get_msgs_for_code_list([("ERR_BIBFORMAT_NO_RECORD_FOUND_FOR_PATTERN", search_pattern)], file='error', ln=cdslang)[0][1] return bibformat_templates.tmpl_admin_format_element_test(ln, bfe, format_element['attrs']['description'], param_names, param_values, param_descriptions, result) def perform_request_output_formats_management(ln=cdslang, sortby="code"): """ Returns the main management console for output formats. Includes list of output formats and associated administration tools. @param ln language @param sortby the sorting criteria (can be 'code' or 'name') @return the main page for output formats management """ #Reload in case a format was changed bibformat_engine.clear_caches() #get output formats lists of attributes output_formats_list = bibformat_engine.get_output_formats(with_attributes=True) output_formats = {} for filename in output_formats_list: output_format = output_formats_list[filename] code = output_format['attrs']['code'] path = outputs_path + os.sep + filename output_format['editable'] = can_write_output_format(code) output_format['last_mod_date'] = time.ctime(os.stat(path)[stat.ST_MTIME]) + #Validate the output format status = check_output_format(code) + # If there is an error but the error is just 'format is not writable', do not display as error if len(status) > 1 or (len(status)==1 and status[0][0] != 'ERR_BIBFORMAT_CANNOT_WRITE_OUTPUT_FILE'): status = ''' <a style="color: rgb(255, 0, 0);" href="%(weburl)s/admin/bibformat/bibformatadmin.py/validate_format?ln=%(ln)s&bfo=%(bfo)s">Not OK</a> ''' % {'weburl':weburl, 'ln':ln, 'bfo':code} else: status = '<span style="color: rgb(0, 255, 0);">OK</span>' output_format['status'] = status output_formats[filename] = output_format #sort according to code or name, inspired from Python Cookbook def get_attr(dic, attr): if attr == "code": return dic['attrs']['code'] else: return dic['attrs']['names']['generic'] def sort_by_attr(seq, attr): intermed = [ (get_attr(x, attr), i, x) for i, x in 
enumerate(seq)] intermed.sort() return [x[-1] for x in intermed] if sortby != "code" and sortby != "name": sortby = "code" sorted_output_formats = sort_by_attr(output_formats.values(), sortby) return bibformat_templates.tmpl_admin_output_formats_management(ln, sorted_output_formats) def perform_request_output_format_show(bfo, ln=cdslang, r_fld=[], r_val=[], r_tpl=[], default="", r_upd="", args={}): """ Returns the editing tools for a given output format. The page either shows the output format from file, or from user's POST session, as we want to let users edit the rules without saving. Policy is: r_fld, r_val, r_tpl are lists of attributes of the rules. If they are empty, load from file. Else use POST. The i th value of each list is one of the attributes of rule i. Rule i is the i th rule in order of evaluation. All lists have the same number of items. r_upd contains an action that has to be performed on rules. It can be composed of a number (i, the rule we want to modify) and an operator: "save" to save the rules, "add" or "del". syntax: operator [number] E.g. r_upd = _("Save Changes") saves all rules (no int should be specified). E.g. r_upd = _("Add New Rule") adds a rule (no int should be specified). E.g. r_upd = _("Remove Rule") + " 5" deletes rule at position 5. The number is used only for operation delete. An action can also be in **args. We must look there for a string starting with '(+|-) [number]' to increase (+) or decrease (-) the priority of a rule given by its index (number). For example "+ 5" increases the priority of rule 5 (puts it at fourth position). The string in **args can be followed by some garbage that looks like .x or .y, as this is returned as the coordinate of the click on the <input type="image">. We HAVE to use args and reason on its keys, because for <input> of type image, Internet Explorer does not return the value of the tag, but only the name. 
Action is executed only if we are working from user's POST session (means we must have loaded the output format first, which is totally normal and expected behaviour) IMPORTANT: we display rules evaluation index starting at 1 in interface, but we start internally at 0 @param ln language @param bfo the filename of the output format to show @param r_fld the list of 'field' attribute for each rule @param r_val the list of 'value' attribute for each rule @param r_tpl the list of 'template' attribute for each rule @param default the default format template used by this output format @param r_upd the rule that we want to increase/decrease in order of evaluation """ output_format = bibformat_engine.get_output_format(bfo, with_attributes=True) format_templates = bibformat_engine.get_format_templates(with_attributes=True) name = output_format['attrs']['names']['generic'] rules = [] debug = "" if len(r_fld) == 0 and r_upd=="": #Retrieve rules from file rules = output_format['rules'] default = output_format['default'] else: #Retrieve rules from given lists #Transform a single rule (not considered as a list with length #1 by the templating system) into a list if not isinstance(r_fld, list): r_fld = [r_fld] r_val = [r_val] r_tpl = [r_tpl] for i in range(len(r_fld)): rule = {'field': r_fld[i], 'value': r_val[i], 'template': r_tpl[i]} rules.append(rule) #Execute action _ = gettext_set_language(ln) if r_upd.startswith(_("Remove Rule")): #Remove rule index = int(r_upd.split(" ")[-1]) -1 del rules[index] elif r_upd.startswith(_("Save Changes")): #Save update_output_format_rules(bfo, rules, default) elif r_upd.startswith(_("Add New Rule")): #Add new rule rule = {'field': "", 'value': "", 'template': ""} rules.append(rule) else: #Get the action in 'args' #The action must be constructed from string of the kind: # + 5 or - 4 or + 5.x or -4.y for button_val in args.keys():#for all elements of form not handled yet action = button_val.split(" ") if action[0] == '-' or action[0] == '+': 
index = int(action[1].split(".")[0]) -1 if action[0] == '-': #Decrease priority rule = rules[index] del rules[index] rules.insert(index + 1, rule) #debug = 'Decrease rule '+ str(index) break elif action[0] == '+': #Increase priority rule = rules[index] del rules[index] rules.insert(index - 1, rule) #debug = 'Increase rule ' + str(index) break editable = can_write_output_format(bfo) return bibformat_templates.tmpl_admin_output_format_show(ln, bfo, name, rules, default, format_templates, editable) def perform_request_output_format_show_dependencies(bfo, ln=cdslang): """ Show the dependencies of the given format. @param ln language @param bfo the filename of the output format to show """ output_format = bibformat_engine.get_output_format(code=bfo, with_attributes=True) name = output_format['attrs']['names']['generic'] format_templates = get_templates_used_by_output(bfo) return bibformat_templates.tmpl_admin_output_format_show_dependencies(ln, name, bfo, format_templates) def perform_request_output_format_show_attributes(bfo, ln=cdslang): """ Page for output format names and description attributes edition. @param ln language @param bfo filename of output format to edit @return the main page for output format attributes edition """ output_format = bibformat_engine.get_output_format(code=bfo, with_attributes=True) name = output_format['attrs']['names']['generic'] description = output_format['attrs']['description'] content_type = output_format['attrs']['content_type'] #Get translated names. Limit to long names now. 
#Translation are given in order of languages in language_list_long() names_trans = [] for lang in language_list_long(): name_trans = output_format['attrs']['names']['ln'].get(lang[0], "") names_trans.append({'lang':lang[1], 'trans':name_trans}) editable = can_write_output_format(bfo) return bibformat_templates.tmpl_admin_output_format_show_attributes(ln, name, description, content_type, bfo, names_trans, editable) def perform_request_knowledge_bases_management(ln=cdslang): """ Returns the main page for knowledge bases management. @param ln language @return the main page for knowledge bases management """ kbs = bibformat_dblayer.get_kbs() return bibformat_templates.tmpl_admin_kbs_management(ln, kbs) def perform_request_knowledge_base_show(kb_id, ln=cdslang, sortby="to"): """ Show the content of a knowledge base @param ln language @param kb a knowledge base id @param sortby the sorting criteria ('from' or 'to') @return the content of the given knowledge base """ name = bibformat_dblayer.get_kb_name(kb_id) mappings = bibformat_dblayer.get_kb_mappings(name, sortby) return bibformat_templates.tmpl_admin_kb_show(ln, kb_id, name, mappings, sortby) def perform_request_knowledge_base_show_attributes(kb_id, ln=cdslang, sortby="to"): """ Show the attributes of a knowledge base @param ln language @param kb a knowledge base id @param sortby the sorting criteria ('from' or 'to') @return the content of the given knowledge base """ name = bibformat_dblayer.get_kb_name(kb_id) description = bibformat_dblayer.get_kb_description(name) return bibformat_templates.tmpl_admin_kb_show_attributes(ln, kb_id, name, description, sortby) def perform_request_knowledge_base_show_dependencies(kb_id, ln=cdslang, sortby="to"): """ Show the dependencies of a kb @param ln language @param kb a knowledge base id @param sortby the sorting criteria ('from' or 'to') @return the dependencies of the given knowledge base """ name = bibformat_dblayer.get_kb_name(kb_id) format_elements = 
get_elements_that_use_kb(name) return bibformat_templates.tmpl_admin_kb_show_dependencies(ln, kb_id, name, sortby, format_elements) def add_format_template(): """ Adds a new format template (mainly create file with unique name) @return the filename of the created format """ (filename, name) = bibformat_engine.get_fresh_format_template_filename("Untitled") out = '<name>%(name)s</name><description></description>' % {'name':name} path = templates_path + os.sep + filename format = open(path, 'w') format.write(out) format.close() return filename def delete_format_template(filename): """ Delete a format template given by its filename If format template is not writable, do not remove @param filename the format template filename """ if not can_write_format_template(filename): return path = templates_path + os.sep + filename os.remove(path) bibformat_engine.clear_caches() def update_format_template_code(filename, code=""): """ Saves code inside template given by filename """ format_template = bibformat_engine.get_format_template_attrs(filename) name = format_template['name'] description = format_template['description'] out = ''' <name>%(name)s</name> <description>%(description)s</description> %(code)s ''' % {'name':name, 'description':description, 'code':code} path = templates_path + os.sep + filename format = open(path, 'w') format.write(out) format.close() bibformat_engine.clear_caches() def update_format_template_attributes(filename, name="", description=""): """ Saves name and description inside template given by filename. The filename must change according to name, and every output format that references filename must be updated. If name already exists, use fresh filename (we never overwrite other templates) and remove old one. 
    @return the filename of the modified format
    """
    format_template = bibformat_engine.get_format_template(filename, with_attributes=True)
    code = format_template['code']
    if format_template['attrs']['name'] != name:
        #name has changed, so update filename
        old_filename = filename
        old_path = templates_path + os.sep + old_filename
        #Remove old one
        os.remove(old_path)

        (filename, name) = bibformat_engine.get_fresh_format_template_filename(name)

        #Change output formats that call this template
        output_formats = bibformat_engine.get_output_formats()

        for output_format_filename in output_formats:
            if can_read_output_format(output_format_filename) and can_write_output_format(output_format_filename):
                output_path = outputs_path + os.sep + output_format_filename
                format = open(output_path, 'r')
                output_text = format.read()
                format.close()
                output_pattern = re.compile("---(\s)*" + old_filename, re.IGNORECASE)
                mod_output_text = output_pattern.sub("--- " + filename, output_text)
                if output_text != mod_output_text:
                    format = open(output_path, 'w')
                    format.write(mod_output_text)
                    format.close()

    #Write updated format template
    out = '''<name>%(name)s</name><description>%(description)s</description>%(code)s''' % {'name':name, 'description':description, 'code':code}

    path = templates_path + os.sep + filename
    format = open(path, 'w')
    format.write(out)
    format.close()

    bibformat_engine.clear_caches()

    return filename

def add_output_format():
    """
    Adds a new output format (mainly create file with unique name)

    @return the code of the created format
    """
    (filename, code) = bibformat_engine.get_fresh_output_format_filename("UNTLD")

    #Add entry in database
    bibformat_dblayer.add_output_format(code)
    bibformat_dblayer.set_output_format_name(code, "Untitled", lang="generic")
    bibformat_dblayer.set_output_format_content_type(code, "text/html")

    #Add file
    out = ""
    path = outputs_path + os.sep + filename
    format = open(path, 'w')
    format.write(out)
    format.close()

    return code

def delete_output_format(code):
    """
    Delete a format template
    given by its code

    If file is not writable, don't remove

    @param code the 6 letters code of the output format to remove
    """
    if not can_write_output_format(code):
        return

    #Remove entry from database
    bibformat_dblayer.remove_output_format(code)

    #Remove file
    filename = bibformat_engine.resolve_output_format_filename(code)
    path = outputs_path + os.sep + filename
    os.remove(path)

    bibformat_engine.clear_caches()

def update_output_format_rules(code, rules=[], default=""):
    """
    Saves rules inside output format given by code
    """
    #Generate output format syntax
    #Try to group rules by field
    previous_field = ""
    out = ""
    for rule in rules:
        field = rule["field"]
        value = rule["value"]
        template = rule["template"]
        if previous_field != field:
            out += "tag %s:\n" % field

        out += "%(value)s --- %(template)s\n" % {'value':value, 'template':template}
        previous_field = field

    out += "default: %s" % default
    filename = bibformat_engine.resolve_output_format_filename(code)
    path = outputs_path + os.sep + filename
    format = open(path, 'w')
    format.write(out)
    format.close()

    bibformat_engine.clear_caches()

def update_output_format_attributes(code, name="", description="", new_code="", content_type="", names_trans=[]):
    """
    Saves name and description inside output format given by filename.

    If new_code already exists, use fresh code (we never overwrite other output).

    @param description the new description
    @param name the new name
    @param new_code the new short code (== new bfo) of the output format
    @param code the code of the output format to update
    @param names_trans the translations in the same order as the languages from language_list_long()
    @param content_type the new content_type of the output format
    @return the code of the modified format
    """
    bibformat_dblayer.set_output_format_description(code, description)
    bibformat_dblayer.set_output_format_content_type(code, content_type)
    bibformat_dblayer.set_output_format_name(code, name, lang="generic")

    i = 0
    for lang in language_list_long():
        if names_trans[i] != "":
            bibformat_dblayer.set_output_format_name(code, names_trans[i], lang[0])
        i += 1

    new_code = new_code.upper()
    if code != new_code:
        #If code has changed, we must update filename with a new unique code
        old_filename = bibformat_engine.resolve_output_format_filename(code)
        old_path = outputs_path + os.sep + old_filename
        (new_filename, new_code) = bibformat_engine.get_fresh_output_format_filename(new_code)
        new_path = outputs_path + os.sep + new_filename
        os.rename(old_path, new_path)
        bibformat_dblayer.change_output_format_code(code, new_code)

    bibformat_engine.clear_caches()

    return new_code

def add_kb_mapping(kb_name, key, value=""):
    """
    Adds a new mapping to given kb

    @param kb_name the name of the kb where to insert the new value
    @param key the key of the mapping
    @param value the value of the mapping
    """
    bibformat_dblayer.add_kb_mapping(kb_name, key, value)

def remove_kb_mapping(kb_name, key):
    """
    Delete an existing kb mapping in kb

    @param kb_name the name of the kb from which to remove the mapping
    @param key the key of the mapping
    """
    bibformat_dblayer.remove_kb_mapping(kb_name, key)

def update_kb_mapping(kb_name, old_key, key, value):
    """
    Update an existing kb mapping with key old_key with a new key and value

    @param kb_name the name of the kb that holds the mapping
    @param old_key the current key of the mapping in the kb
    @param key the new key
    of the mapping
    @param value the new value of the mapping
    """
    bibformat_dblayer.update_kb_mapping(kb_name, old_key, key, value)

def get_kb_name(kb_id):
    """
    Returns the name of the kb given by id
    """
    return bibformat_dblayer.get_kb_name(kb_id)

def update_kb_attributes(kb_name, new_name, new_description):
    """
    Updates given kb_name with a new name and new description

    @param kb_name the name of the kb to update
    @param new_name the new name for the kb
    @param new_description the new description for the kb
    """
    bibformat_dblayer.update_kb(kb_name, new_name, new_description)

def add_kb(kb_name="Untitled"):
    """
    Adds a new kb in database, and returns its id

    The name of the kb will be 'Untitled#' such that it is unique.

    @param kb_name the name of the kb
    @return the id of the newly created kb
    """
    name = kb_name
    i = 1
    while bibformat_dblayer.kb_exists(name):
        name = kb_name + " " + str(i)
        i += 1

    kb_id = bibformat_dblayer.add_kb(name, "")

    return kb_id

def delete_kb(kb_name):
    """
    Deletes given kb from database
    """
    bibformat_dblayer.delete_kb(kb_name)

def can_read_format_template(filename):
    """
    Returns True if we have read permission on given format template, else False
    """
    path = "%s%s%s" % (templates_path, os.sep, filename)
    return os.access(path, os.R_OK)

def can_read_output_format(bfo):
    """
    Returns True if we have read permission on given output format, else False
    """
    filename = bibformat_engine.resolve_output_format_filename(bfo)
    path = "%s%s%s" % (outputs_path, os.sep, filename)
    return os.access(path, os.R_OK)

def can_read_format_element(name):
    """
    Returns True if we have read permission on given format element, else False
    """
    filename = bibformat_engine.resolve_format_element_filename(name)
    path = "%s%s%s" % (elements_path, os.sep, filename)
    return os.access(path, os.R_OK)

def can_write_format_template(bft):
    """
    Returns True if we have write permission on given format template, else False
    """
    if not can_read_format_template(bft):
        return False

    path = "%s%s%s" % (templates_path, os.sep, bft)
    return os.access(path, os.W_OK)

def can_write_output_format(bfo):
    """
    Returns True if we have write permission on given output format, else False
    """
    if not can_read_output_format(bfo):
        return False

    filename = bibformat_engine.resolve_output_format_filename(bfo)
    path = "%s%s%s" % (outputs_path, os.sep, filename)
    return os.access(path, os.W_OK)

def can_write_etc_bibformat_dir():
    """
    Returns True if we can write in etc/bibformat dir.
    """
    path = "%s%sbibformat" % (etcdir, os.sep)
    return os.access(path, os.W_OK)

def get_outputs_that_use_template(filename):
    """
    Returns a list of output formats that call the given format template.
    The returned output formats also give their dependencies on tags.

    We don't return the complete output formats but some reference to
    them (filename + names)

    [ {'filename':"filename_1.bfo"
       'names': {'en':"a name", 'fr': "un nom", 'generic':"a name"}
       'tags': ['710__a', '920__']
      },
      ...
    ]

    Returns output formats references sorted by (generic) name

    @param filename a format template filename
    """
    output_formats_list = {}
    tags = []
    output_formats = bibformat_engine.get_output_formats(with_attributes=True)
    for output_format in output_formats:
        name = output_formats[output_format]['attrs']['names']['generic']
        #First look at default template, and add it if necessary
        if output_formats[output_format]['default'] == filename:
            output_formats_list[name] = {'filename':output_format,
                                         'names':output_formats[output_format]['attrs']['names'],
                                         'tags':[]}

        #Second look at each rule
        found = False
        for rule in output_formats[output_format]['rules']:
            if rule['template'] == filename:
                found = True
                tags.append(rule['field']) #Also build dependencies on tags

        #Finally add dependency on template from rule (overwrite default dependency,
        #which is weaker in term of tag)
        if found == True:
            output_formats_list[name] = {'filename':output_format,
                                         'names':output_formats[output_format]['attrs']['names'],
                                         'tags':tags}

    keys = output_formats_list.keys()
    keys.sort()
    return map(output_formats_list.get, keys)

def get_elements_used_by_template(filename):
    """
    Returns a list of format elements that are called by the given format template.
    The returned elements also give their dependencies on tags

    The list is returned sorted by name

    [ {'filename':"filename_1.py"
       'name':"filename_1"
       'tags': ['710__a', '920__']
      },
      ...
    ]

    Returns elements sorted by name

    @param filename a format template filename
    """
    format_elements = {}
    format_template = bibformat_engine.get_format_template(filename=filename, with_attributes=True)
    code = format_template['code']
    format_elements_iter = bibformat_engine.pattern_tag.finditer(code)
    for result in format_elements_iter:
        function_name = result.group("function_name").lower()
        if function_name != None and not format_elements.has_key(function_name):
            filename = bibformat_engine.resolve_format_element_filename("BFE_"+function_name)
            if filename != None:
                tags = get_tags_used_by_element(filename)
                format_elements[function_name] = {'name':function_name.lower(),
                                                  'filename':filename,
                                                  'tags':tags}

    keys = format_elements.keys()
    keys.sort()
    return map(format_elements.get, keys)

# Format Elements Dependencies ##

def get_tags_used_by_element(filename):
    """
    Returns a list of tags used by given format element

    APPROXIMATE RESULTS: the tags are retrieved from calls to the field(),
    fields() and control_field() functions. If they are computed, or saved
    in a variable somewhere else, they are not retrieved

    @TODO: There is room for improvements. For example catch
    call to BibRecord functions, or use of <BFE_FIELD tag=""/>

    Returns tags sorted by value

    @param filename a format element filename
    """
    tags = {}

    format_element = bibformat_engine.get_format_element(filename)
    if format_element == None:
        return []
    elif format_element['type']=="field":
        tags = format_element['attrs']['tags']
        return tags

    filename = bibformat_engine.resolve_format_element_filename(filename)
    path = elements_path + os.sep + filename
    format = open(path, 'r')
    code = format.read()
    format.close()
    tags_pattern = re.compile('''
    (field|fields|control_field)\s*  #Function call
    \(\s*                            #Opening parenthesis
    [\'"]+                           #Single or double quote
    (?P<tag>.+?)
    #Tag
    [\'"]+\s*                        #Single or double quote
    \)                               #Closing parenthesis
    ''', re.VERBOSE | re.MULTILINE)

    tags_iter = tags_pattern.finditer(code)
    for result in tags_iter:
        tags[result.group("tag")] = result.group("tag")

    return tags.values()

def get_templates_that_use_element(name):
    """
    Returns a list of format templates that call the given format element.
    The returned format templates also give their dependencies on tags.

    [ {'filename':"filename_1.bft"
       'name': "a name"
       'tags': ['710__a', '920__']
      },
      ...
    ]

    Returns templates sorted by name

    @param name a format element name
    """
    format_templates = {}
    tags = []
    files = os.listdir(templates_path) #Retrieve all templates
    for file in files:
        if file.endswith(format_template_extension):
            format_elements = get_elements_used_by_template(file) #Look for elements used in template
            format_elements = map(lambda x: x['name'].lower(), format_elements)
            try: #Look for element
                format_elements.index(name.lower()) #If not found, get out of "try" statement

                format_template = bibformat_engine.get_format_template(filename=file, with_attributes=True)
                template_name = format_template['attrs']['name']
                format_templates[template_name] = {'name':template_name, 'filename':file}
            except:
                print name+" not found in "+str(format_elements)
                pass

    keys = format_templates.keys()
    keys.sort()
    return map(format_templates.get, keys)

# Output Formats Dependencies ##

def get_templates_used_by_output(code):
    """
    Returns a list of templates used inside an output format given by its code
    The returned format templates also give their dependencies on elements and tags

    [ {'filename':"filename_1.bft"
       'name': "a name"
       'elements': [{'filename':"filename_1.py", 'name':"filename_1", 'tags': ['710__a', '920__']
                    }, ...]
      },
      ...
    ]

    Returns templates sorted by name
    """
    format_templates = {}
    output_format = bibformat_engine.get_output_format(code, with_attributes=True)

    filenames = map(lambda x: x['template'], output_format['rules'])
    if output_format['default'] != "":
        filenames.append(output_format['default'])

    for filename in filenames:
        template = bibformat_engine.get_format_template(filename, with_attributes=True)
        name = template['attrs']['name']
        elements = get_elements_used_by_template(filename)
        format_templates[name] = {'name':name, 'filename':filename, 'elements':elements}

    keys = format_templates.keys()
    keys.sort()
    return map(format_templates.get, keys)

# Knowledge Bases Dependencies ##

def get_elements_that_use_kb(name):
    """
    Returns a list of elements that call given kb

    [ {'filename':"filename_1.py"
       'name': "a name"
      },
      ...
    ]

    Returns elements sorted by name
    """
    format_elements = {}
    files = os.listdir(elements_path) #Retrieve all elements in files
    for filename in files:
        if filename.endswith(".py"):
            path = elements_path + os.sep + filename
            format = open(path, 'r')
            code = format.read()
            format.close()
            #Search for use of kb inside code
            kb_pattern = re.compile('''
            (bfo.kb)\s*                #Function call
            \(\s*                      #Opening parenthesis
            [\'"]+                     #Single or double quote
            (?P<kb>%s)                 #kb
            [\'"]+\s*                  #Single or double quote
            ,                          #comma
            ''' % name, re.VERBOSE | re.MULTILINE | re.IGNORECASE)

            result = kb_pattern.search(code)
            if result != None:
                #Use a separate variable so that the kb name searched for
                #in the next files is not overwritten
                element_name = ("".join(filename.split(".")[:-1])).lower()
                if element_name.startswith("bfe_"):
                    element_name = element_name[4:]
                format_elements[element_name] = {'filename':filename, 'name': element_name}

    keys = format_elements.keys()
    keys.sort()
    return map(format_elements.get, keys)

# Validation tools ##

def perform_request_format_validate(ln=cdslang, bfo=None, bft=None, bfe=None):
    """
    Returns a page showing the status of an output format or format
    template or format element. This page is called from output
    formats management page or format template management page or
    format elements documentation.

    The page only shows the status of one of the formats, depending on
    the specified one. If multiple are specified, shows the first one.

    @param ln language
    @param bfo an output format 6 chars code
    @param bft a format template filename
    @param bfe a format element name
    """
    if bfo != None:
        errors = check_output_format(bfo)
        messages = get_msgs_for_code_list(code_list = errors, ln=ln)
    elif bft != None:
        errors = check_format_template(bft, checking=1)
        messages = get_msgs_for_code_list(code_list = errors, ln=ln)
    elif bfe != None:
        errors = check_format_element(bfe)
        messages = get_msgs_for_code_list(code_list = errors, ln=ln)

    if messages == None:
        messages = []

    messages = map(lambda x: encode_for_xml(x[1]), messages)

    return bibformat_templates.tmpl_admin_validate_format(ln, messages)

def check_output_format(code):
    """
    Returns the list of errors in the output format given by code

    The errors are the formatted errors defined in bibformat_config.py file.

    @param code the 6 chars code of the output format to check
    @return a list of errors
    """
    errors = []
    filename = bibformat_engine.resolve_output_format_filename(code)
    if can_read_output_format(code):
        path = outputs_path + os.sep + filename
        format = open(path)
        current_tag = ''
        i = 0
        for line in format:
            i += 1
            if line.strip() == "":
                #ignore blank lines
                continue
            clean_line = line.rstrip("\n\r ") #remove spaces and eol
            if line.strip().endswith(":") or (line.strip().lower().startswith("tag") and line.find('---') == -1):
                #check tag
                if not clean_line.endswith(":"):
                    #colon is missing at the end of line
                    errors.append(("ERR_BIBFORMAT_OUTPUT_RULE_FIELD_COL", line, i))
                if not clean_line.lower().startswith("tag"):
                    #tag keyword is missing
                    errors.append(("ERR_BIBFORMAT_OUTPUT_TAG_MISSING", line, i))
                elif not clean_line.startswith("tag"):
                    #tag was not lower case
                    errors.append(("ERR_BIBFORMAT_OUTPUT_WRONG_TAG_CASE", line, i))

                clean_line = clean_line.rstrip(": ") #remove : and spaces at the end of line

                current_tag = "".join(clean_line.split()[1:]).strip()
                #the tag starts at second position
                if len(clean_line.split()) > 2: #We should only have 'tag' keyword and tag
                    errors.append(("ERR_BIBFORMAT_INVALID_OUTPUT_RULE_FIELD", i))
                else:
                    if len(check_tag(current_tag)) > 0:
                        #Invalid tag
                        errors.append(("ERR_BIBFORMAT_INVALID_OUTPUT_RULE_FIELD_tag", current_tag, i))
                    if not clean_line.startswith("tag"):
                        errors.append(("ERR_BIBFORMAT_INVALID_OUTPUT_RULE_FIELD", i))

            elif line.find('---') != -1:
                #check condition
                if current_tag == "":
                    errors.append(("ERR_BIBFORMAT_OUTPUT_CONDITION_OUTSIDE_FIELD", line, i))

                words = line.split('---')
                if len(words) != 2:
                    errors.append(("ERR_BIBFORMAT_INVALID_OUTPUT_CONDITION", line, i))
                template = words[-1].strip()
                path = templates_path + os.sep + template
                if not os.path.exists(path):
                    errors.append(("ERR_BIBFORMAT_WRONG_OUTPUT_RULE_TEMPLATE_REF", template, i))

            elif line.find(':') != -1 or (line.strip().lower().startswith("default") and line.find('---') == -1):
                #check default template
                clean_line = line.strip()
                if line.find(':') == -1:
                    #colon is missing after default
                    errors.append(("ERR_BIBFORMAT_OUTPUT_RULE_DEFAULT_COL", line, i))
                if not clean_line.lower().startswith("default"):
                    #default keyword is missing
                    errors.append(("ERR_BIBFORMAT_OUTPUT_DEFAULT_MISSING", line, i))
                elif not clean_line.startswith("default"):
                    #default was not lower case
                    errors.append(("ERR_BIBFORMAT_OUTPUT_WRONG_DEFAULT_CASE", line, i))
                default = "".join(line.split(':')[1]).strip()
                path = templates_path + os.sep + default
                if not os.path.exists(path):
                    errors.append(("ERR_BIBFORMAT_WRONG_OUTPUT_RULE_TEMPLATE_REF", default, i))

            else:
                #check others
                errors.append(("ERR_BIBFORMAT_WRONG_OUTPUT_LINE", line, i))
    else:
        errors.append(("ERR_BIBFORMAT_CANNOT_READ_OUTPUT_FILE", filename, ""))

    return errors

def check_format_template(filename, checking=0):
    """
    Returns the list of errors in the format template given by its filename

    The errors are the formatted errors defined in bibformat_config.py file.

    @param filename the filename of the format template to check
    @param checking the level of checking (0:basic, >=1 extensive (time-consuming))
    @return a list of errors
    """
    errors = []
    if can_read_format_template(filename):#Can template be read?
        #format_template = bibformat_engine.get_format_template(filename, with_attributes=True)
        format = open("%s%s%s" % (templates_path, os.sep, filename))
        code = format.read()
        format.close()
        #Look for name
        match = bibformat_engine.pattern_format_template_name.search(code)
        if match == None:#Is tag <name> defined in template?
            errors.append(("ERR_BIBFORMAT_TEMPLATE_HAS_NO_NAME", filename))

        #Look for description
        match = bibformat_engine.pattern_format_template_desc.search(code)
        if match == None:#Is tag <description> defined in template?
            errors.append(("ERR_BIBFORMAT_TEMPLATE_HAS_NO_DESCRIPTION", filename))

        format_template = bibformat_engine.get_format_template(filename, with_attributes=False)
        code = format_template['code']
        #Look for calls to format elements
        #Check existence of elements and attributes used in call
        elements_call = bibformat_engine.pattern_tag.finditer(code)
        for element_match in elements_call:
            element_name = element_match.group("function_name")
            filename = bibformat_engine.resolve_format_element_filename(element_name)
            if filename == None and not bibformat_dblayer.tag_exists_for_name(element_name): #Is element defined?
                errors.append(("ERR_BIBFORMAT_TEMPLATE_CALLS_UNDEFINED_ELEM", filename, element_name))
            else:
                format_element = bibformat_engine.get_format_element(element_name, with_built_in_params=True)
                if format_element == None:#Can element be loaded?
                    if not can_read_format_element(element_name):
                        errors.append(("ERR_BIBFORMAT_TEMPLATE_CALLS_UNREADABLE_ELEM", filename, element_name))
                    else:
                        errors.append(("ERR_BIBFORMAT_TEMPLATE_CALLS_UNLOADABLE_ELEM", element_name, filename))
                else: #are the parameters used defined in element?
                    params_call = bibformat_engine.pattern_function_params.finditer(element_match.group())
                    all_params = {}
                    for param_match in params_call:
                        param = param_match.group("param")
                        value = param_match.group("value")
                        all_params[param] = value
                        allowed_params = []
                        #Built-in params
                        for allowed_param in format_element['attrs']['builtin_params']:
                            allowed_params.append(allowed_param['name'])

                        #Params defined in element
                        for allowed_param in format_element['attrs']['params']:
                            allowed_params.append(allowed_param['name'])

                        if not param in allowed_params:
                            errors.append(("ERR_BIBFORMAT_TEMPLATE_WRONG_ELEM_ARG", element_name, param, filename))

                    # The following code is too time-consuming. Only do where really requested
                    if checking > 0:
                        #Try to evaluate, with any object and pattern
                        recIDs = perform_request_search()
                        if len(recIDs) > 0:
                            recID = recIDs[0]
                            bfo = bibformat_engine.BibFormatObject(recID, search_pattern="Test")
                            (result, errors_) = bibformat_engine.eval_format_element(format_element, bfo, all_params, verbose=7)
                            errors.extend(errors_)

    else:#Template cannot be read
        errors.append(("ERR_BIBFORMAT_CANNOT_READ_TEMPLATE_FILE", filename, ""))

    return errors

def check_format_element(name):
    """
    Returns the list of errors in the format element given by its name

    The errors are the formatted errors defined in bibformat_config.py file.

    @param name the name of the format element to check
    @return a list of errors
    """
    errors = []
    filename = bibformat_engine.resolve_format_element_filename(name)
    if filename != None:#Can element be found in files?
        if can_read_format_element(name):#Can element be read?
            #Try to load
            try:
                module_name = filename
                if module_name.endswith(".py"):
                    module_name = module_name[:-3]

                module = __import__("invenio.bibformat_elements."+module_name)
                function_format = module.bibformat_elements.__dict__[module_name].format

                #Try to evaluate, with any object and pattern
                recIDs = perform_request_search()
                if len(recIDs) > 0:
                    recID = recIDs[0]
                    bfo = bibformat_engine.BibFormatObject(recID, search_pattern="Test")
                    element = bibformat_engine.get_format_element(name)
                    (result, errors_) = bibformat_engine.eval_format_element(element, bfo, verbose=7)
                    errors.extend(errors_)

            except Exception, e:
                errors.append(("ERR_BIBFORMAT_IN_FORMAT_ELEMENT", name, e))
        else:
            errors.append(("ERR_BIBFORMAT_CANNOT_READ_ELEMENT_FILE", filename, ""))
    elif bibformat_dblayer.tag_exists_for_name(name):#Can element be found in database?
        pass
    else:
        errors.append(("ERR_BIBFORMAT_CANNOT_RESOLVE_ELEMENT_NAME", name))

    return errors

def check_tag(tag):
    """
    Checks the validity of a tag
    """
    errors = []
    return errors
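
The output format file syntax written by update_output_format_rules() and validated by check_output_format() can be illustrated with a small standalone sketch. It is not part of the library: the function names render_output_format_rules and parse_output_format, and the template filenames used in the usage note, are hypothetical and chosen only for illustration. The parser is deliberately lenient and reports no errors, unlike check_output_format().

```python
def render_output_format_rules(rules, default=""):
    """Serialize rules the way update_output_format_rules() builds its text:
    one 'tag <field>:' line per group of consecutive rules on the same field,
    one '<value> --- <template>' line per rule, then a 'default:' line.
    (Hypothetical helper, for illustration only.)"""
    lines = []
    previous_field = ""
    for rule in rules:
        if rule["field"] != previous_field:
            lines.append("tag %s:" % rule["field"])
        lines.append("%s --- %s" % (rule["value"], rule["template"]))
        previous_field = rule["field"]
    lines.append("default: %s" % default)
    return "\n".join(lines)


def parse_output_format(text):
    """Read the same syntax back into (rules, default).
    Lenient sketch: lines that match none of the three forms are skipped."""
    rules = []
    default = ""
    current_field = ""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if "---" in line:
            # condition line: '<value> --- <template>'
            value, template = line.split("---", 1)
            rules.append({"field": current_field,
                          "value": value.strip(),
                          "template": template.strip()})
        elif line.startswith("tag") and line.endswith(":"):
            # field line: 'tag <field>:'
            current_field = line[3:-1].strip()
        elif line.startswith("default"):
            # default template line: 'default: <template>'
            default = line.partition(":")[2].strip()
    return rules, default
```

For instance, two rules on field 980__a with values PREPRINT and THESIS render as a single "tag 980__a:" line followed by two "value --- template" lines and a final "default:" line; parsing that text returns the same rules and default.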

[Figure: an Output Format at the top, two Templates below it, and four Format Elements at the bottom]

Contents

Differences between old and new BibFormat

Migrating behaviours to output formats

Migrating formats to format templates and format elements

Migrating Knowledge Bases

Migrating UDFs and Link rules

The Migration Kit

Why do we need output formats? Wouldn't format templates be sufficient?

How can I protect a format?

Why cannot I edit/delete a format?

How can I add a format element from the web interface?

Why are some MARC codes omitted in the "Check Dependencies" pages?

How are deleted records displayed?

Why are some format elements omitted in the "Knowledge Base Dependencies" page?

Why are some format elements defined in field table omitted in the format element documentation?

Old PHP BibFormat Administration Guide

Contents

1. Overview

2. Configuring BibFormat

3. Running BibFormat

3.1. From the Web interface

3.2. From the command-line interface

4. Detailed Configuration Manual

4.1. About BibFormat

4.2. How it works?

4.3. A first look at the web configuration interface

4.4. Mapping the input (OAI Extraction Rules)

4.5. Defining output types: Behaviors

4.6. Formats

4.7. Knowledge bases (KBs)

4.8. User Defined Functions (UDFs)

4.9. Defining links

4.9.1. EXTERNAL link conditions

4.9.2. INTERNAL link conditions

4.9.2 Example

4.10. User management

4.11. Evaluation Language Reference

Old BibFormat admin interface (in gray box)

Configuring BibFormat

Running BibFormat

From the Web interface

From the command-line interface