-r, --replace Replace existing records by those from the XML
MARC file. The original content is wiped out
and fully replaced. Signals error if record
is not found via matching record IDs or system
numbers.
Fields defined in Invenio config variable
CFG_BIBUPLOAD_STRONG_TAGS are not replaced.
Note also that `-r' can be combined with `-i'
into an `-ir' option that would automatically
either insert records as new if they are not
found in the system, or correct existing
records if they are found to exist.
-a, --append Append fields from XML MARC file at the end of
existing records. The original content is
enriched only. Signals error if record is not
found via matching record IDs or system
numbers.
-c, --correct Correct fields of existing records by those
from XML MARC file. The original record
content is modified only on those fields from
the XML MARC file where both the tags and the
indicators match: the original fields are
removed and replaced by those from the XML
MARC file. Fields not present in XML MARC
file are not changed (unlike the -r option).
Fields with "provenance" subfields defined in
'CFG_BIBUPLOAD_CONTROLLED_PROVENANCE_TAGS'
are protected against deletion unless the
input MARCXML contains a matching
provenance value.
Signals error if record is not found via
matching record IDs or system numbers.
-d, --delete Delete fields of existing records that are
contained in the XML MARC file. The fields in
the original record that are not present in
the XML MARC file are preserved.
This is incompatible with FFT (see below).
</pre>
</p>
<p>If you combine the <code>--pretend</code> parameter with any of the above updating modes, you can test what would be executed without modifying the database or altering the system state.</p>
<a name="3.4"></a><h3>3.4 Inserting and updating at the same time</h3>
<p>Note that the insert/update modes can be combined together. For
example, if you have a file that contains a mixture of new records
with possibly some records to be updated, then you can run:
<blockquote>
<pre>
$ bibupload -i -r file.xml
</pre>
</blockquote>
In this case BibUpload will try to do an update (for records having
either 001 or 970 identifiers), or an insert (for the other ones).
This is the fulltext version of my thesis in the PDF format.
Chapter 5 still needs some revision.
</subfield>
</datafield>
</record>
</pre>
</p>
<p>The FFT tag is repeatable, so one can pass along another FFT
tag instance containing a pointer to e.g. the thesis defence slides.
The subfields of an FFT tag are non-repeatable.
</p>
<p>When more than one FFT tag is specified for the same document
(e.g. for adding more than one format at a time), any of the subfields
$t (docfile type), $m (new desired docfile name), $r (restriction),
$v (version) and $x (url/path for an icon) that are used must be
specified identically in every FFT entry. E.g. if you want to specify
an icon for a document with two formats (say .pdf and .doc), you'll
write two FFT tags, both containing the same $x subfield.</p>
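<p>As an illustration of the rule above, the following is a minimal Python sketch that builds a record with two FFT tags (one per format) sharing the same $x icon. The record ID <code>123</code>, the file paths, the docname <code>thesis</code> and the helper name <code>fft_field</code> are all hypothetical; only the subfield codes come from this guide.</p>

```python
import xml.etree.ElementTree as ET


def fft_field(subfields):
    """Build one FFT datafield from a list of (code, value) pairs."""
    field = ET.Element("datafield", {"tag": "FFT", "ind1": " ", "ind2": " "})
    for code, value in subfields:
        ET.SubElement(field, "subfield", {"code": code}).text = value
    return field


record = ET.Element("record")
# Hypothetical record ID, used only for this example.
ET.SubElement(record, "controlfield", {"tag": "001"}).text = "123"

icon = "/tmp/icon.gif"  # the same $x value must appear in every FFT entry
for path in ("/tmp/thesis.pdf", "/tmp/thesis.doc"):
    record.append(fft_field([("a", path), ("n", "thesis"), ("x", icon)]))

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

<p>The printed XML would still need to be wrapped in a collection and uploaded with one of the bibupload modes described above.</p>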
<p>The bibupload process, when it encounters FFT tags, will
automatically populate fulltext storage space
(<code>/opt/invenio/var/data/files</code>) and metadata record
associated tables (<code>bibrec_bibdoc</code>, <code>bibdoc</code>) as
appropriate. It will also enrich the 856 tags (URL tags) of the MARC
metadata of the record in question with references to the latest
versions of each file.
</p>
<p>Note that for $a and $x subfields filesystem paths must be absolute
(e.g. <code>/tmp/icon.gif</code> is valid,
while <code>Desktop/icon.gif</code> is not) and they must be readable
by the user/group of the bibupload process that will handle the FFT.
</p>
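<p>A quick pre-flight check along these lines can be sketched in Python. The function name <code>check_fft_path</code> is hypothetical and not part of Invenio. The check only applies to filesystem paths (not URLs), and it is only meaningful when run as the same user as the bibupload process.</p>

```python
import os


def check_fft_path(path):
    """Return a list of problems that would make `path` unusable in FFT $a or $x."""
    problems = []
    if not os.path.isabs(path):
        problems.append("path is not absolute")
    elif not os.access(path, os.R_OK):
        # Also reported when the file does not exist at all.
        problems.append("path is not readable by the current user")
    return problems


for candidate in ("/tmp/icon.gif", "Desktop/icon.gif"):
    print(candidate, check_fft_path(candidate))
```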
<p>The bibupload process supports the usual modes correct, append,
replace and insert, with semantics somewhat similar to those of the
metadata upload:
<blockquote>
<table border="1">
<thead>
<tr>
<th></th><th>Metadata</th> <th>Fulltext</th>
</tr>
</thead>
<tbody>
<tr>
<td>objects being uploaded</td><td> MARC field instances characterized by tags (010-999) </td> <td> fulltext files characterized by unique file names (FFT $n)</td>
</tr>
<tr>
<td> insert </td><td> insert new record; must not exist </td><td> insert new files; must not exist </td>
</tr>
<tr>
<td> append </td><td> append new tag instances for the given tag XXX, regardless of existing tag instances</td><td> append new files, if filename (i.e. new format) not already present </td>
</tr>
<tr>
<td> correct </td><td> correct tag instances for the given tag XXX; delete existing ones and replace with given ones </td><td> correct files with the given filename; add new revision or delete file; if the docname does not exist the file is added</td>
</tr>
<tr>
<td> replace </td><td> replace all tags, whatever XXX are</td><td> replace all files, whatever filenames are </td>
</tr>
<tr>
<td> delete </td><td> delete all existing tag instances</td><td> not supported </td>
</tr>
</tbody>
</table>
</blockquote>
</p>
<p>Note that in append and insert modes, <code>$m</code> is ignored.</p>
<p>In order to rename a document, use the correct mode, specifying in the
$n subfield the original docname that should be renamed and in $m the new name.
</p>
<p>Special values can be assigned to the $t subfield.</p>
<table border="1">
<tbody>
<tr><td><tt>PURGE</tt></td><td>In order to purge previous file revisions (i.e. in order to keep only the latest file version), please use the correct mode with $n docname and $t PURGE as the special keyword.</td></tr>
<tr><td><tt>DELETE</tt></td><td>In order to delete all existing versions of a file, making it effectively hidden, please use the correct mode with $n docname and $t DELETE as the special keyword.</td></tr>
<tr><td><tt>EXPUNGE</tt></td><td>In order to expunge (i.e. remove completely, also from the filesystem) all existing versions of a file, making it effectively disappear, please use the correct mode with $n docname and $t EXPUNGE as the special keyword.</td></tr>
<tr><td><tt>FIX-MARC</tt></td><td>In order to synchronize MARC to the bibrec/bibdoc structure (e.g. after an update or a tweak in the database), please use the correct mode with $n docname and $t FIX-MARC as the special keyword.</td></tr>
<tr><td><tt>FIX-ALL</tt></td><td>In order to fix a record (i.e. put all its linked documents in a coherent state) and synchronize the MARC to the table, please use the correct mode with $n docname and $t FIX-ALL as the special keyword.</td></tr>
<tr><td><tt>REVERT</tt></td><td>In order to revert to a previous file revision (i.e. to create a new revision with the same content as some previous revision had), please use the correct mode with $n docname, $t REVERT as the special keyword and $v the number corresponding to the desired version.</td></tr>
<tr><td><tt>DELETE-FILE</tt></td><td>In order to delete a particular file added by mistake, please use the correct mode with $n docname, $t DELETE-FILE, specifying $v version and $f format. Note that this operation is not reversible. Note that if you don't specify a version, the last version will be used.</td></tr>
</tbody></table>
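<p>The special $t keywords listed above can be turned into FFT tags with a small helper such as the following Python sketch. The function name <code>special_fft</code> and the docname <code>thesis</code> are hypothetical, and the argument checks simply mirror the table (REVERT needs $v, DELETE-FILE needs $f); they are not taken from the bibupload source code.</p>

```python
import xml.etree.ElementTree as ET

SPECIAL_TYPES = {
    "PURGE", "DELETE", "EXPUNGE", "FIX-MARC", "FIX-ALL", "REVERT", "DELETE-FILE",
}


def special_fft(docname, keyword, version=None, fmt=None):
    """Build an FFT datafield carrying a special $t keyword for correct mode."""
    if keyword not in SPECIAL_TYPES:
        raise ValueError("unknown special keyword: %s" % keyword)
    if keyword == "REVERT" and version is None:
        raise ValueError("REVERT needs the version to revert to ($v)")
    if keyword == "DELETE-FILE" and fmt is None:
        raise ValueError("DELETE-FILE needs the format to delete ($f)")

    pairs = [("n", docname), ("t", keyword)]
    if version is not None:
        pairs.append(("v", str(version)))
    if fmt is not None:
        pairs.append(("f", fmt))

    field = ET.Element("datafield", {"tag": "FFT", "ind1": " ", "ind2": " "})
    for code, value in pairs:
        ET.SubElement(field, "subfield", {"code": code}).text = value
    return field


revert_xml = ET.tostring(special_fft("thesis", "REVERT", version=2), encoding="unicode")
print(revert_xml)
```

<p>The resulting datafield belongs inside a record that is uploaded in correct mode, as the table requires.</p>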
<p>In order to preserve previous comments and descriptions when
correcting, please use the KEEP-OLD-VALUE special keyword with the
desired $d and $z subfields.
</p>
<p>The $r subfield can contain a string that can be used to restrict
the given document. The same value must be specified for all the
formats of a given document. By default the keyword will be used as the status
parameter for the "viewrestrdoc" action, which can be used to grant
access rights to the desired users. E.g. if you set the keyword
"thesis", you can then connect the "thesisviewer" role to the action
"viewrestrdoc" with parameter "status" set to "thesis". Then all the
users who are linked with the "thesisviewer" role will be able to
download the document, while any other user will not be
allowed. Note that if you use the keyword "KEEP-OLD-VALUE", the previous
restrictions, if applicable, will be kept.
</p>
<p>More advanced document-level restrictions are also possible. If the value
contains:
<ul>
<li><tt><strong>email: </strong>john.doe@example.org</tt>: then only the user
having <tt>john.doe@example.org</tt> as email address will be authorized to
access the given document.</li>
<li><tt><strong>group: </strong>example</tt>: then only users belonging to
the local/external group <tt>example</tt> will be authorized to
access the given document.</li>
<li><tt><strong>role: </strong>example</tt>: then only the users
belonging to the WebAccess role <tt>example</tt> will be authorized to
access the given document.</li>
<li><tt><strong>firerole: </strong>allow .../deny...</tt>: then only the users
implicitly matched by the given
<a href="/help/admin/webaccess-admin-guide#6">firewall-like role definition</a>
will be authorized to access the given document.</li>
<li><tt><strong>status: </strong>example</tt>: then only the users
belonging to roles having an authorization for the WebAccess action
<tt>viewrestrdoc</tt> with parameter <tt>status</tt> set to <tt>example</tt>
will be authorized (that is exactly like setting $r to <tt>example</tt>).</li>
</ul>
Note that authors (as defined in the record MARC) and superadmin are always
authorized to access a document, whatever the given value of the status.
</p>
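<p>To make the interpretation of $r values described above concrete, here is a minimal Python sketch that classifies a restriction string by its prefix, treating an unprefixed value as a status keyword. The function name <code>classify_restriction</code> is hypothetical, and this is only an illustration of the rules in this guide, not the code Invenio uses internally.</p>

```python
RESTRICTION_PREFIXES = ("email", "group", "role", "firerole", "status")


def classify_restriction(value):
    """Split an $r value into (kind, argument) following the prefixes listed above."""
    head, sep, tail = value.partition(":")
    kind = head.strip()
    if sep and kind in RESTRICTION_PREFIXES:
        return kind, tail.strip()
    # No recognized prefix: the whole value acts as the "status" parameter
    # of the viewrestrdoc action.
    return "status", value.strip()


for example in ("thesis", "email: john.doe@example.org", "status: example"):
    print(example, "->", classify_restriction(example))
```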
<p>Some special flags might be set via FFT and associated with the current
document by using the $o subfield. This feature is experimental. Currently only
two flags are actively considered:
<ul>
<li><strong>HIDDEN</strong>: used to specify that the file that is
currently added (via revision or append) must be hidden, i.e. must
not be visible to the world but only known by the system (e.g. to
allow for fulltext indexing). This flag is permanently associated
with the specific revision and format of the file being added.</li>
<li><strong>PERFORM_HIDE_PREVIOUS</strong>: used to specify that, although
the current file should be visible (unless the HIDDEN flag is also
specified), any other previous revision of the document should receive
the HIDDEN flag, and should thus be hidden to the world.</li>
</ul>
</p>
<p>Note that each time bibupload is called on a record, the 8564 tags
pointing to locally stored files are recreated on the basis of the
full-text files connected to the record. Thus, if you wish to update
some 8564 tag pointing to a locally managed file, the only way to
perform this is through the FFT tag, not by editing 8564 directly.</p>
<a name="4"></a><h2>4. Batch Uploader</h2>
<a name="4.1"></a><h3>4.1 Web interface - Cataloguers</h3>
<p>The batchuploader web interface can be used to upload either metadata files or documents. Unlike daemon mode, actions are executed only once.</p>
<p>The upload history lists the metadata and document uploads made through the web interface, not those made in daemon mode.</p>
<a name="4.2"></a><h3>4.2 Web interface - Robots</h3>
<p>To use the batch upload function from the command line, issue a curl call such as:
<blockquote>
<pre>
$ curl -F 'file=@localfile.xml' -F 'mode=-i' http://cdsweb.cern.ch/batchuploader/robotupload -A invenio_webupload
</pre>
</blockquote></p>
<p>This service provides (client, file) checking to ensure that records are put into a collection the client has rights to.<br />
To configure these permissions, see the <tt>CFG_BATCHUPLOADER_WEB_ROBOT_RIGHTS</tt> variable in the configuration file.<br />
The allowed user agents can also be defined using the <tt>CFG_BATCHUPLOADER_WEB_ROBOT_AGENT</tt> variable.</p>
<a name="4.3"></a><h3>4.3 Daemon mode</h3>
<p>The batchuploader daemon mode is intended to be run as a bibsched task for document or metadata upload. The parent directory where the daemon will look for the folders <tt>metadata</tt> and <tt>documents</tt> must be specified in the Invenio configuration file.</p>
<p>An example of how the directories should be arranged, assuming that Invenio is installed in <tt>/opt/invenio</tt>, would be:
<pre>
/opt/invenio/var/batchupload
/opt/invenio/var/batchupload/documents
/opt/invenio/var/batchupload/documents/append
/opt/invenio/var/batchupload/documents/revise
/opt/invenio/var/batchupload/metadata
/opt/invenio/var/batchupload/metadata/append
/opt/invenio/var/batchupload/metadata/correct
/opt/invenio/var/batchupload/metadata/insert
/opt/invenio/var/batchupload/metadata/replace
</pre>
</p>
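<p>The directory layout shown above could be created with a short Python sketch like the one below. The function name <code>create_batchupload_tree</code> is hypothetical, and the base directory is passed in by the caller, since the real location depends on the Invenio configuration.</p>

```python
import os

# Sub-folders per upload kind, as listed in the example layout above.
LAYOUT = {
    "documents": ("append", "revise"),
    "metadata": ("append", "correct", "insert", "replace"),
}


def create_batchupload_tree(base):
    """Create the batchupload folder tree under `base` and return the leaf paths."""
    created = []
    for kind in sorted(LAYOUT):
        for mode in LAYOUT[kind]:
            path = os.path.join(base, kind, mode)
            os.makedirs(path, exist_ok=True)  # safe to re-run on an existing tree
            created.append(path)
    return created
```

<p>For the example above one would call <code>create_batchupload_tree("/opt/invenio/var/batchupload")</code> as a user allowed to write there. The folders must remain accessible to the user running the daemon.</p>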
<p>When running the batchuploader daemon there are two possible execution modes:
<pre>
 -m, --metadata    Look for metadata files in folders insert, append, correct and replace.
                   All files are uploaded and then moved to the corresponding DONE folder.
 -d, --documents   Look for documents in folders append and revise. Uploaded files are then
                   moved to DONE folders if possible.
</pre>
By default, metadata mode is used.</p>
<p>An example of invocation would be:
<blockquote>
<pre>
$ batchuploader --documents
</pre>
</blockquote></p>
It is possible to schedule the batch uploader to run periodically. Read the <a href='howto-run'>Howto-run guide</a> to see how.