diff --git a/modules/webhelp/web/admin/howto/index.html.wml b/modules/webhelp/web/admin/howto/index.html.wml index 80fbe5cb8..2ae444616 100644 --- a/modules/webhelp/web/admin/howto/index.html.wml +++ b/modules/webhelp/web/admin/howto/index.html.wml @@ -1,52 +1,52 @@ ## $Id$ ## This file is part of the CERN Document Server Software (CDSware). ## Copyright (C) 2002 CERN. ## ## The CDSware is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## The CDSware is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDSware; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. #include "cdspage.wml" \ title="Admin HOWTOs" \ - navbar_name="howto" \ + navbar_name="admin" \ navtrail_previous_links="/admin/>Admin Area" \ navbar_select="howto"
The HOWTO guides will give you both short and not-so-short recipes and thoughts on some of the most frequently encountered administrative tasks.
- HOWTO MARC
- Describes how to choose the MARC representation of your metadata and how it will be stored in CDSware.
- HOWTO Migrate
- Describes how to migrate a bunch of your old data from any format you might have into CDSware.
- HOWTO Run
- Describes how to run your CDSware installation and how to take care of its normal operation day by day.
Haven't found what you were looking for? Suggest a HOWTO. diff --git a/modules/webhelp/web/admin/howto/marc.html.wml b/modules/webhelp/web/admin/howto/marc.html.wml index dad24038b..d86a62668 100644 --- a/modules/webhelp/web/admin/howto/marc.html.wml +++ b/modules/webhelp/web/admin/howto/marc.html.wml @@ -1,168 +1,168 @@ ## $Id$ ## This file is part of the CERN Document Server Software (CDSware). ## Copyright (C) 2002 CERN. ## ## The CDSware is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## The CDSware is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDSware; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. #include "cdspage.wml" \ title="HOWTO MARC Your Bibliographic Data" \ - navbar_name="howto" \ + navbar_name="admin" \ navtrail_previous_links="/admin/>Admin Area > /admin/howto/>Admin HOWTOs" \ navbar_select="howto_marc"
All the bibliographic data in the CDSware system are internally represented in the MARC 21 format. There are several good reasons for this:
Basically, you are in one of the following three situations:
You do not care much about MARC or the internal CDSware structure, as long as you can work with "more meaningful" metadata concepts like author, abstract, title, etc. In this case we simply recommend that you stick to the CDSware defaults, which pre-set the most commonly used metadata fields for you (in alphabetical order; non-exhaustive list):
METADATA CONCEPT              PROPOSED MARC 21 REPRESENTATION
----------------------------  -------------------------------
Abstract                      520 $a
Author, first                 100 $a
Author(s), additional         700 $a
Collection identifier         980 $a
Email                         8560 $f
Imprint                       260 $a,b,c; 300 $a
Keywords                      6531 $a
Language                      041 $a
OAI identifier                909CO $o
Publication info              909C4 $* [many subfields]
References                    999C5 $* [many subfields]
Primary report number         037 $a [unique throughout the system!]
Additional report number(s)   088 $a
Series                        490 $a,v
Subject                       65017 $a
Title                         245 $a
URL (e.g. to fulltext)        8564 $u, $z
The advantage of using these CDSware defaults is that you can use pre-defined configurations of BibConvert, BibFormat, and BibWords.
You do not care much about MARC or internal CDSware structure,
so you are using CDSware defaults, but you need to introduce a new
metadata concept. For example, you would like to store also the
document shelf number, and you want to make it separately searchable.
In this case you are free to choose a MARC tag of your own, for
example 963az $6. After that you would configure CDSware as follows:
963az $6
MARC tag;
which should give you the functionality you need.
You have some constraints on the MARC level, for example you would like to use a MARC markup scheme of your own. You are free to define your own scheme and even invert the meaning of our default configurations. For each field you would simply follow the above-mentioned configuration procedure.
However, when designing your own MARC scheme, you need to think of two CDSware-related restrictions:
100 should not mean first author in the Preprints collection and
title in the Videotapes collection. The MARC tags are
considered to be chosen globally, in a collection-independent way.
This means that we cannot have several collections reusing the same
MARC code for their own different purposes. (This should never happen
in a well-designed database system anyway, but if you have a merge of
various databases coming from various groups of users, and if you do
not have the liberty to remap their MARC tags, this may be a problem.)
100 with the value $a Foo $a Bar $a Baz. Then, the question "what is the
second $a of the tag 100?" is invalid within
the CDSware MARC paradigm. CDSware would store a tag like that,
but not the order of the repetitive subfields themselves. In our MARC
paradigm, we prefer to code that information either (i) into different
subfields within the same field instance (100 $a Foo $b Bar $c Baz),
or (ii) into the same subfield but inside several field
instances (100 $a Foo; 100 $a Bar; 100 $a Baz),
according to what is more appropriate. (We think that
relying on the order of repetitive subfields inside the same field
instance is a suspicious database design.)
These two restrictions were introduced in order to keep CDSware bibliographic tables both simple and fast. As explained above, we believe that any good database design will avoid these techniques anyway.
The CDSware tools come embedded with several independent flexible modules that enable you to easily convert your data from any format and to upload them into the CDSware system.
This document briefly describes how you proceed. To learn more on each of the steps, simply follow embedded links.
$ cd /tmp
$ cp /my/own/doc/system/datadump.txt .
$ vi datadump.cfg
$ bibconvert -cdatadump.cfg < datadump.txt > datadump.xml
$ bibformat < datadump.xml > datadump_with_fmt.xml
$ bibupload datadump_with_fmt.xml
$ bibwords add 1-1500
diff --git a/modules/webhelp/web/admin/howto/run.html.wml b/modules/webhelp/web/admin/howto/run.html.wml index de0a840de..d4638dbf5 100644 --- a/modules/webhelp/web/admin/howto/run.html.wml +++ b/modules/webhelp/web/admin/howto/run.html.wml @@ -1,131 +1,131 @@ ## $Id$ ## This file is part of the CERN Document Server Software (CDSware). ## Copyright (C) 2002 CERN. ## ## The CDSware is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## The CDSware is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDSware; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. #include "cdspage.wml" \ title="HOWTO Run Your CDSware Installation" \ - navbar_name="howto" \ + navbar_name="admin" \ navtrail_previous_links="/admin/>Admin Area > /admin/howto/>Admin HOWTOs" \ navbar_select="howto_run"
$ cd /tmp
Go to a temporary directory.
$ cp /my/own/doc/system/datadump.txt .
Copy your old data into a text file or some other format of your choice. Preferably, the data should be well structured. (Even so, matching a free-text format can be attempted!)
$ vi datadump.cfg
Describe the format of your data in the BibConvert language. This will enable you to transform them into the XML MARC format that the CDSware system uses internally for bibliographic data handling. (If you have not yet chosen your MARC scheme, please read the MARC HOWTO.)
$ bibconvert -cdatadump.cfg < datadump.txt > datadump.xml
Convert the data from your own format into XML MARC, using the configuration you wrote in the previous step.
$ bibformat < datadump.xml > datadump_with_fmt.xml
Create the default HTML presentation of the data by using BibFormat configurations. (You may want to create a new format in case your metadata structure is different from what you are used to.) This step enriches the XML MARC file with default HTML formats that the search engine will reuse.
$ bibupload datadump_with_fmt.xml
Upload the enriched XML file into the CDSware bibliographic databases.
$ bibwords add 1-1500
Run the BibWords procedure on records #1 to #1500 to create word indexes of the newly uploaded data. (Assuming there were 1500 records.)
Congratulations! At this point you should have successfully migrated your old data into the CDSware system.
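The conversion, formatting, upload, and indexing steps above could be chained in a small wrapper script. What follows is only a sketch under stated assumptions: the variable names DUMP, LAST_ID, and DRY_RUN and the helper functions step and migrate are made up for this illustration and are not part of CDSware; only the four command lines themselves are taken from the steps above. By default the script merely prints what it would run.

```shell
#!/bin/sh
# Sketch only: wraps the migration commands from this HOWTO.
# DUMP, LAST_ID, DRY_RUN, step and migrate are illustrative names (assumptions).
set -e                        # stop at the first failing step

DUMP=${DUMP:-datadump}        # basename shared by the .txt, .cfg and .xml files
LAST_ID=${LAST_ID:-1500}      # ID of the last uploaded record
DRY_RUN=${DRY_RUN:-1}         # 1 = only print the commands, 0 = execute them

step() {
    # Print the command line; execute it only when DRY_RUN=0.
    echo "+ $1"
    if [ "$DRY_RUN" = "0" ]; then
        sh -c "$1"
    fi
}

migrate() {
    step "bibconvert -c$DUMP.cfg < $DUMP.txt > $DUMP.xml"
    step "bibformat < $DUMP.xml > ${DUMP}_with_fmt.xml"
    step "bibupload ${DUMP}_with_fmt.xml"
    step "bibwords add 1-$LAST_ID"
}

migrate
```

Running the script as is prints the four command lines for inspection; setting DRY_RUN=0 in the environment would execute them in order, aborting on the first error so that a failed conversion is never uploaded.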
This HOWTO guide intends to give you ideas on how to run your CDSware installation and how to take care of its normal operation day by day.
Many tasks that manipulate the bibliographic record database can be set to run in a periodical mode. For example, we want the indexing engine to scan periodically for newly arrived documents and to index them as soon as they enter the system. It is the role of the BibSched system to take care of the task scheduling and the task execution.
Periodical tasks (such as regular metadata indexing) as well as one-time tasks (such as a batch upload of an acquired metadata file) are not executed straight away but are stored in the BibSched task queue. The BibSched daemon looks periodically in the queue and launches the tasks according to their order or the date of programmed runtime. You can consider BibSched to be a kind of cron daemon for bibliographic tasks.
This means that after CDSware installation you want to have the BibSched daemon running permanently. To launch the BibSched daemon, do:
$ bibsched -d
To set up indexing, reformatting, and collection updating daemons to run periodically with a sleeping period of, say, 1 hour:
$ bibindex -s1h
$ bibreformat -oHB,HD -s1h
$ webcoll -s1h
HINT: It is good to have these three tasks permanently in your BibSched queue so that your newly submitted documents will be further processed automatically.
Note that the BibSched daemon automatic mode stops as soon as one of the tasks ends with an error. It is therefore a good idea to inspect the BibSched queue from time to time. This can be done by running the BibSched command-line admin interface:
$ bibsched
that will permit you to stop/start the daemon mode, to delete the tasks already submitted, to run some of the tasks manually, etc. Note also that the BibSched daemon writes log and error files on its operation and on the operation of its tasks. The log and error files can be found on your system at
HINT: You may want to launch the bibsched command from time to time
(say a couple of times per day) to inspect the BibSched queue and to
verify the status of the BibSched system.
Guest users create a log of entries in FIXME. In the future the garbage collection may be done via BibSched task
queue.
FIXME. BibReformat temporary XML files.
FIXME. WebSubmit archives, what to keep, what not.
$ sessiongc
HINT: You may want to launch this command every day.
Alert Engine
HINT: You may want to set up an external cron job to call alertengine each day.
$ alertengine
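The two daily commands mentioned in this guide, sessiongc and alertengine, could be wrapped in one script that an external cron job calls once a day. This is a minimal sketch, not a CDSware tool: the script name, the run_daily function, the SESSIONGC and ALERTENGINE override variables, the --run switch, and the crontab path in the comment are all assumptions made for the illustration.

```shell
#!/bin/sh
# daily-maintenance.sh: hypothetical wrapper for the daily commands named in this HOWTO.
# Example crontab entry (path is an assumption):
#   30 3 * * * /path/to/daily-maintenance.sh --run

# Command names come from this guide; override them if installed elsewhere.
SESSIONGC=${SESSIONGC:-sessiongc}
ALERTENGINE=${ALERTENGINE:-alertengine}

run_daily() {
    # Run each command in turn; keep going after a failure, but remember it.
    status=0
    for cmd in "$SESSIONGC" "$ALERTENGINE"; do
        if "$cmd"; then
            echo "ok: $cmd"
        else
            echo "FAILED: $cmd" >&2
            status=1
        fi
    done
    return $status
}

# Only act when explicitly asked, so the file can also be sourced for inspection.
if [ "${1:-}" = "--run" ]; then
    run_daily
fi
```

Continuing after a failure is a deliberate choice here: a problem in the session garbage collection should not prevent the alerts from being sent, while the non-zero exit status still lets cron report that something went wrong.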
Cleaning Up the Filesystem