File Metadata

Created
Sat, Jul 27, 03:03
CDS Invenio INSTALLATION
========================
Revision: $Id$
About
=====
This document specifies how to build, customize, and install CDS
Invenio for the first time. See RELEASE-NOTES if you are upgrading
from a previous CDS Invenio release.
Contents
========
0. Prerequisites
1. Quick instructions for the impatient CDS Invenio admin
2. Detailed instructions for the patient CDS Invenio admin
0. Prerequisites
================
Here is the software you need to have around before you
start installing CDS Invenio:
a) Unix-like operating system. The main development and
production platforms for CDS Invenio at CERN are GNU/Linux
distributions SLC (RHEL), Debian, and Gentoo, but we also
develop on FreeBSD and Mac OS X. Basically any Unix system
supporting the software listed below should do.
Note that if you are using Debian "Sarge" GNU/Linux, you can
install most of the below-mentioned prerequisites and
recommendations by running:
$ sudo apt-get install libapache2-mod-python2.3 \
apache2-mpm-prefork mysql-server-4.1 mysql-client-4.1 \
python2.3-mysqldb python2.3-4suite \
python2.3-xml python2.3-libxml2 python2.3-libxslt1 \
rxp gnuplot xpdf-utils gs-common antiword catdoc \
wv html2text ppthtml xlhtml clisp gettext
You can also install the following packages:
$ sudo apt-get install python2.3-psyco sbcl cmucl
The last three packages are not available on all Debian
"Sarge" GNU/Linux architectures (e.g. not on AMD64), but they
are only recommended so you can safely continue without them.
Note that you can consult CDS Invenio wiki pages at
<https://twiki.cern.ch/twiki/bin/view/CDS/Invenio> for more
system-specific notes.
Note that the web application server should run a Message
Transfer Agent (MTA) such as Postfix so that CDS Invenio can
email notification alerts or registration information to the
end users, contact moderators and reviewers of submitted
documents, inform administrators about various runtime system
information, etc.
b) MySQL server (may be on a remote machine), and MySQL client
(must be available locally too). MySQL versions 4.1 or 5.0
are supported. Please set the variable "max_allowed_packet"
in your "my.cnf" init file to at least 4M. You may also want
to run your MySQL server natively in UTF-8 mode by setting
"default-character-set=utf8" in various parts of your "my.cnf"
file, such as in the "[mysql]" part and elsewhere.
<http://mysql.com/>
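The MySQL settings named in item b) could look as follows in a
"my.cnf" fragment. This is only a sketch: the function name
write_mycnf_example and the scratch file location are made up for
illustration, and only one possible placement of each option is
shown. Ask your MySQL administrator where else the character set
option belongs on your server version.

```shell
# Write an example my.cnf fragment to a scratch file for review.
# It does not touch the real my.cnf; merge the lines by hand.
write_mycnf_example() {
    cat > "$1" <<'EOF'
[mysqld]
max_allowed_packet = 4M

[mysql]
default-character-set = utf8
EOF
}

# The target path is an arbitrary scratch location (assumption).
write_mycnf_example "${TMPDIR:-/tmp}/my.cnf.example"
```

The value 4M is the minimum asked for above; larger values are
equally acceptable.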
c) Apache 2 server, with support for loading DSO modules, and
optionally with SSL support for HTTPS-secure user
authentication. Tested mainly with version 2.0.43 and above.
Apache 2.x is required for the mod_python module (see below).
<http://httpd.apache.org/>
d) Python v2.3 or above:
<http://python.org/>
as well as the following Python modules:
- (mandatory) MySQLdb (version >= 1.2.1_p2; see below)
<http://sourceforge.net/projects/mysql-python>
- (recommended) PyXML, for XML processing:
<http://pyxml.sourceforge.net/topics/download.html>
- (recommended) PyRXP, for very fast XML MARC processing:
<http://www.reportlab.org/pyrxp.html>
- (recommended) libxml2-python, for XML/XSLT processing:
<ftp://xmlsoft.org/libxml2/python/>
- (recommended) Gnuplot.Py, for producing graphs:
<http://gnuplot-py.sourceforge.net/>
- (recommended) Snowball Stemmer, for stemming:
<http://snowball.tartarus.org/wrappers/PyStemmer-1.0.1.tar.gz>
- (optional) 4suite, slower alternative to PyRXP and
libxml2-python:
<http://4suite.org/>
- (optional) feedparser, for web journal creation:
<http://feedparser.org/>
- (optional) Psyco, to speed up the code at places:
<http://psyco.sourceforge.net/>
- (optional) RDFLib, to use RDF ontologies and thesauri:
<http://rdflib.net/>
- (optional) mechanize, to run regression web test suite:
<http://wwwsearch.sourceforge.net/mechanize/>
Note: MySQLdb version 1.2.1_p2 or higher is recommended. If
you are using an older version of MySQLdb, you may run
into problems with character encoding.
e) mod_python Apache module. Tested mainly with versions
3.0BETA4 and above. mod_python 3.x is required for Apache 2.
Previous versions (as well as Apache 1 ones) exhibited some
problems with MySQL connectivity in our experience.
<http://www.modpython.org/>
f) If you want to be able to extract references from PDF fulltext
files, then you need to install pdftotext version 3 or later.
<http://www.foolabs.com/xpdf/home.html>
g) If you want to be able to search for words in the fulltext
files (i.e. to have fulltext indexing) or to stamp submitted
files, then you also need to install some of the following
tools:
- for PDF file stamping: pdftk, pdf2ps
<http://www.accesspdf.com/pdftk/>
<http://www.cs.wisc.edu/~ghost/doc/AFPL/>
- for PDF files: pdftotext or pstotext
<http://www.foolabs.com/xpdf/home.html>
<http://www.cs.wisc.edu/~ghost/doc/AFPL/>
- for PostScript files: pstotext or ps2ascii
<http://www.cs.wisc.edu/~ghost/doc/AFPL/>
- for MS Word files: antiword, catdoc, or wvText
<http://www.winfield.demon.nl/index.html>
<http://www.ice.ru/~vitus/catdoc/index.html>
<http://sourceforge.net/projects/wvware>
- for MS PowerPoint files: pptHtml and html2text
<http://packages.debian.org/stable/utils/ppthtml>
<http://userpage.fu-berlin.de/~mbayer/tools/html2text.html>
- for MS Excel files: xlhtml and html2text
<http://chicago.sourceforge.net/xlhtml/>
<http://userpage.fu-berlin.de/~mbayer/tools/html2text.html>
h) If you have chosen to install fast XML MARC Python processors
in step d) above, then you have to install the parsers
themselves:
- (optional) RXP:
<http://www.cogsci.ed.ac.uk/~richard/rxp.html>
- (optional) 4suite:
<http://4suite.org/>
i) (recommended) Gnuplot, the command-line driven interactive
plotting program. It is used to display download and citation
history graphs on the Detailed record pages on the web
interface. Note that Gnuplot must be compiled with PNG output
support, that is, with the GD library. Note also that Gnuplot
is not required, only recommended.
<http://www.gnuplot.info/>
j) (recommended) A Common Lisp implementation, such as CLISP,
SBCL or CMUCL. It is used for the web server log analysing
tool and the metadata checking program. Note that any of the
three implementations CLISP, SBCL, or CMUCL will do. CMUCL
produces the fastest machine code, but it does not support UTF-8
yet. Pick CLISP if you don't know what to do. Note that a
Common Lisp implementation is not required, only recommended.
<http://clisp.cons.org/>
<http://www.cons.org/cmucl/>
<http://sbcl.sourceforge.net/>
k) GNU gettext, a set of tools that makes it possible to
translate the application into multiple languages.
<http://www.gnu.org/software/gettext/>
This is available by default on many systems.
Note that the configure script checks whether you have all the
prerequisite software installed and that it won't let you continue
unless everything is in order. It also warns you if it cannot find
some optional but recommended software.
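If you would like a rough preview before running configure, a
small shell helper can list which executables are not yet in your
PATH. This is a sketch only, not the check that configure itself
performs; the function name missing_tools and the selection of
executables are assumptions made for illustration.

```shell
# Print the names of the given commands that are not found in PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: a few executables mentioned in the prerequisites above
# (msgfmt is one of the GNU gettext tools).
missing_tools python mysql gnuplot pdftotext clisp msgfmt
```

An empty output means that all listed executables were found. It
says nothing about their versions or about Python modules.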
1. Quick instructions for the impatient CDS Invenio admin
=========================================================
1a. Installation
----------------
$ cd /usr/local/src/
$ wget http://cdsware.cern.ch/download/cds-invenio-0.99.0.tar.gz
$ wget http://cdsware.cern.ch/download/cds-invenio-0.99.0.tar.gz.md5
$ wget http://cdsware.cern.ch/download/cds-invenio-0.99.0.tar.gz.sig
$ md5sum -v -c cds-invenio-0.99.0.tar.gz.md5
$ gpg --verify cds-invenio-0.99.0.tar.gz.sig cds-invenio-0.99.0.tar.gz
$ tar xvfz cds-invenio-0.99.0.tar.gz
$ cd cds-invenio-0.99.0
$ ./configure
$ make
$ make install
$ make install-jsmath-plugin ## optional
1b. Configuration
-----------------
$ emacs /opt/cds-invenio/etc/invenio.conf
$ emacs /opt/cds-invenio/etc/invenio-local.conf
$ /opt/cds-invenio/bin/inveniocfg --update-all
$ /opt/cds-invenio/bin/inveniocfg --create-tables
$ /opt/cds-invenio/bin/inveniocfg --create-apache-conf
$ sudo /path/to/apache/bin/apachectl graceful
$ sudo chgrp -R www-data /opt/cds-invenio
$ sudo chmod -R g+r /opt/cds-invenio
$ sudo chmod -R g+rw /opt/cds-invenio/var
$ sudo find /opt/cds-invenio -type d -exec chmod g+rxw {} \;
$ /opt/cds-invenio/bin/inveniocfg --create-demo-site
$ /opt/cds-invenio/bin/inveniocfg --load-demo-records
$ /opt/cds-invenio/bin/inveniocfg --run-unit-tests
$ /opt/cds-invenio/bin/inveniocfg --run-regression-tests
$ /opt/cds-invenio/bin/inveniocfg --run-web-tests
$ /opt/cds-invenio/bin/inveniocfg --remove-demo-records
$ /opt/cds-invenio/bin/inveniocfg --drop-demo-site
$ firefox http://your.site.com/help/admin/howto-run
2. Detailed instructions for the patient CDS Invenio admin
==========================================================
2a. Installation
----------------
CDS Invenio uses the standard GNU autoconf method to build and
install its files. This means that you proceed as follows:
$ cd /usr/local/src/
Change to a directory where we will configure and build
CDS Invenio. (The built files will be installed into
different "target" directories later.)
$ wget http://cdsware.cern.ch/download/cds-invenio-0.99.0.tar.gz
$ wget http://cdsware.cern.ch/download/cds-invenio-0.99.0.tar.gz.md5
$ wget http://cdsware.cern.ch/download/cds-invenio-0.99.0.tar.gz.sig
Fetch the CDS Invenio source tarball from the CDS Software
Consortium distribution server, together with the MD5 checksum
and GnuPG cryptographic signature files useful for verifying
the integrity of the tarball.
$ md5sum -v -c cds-invenio-0.99.0.tar.gz.md5
Verify MD5 checksum.
$ gpg --verify cds-invenio-0.99.0.tar.gz.sig cds-invenio-0.99.0.tar.gz
Verify GnuPG cryptographic signature. Note that you may
first have to import my public key into your keyring, if you
haven't done that already:
$ gpg --keyserver wwwkeys.eu.pgp.net --recv-keys 0xBA5A2B67
The output of the gpg --verify command should then read:
Good signature from "Tibor Simko <tibor@simko.info>"
You can safely ignore any trusted signature certification
warning that may follow after the signature has been
successfully verified.
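If you script the download, the checksum step can be wrapped so
that the script stops on a mismatch. The sketch below is an
illustration, not part of the CDS Invenio distribution, and the
function name verify_md5 is made up. It uses only the "-c" option,
since some md5sum implementations may not accept "-v".

```shell
# Succeed only if every file listed in the checksum file matches.
# Run it from the directory that holds the downloaded tarball.
verify_md5() {
    md5sum -c "$1" >/dev/null 2>&1
}

# Example usage (file name taken from the commands above):
#   verify_md5 cds-invenio-0.99.0.tar.gz.md5 && echo "checksum OK"
```

Note that an MD5 checksum only guards against corrupted downloads;
the GnuPG signature check remains the step that ties the tarball
to its publisher.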
$ tar xvfz cds-invenio-0.99.0.tar.gz
Untar the distribution tarball.
$ cd cds-invenio-0.99.0
Go to the source directory.
$ ./configure
Configure CDS Invenio software for building on this specific
platform. You can use the following optional parameters:
--prefix=/opt/cds-invenio
Optionally, specify the CDS Invenio general
installation directory (default is /opt/cds-invenio).
It will contain command-line binaries and program
libraries containing the core CDS Invenio
functionality, but also store web pages, runtime log
and cache information, document data files, etc.
Several subdirs like `bin', `etc', `lib', or `var'
will be created inside the prefix directory to this
effect. Note that the prefix directory should be
chosen outside of the Apache htdocs tree, since only
one of its subdirectories (prefix/var/www) is to be
accessible directly via the Web (see below).
Note that CDS Invenio won't install to any other
directory but to the prefix mentioned in this
configuration line.
--with-python=/opt/python/bin/python2.3
Optionally, specify a path to some specific Python
binary. This is useful if you have more than one
Python installation on your system. If you don't set
this option, then the first Python that will be found
in your PATH will be chosen for running CDS Invenio.
--with-mysql=/opt/mysql/bin/mysql
Optionally, specify a path to some specific MySQL
client binary. This is useful if you have more than
one MySQL installation on your system. If you don't
set this option, then the first MySQL client
executable that will be found in your PATH will be
chosen for running CDS Invenio.
--with-clisp=/opt/clisp/bin/clisp
Optionally, specify a path to CLISP executable. This
is useful if you have more than one CLISP
installation on your system. If you don't set this
option, then the first executable that will be found
in your PATH will be chosen for running CDS Invenio.
--with-cmucl=/opt/cmucl/bin/lisp
Optionally, specify a path to CMUCL executable. This
is useful if you have more than one CMUCL
installation on your system. If you don't set this
option, then the first executable that will be found
in your PATH will be chosen for running CDS Invenio.
--with-sbcl=/opt/sbcl/bin/sbcl
Optionally, specify a path to SBCL executable. This
is useful if you have more than one SBCL
installation on your system. If you don't set this
option, then the first executable that will be found
in your PATH will be chosen for running CDS Invenio.
This configuration step is mandatory. Usually, you do this
step only once.
(Note that if you prefer to build CDS Invenio out of its
source tree, you may run the above configure command like
this: mkdir build && cd build && ../configure --prefix=...
FIXME: this is not working right now as per the introduction
of intbitset_setup.py.)
$ make
Launch the CDS Invenio build. Since many messages are printed
during the build process, you may want to run it in a
fast-scrolling terminal such as rxvt or in a detached screen
session.
During this step all the pages and scripts will be
pre-created and customized based on the config you have
edited in the previous step.
Note that on systems such as FreeBSD or Mac OS X you have to
use GNU make ("gmake") instead of "make".
$ make install
Install the web pages, scripts, utilities and everything
needed for runtime into the respective directories, as
specified earlier by the configure command.
Note that if you are installing CDS Invenio for the first
time, you will be asked to create a symbolic link for the
"invenio" Python module from Python's site-packages
directory to instruct Python where to find CDS Invenio's
Python files. The process will tell you the exact
command to use based on the values you have used in the
configure line.
(Note also that on some operating systems you might need to
create another symlink manually for lib64:
$ sudo ln -s /opt/cds-invenio/lib/python/invenio \
/usr/local/lib64/python2.3/site-packages/invenio
if you happen to encounter some troubles finding intbitset
libraries.)
$ sudo make install-jsmath-plugin ## optional
This will automatically download and install in the proper
place jsMath, a Javascript library to render LaTeX formulas
in the client browser.
Note that in order to enable the rendering you will later
have to set the variable CFG_WEBSEARCH_USE_JSMATH_FOR_FORMATS
in invenio.conf to a suitable list of output format
codes, as in "['hd', 'hb']".
2b. Configuration
-----------------
Once the basic software installation is done, we proceed to
configuring your Invenio system.
$ emacs /opt/cds-invenio/etc/invenio.conf
$ emacs /opt/cds-invenio/etc/invenio-local.conf
Customize your CDS Invenio installation. The 'invenio.conf'
file contains the vanilla default configuration parameters
of a CDS Invenio installation, as coming from the
distribution. You could in principle go ahead and change
the values according to your local needs.
However, you can also create a file named
'invenio-local.conf' in the same directory where
'invenio.conf' lives and put there only the localizations
you need to have different from the default ones. For
example:
$ cat /opt/cds-invenio/etc/invenio-local.conf
[Invenio]
CFG_SITE_URL = http://your.site.com
CFG_SITE_SECURE_URL = https://your.site.com
CFG_SITE_ADMIN_EMAIL = john.doe@your.site.com
CFG_SITE_SUPPORT_EMAIL = john.doe@your.site.com
The Invenio system will then read both the default
invenio.conf file and your customized invenio-local.conf
file and it will override any default options with the ones
you have set in your local file. This cascading of
configuration parameters will ease your future upgrades.
You should override at least the parameters from the top of
invenio.conf file in order to define some very essential
runtime parameters such as the visible URL of your document
server (look for CFG_SITE_URL and CFG_SITE_SECURE_URL), the
database credentials (look for CFG_DATABASE_*), the name of
your document server (look for CFG_SITE_NAME and
CFG_SITE_NAME_INTL_*), or the email address of the local CDS
Invenio administrator (look for CFG_SITE_SUPPORT_EMAIL and
CFG_SITE_ADMIN_EMAIL).
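To see which value would win for a given parameter, the cascade
can be mimicked with a few lines of shell. Invenio performs the
merge itself; this sketch is only an inspection aid. The function
name cfg_get is made up, and the helper ignores section headers,
comments, and multi-line values.

```shell
# cfg_get KEY FILE...: print the value of KEY from the last FILE
# that sets it, so later files override earlier ones.
cfg_get() {
    key=$1
    shift
    awk -v key="$key" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)
        }
        index(line, key) == 1 {
            rest = substr(line, length(key) + 1)
            if (rest ~ /^[ \t]*=/) {
                sub(/^[ \t]*=[ \t]*/, "", rest)
                val = rest
                found = 1
            }
        }
        END { if (found) print val }
    ' "$@"
}

# Example usage, default file first and local file last:
#   cfg_get CFG_SITE_URL /opt/cds-invenio/etc/invenio.conf \
#       /opt/cds-invenio/etc/invenio-local.conf
```

Listing invenio-local.conf last reproduces the rule described
above: a parameter set locally replaces the distribution default,
and anything not set locally falls back to invenio.conf.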
$ /opt/cds-invenio/bin/inveniocfg --update-all
Make the rest of the Invenio system aware of your
invenio.conf changes. This step is mandatory each time you
edit your conf files.
$ /opt/cds-invenio/bin/inveniocfg --create-tables
If you are installing CDS Invenio for the first time, you
have to create database tables.
Note that this step checks for potential problems such as
the database connection rights and may ask you to perform
some more administrative steps in case it detects a problem.
Notably, it may ask you to set up database access
permissions, based on your configure values.
If you are installing CDS Invenio for the first time, you
have to create a dedicated database on your MySQL server
that CDS Invenio can use for its purposes. Please
contact your MySQL administrator and ask them to execute the
commands that this step proposes.
At this point you should now have successfully completed the
"make install" process. We continue by setting up the
Apache web server.
$ /opt/cds-invenio/bin/inveniocfg --create-apache-conf
Running this command will generate Apache virtual host
configurations matching your installation. You will be
instructed to check created files (usually they are located
under /opt/cds-invenio/etc/apache/) and edit your httpd.conf
to put the following include statements:
Include /opt/cds-invenio/etc/apache/invenio-apache-vhost.conf
Include /opt/cds-invenio/etc/apache/invenio-apache-vhost-ssl.conf
$ sudo /path/to/apache/bin/apachectl graceful
Please ask your webserver administrator to restart the
Apache server after the above "httpd.conf" changes.
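If you maintain httpd.conf by script, it helps to add each
Include statement only when it is not already present, so that
repeated runs do not duplicate it. The helper below is a sketch;
the name ensure_line is made up, and the httpd.conf location in
the usage comment is a placeholder to adjust for your system.

```shell
# Append LINE to FILE unless FILE already contains exactly that line.
ensure_line() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null ||
        printf '%s\n' "$line" >> "$file"
}

# Example usage (run with sufficient rights to edit httpd.conf):
#   ensure_line /path/to/apache/conf/httpd.conf \
#       'Include /opt/cds-invenio/etc/apache/invenio-apache-vhost.conf'
```

The match is on the whole line, so a commented-out Include does
not count as present.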
$ sudo chgrp -R www-data /opt/cds-invenio
$ sudo chmod -R g+r /opt/cds-invenio
$ sudo chmod -R g+rw /opt/cds-invenio/var
$ sudo find /opt/cds-invenio -type d -exec chmod g+rxw {} \;
One more superuser step is needed, because the Apache
server must be able to read files from the installation
directory, and to write log information and cached data
inside the "var" subdirectory of our CDS Invenio
installation directory.
Here we assumed that your Apache server processes are run
under "www-data" group. Change this appropriately for your
system.
Moreover, note that if you are using SELinux extensions
(e.g. on Fedora Core 6), you may have to check and enable
the write access of Apache user there too.
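Since the group name differs between systems, the four permission
commands can be wrapped in a function that takes the installation
prefix and the group as arguments. This is a sketch with a made-up
function name, set_invenio_perms; the "apache" group in the usage
comment is only an example of a non-Debian group name.

```shell
# Apply the four permission commands above to PREFIX for GROUP.
# Run as a user allowed to change the group (typically via sudo).
set_invenio_perms() {
    prefix=$1
    group=$2
    chgrp -R "$group" "$prefix" &&
    chmod -R g+r "$prefix" &&
    chmod -R g+rw "$prefix/var" &&
    find "$prefix" -type d -exec chmod g+rxw {} \;
}

# Example usage, assuming Apache runs under the "apache" group:
#   set_invenio_perms /opt/cds-invenio apache
```

The commands are chained with "&&" so that a failure, such as an
unknown group name, stops the sequence early.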
With these admin-level tasks performed as root, let's
now go back to finish the installation of CDS Invenio.
$ /opt/cds-invenio/bin/inveniocfg --create-demo-site
This step is recommended to test your local CDS Invenio
installation. It should give you our "Atlantis Institute of
Science" demo installation, exactly as you see it at
<http://cdsware.cern.ch:8000/>.
$ /opt/cds-invenio/bin/inveniocfg --load-demo-records
Optionally, load some demo records to be able to test
indexing and searching of your local CDS Invenio demo
installation.
$ /opt/cds-invenio/bin/inveniocfg --run-unit-tests
Optionally, you can run the unit test suite to verify the
unit behaviour of your local CDS Invenio installation. Note
that this command should be run only after you have
installed the whole system via `make install'.
$ /opt/cds-invenio/bin/inveniocfg --run-regression-tests
Optionally, you can run the full regression test suite to
verify the functional behaviour of your local CDS Invenio
installation. Note that this command requires that you have
created the demo site and loaded the demo records. Note
also that running the regression test suite may alter the
database content with junk data, so that rebuilding the
demo site is strongly recommended afterwards.
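Because the regression tests depend on the demo site and the demo
records, the steps so far can be chained so that a failing step
stops the sequence. This is a sketch: the function name
run_demo_tests is made up, and it assumes that inveniocfg exits
with a non-zero status on failure, which this document does not
state.

```shell
# Run the demo-site test steps in order; stop at the first failure.
# Assumption: inveniocfg exits non-zero when a step fails.
run_demo_tests() {
    cfg=${1:-/opt/cds-invenio/bin/inveniocfg}
    for step in --create-demo-site --load-demo-records \
                --run-unit-tests --run-regression-tests; do
        echo "running: $cfg $step"
        "$cfg" "$step" || { echo "failed: $step" >&2; return 1; }
    done
}

# Example usage with the default installation prefix:
#   run_demo_tests
```

Run it only on a test installation, since creating the demo site
and running the regression suite both modify the database.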
$ /opt/cds-invenio/bin/inveniocfg --run-web-tests
Optionally, you can run additional automated web tests
running in a real browser. This requires Firefox
with the Selenium IDE extension installed.
<http://en.www.mozilla.com/en/firefox/>
<http://selenium-ide.openqa.org/>
$ /opt/cds-invenio/bin/inveniocfg --remove-demo-records
Optionally, remove the demo records loaded in the previous
step, while otherwise keeping the demo collection, submission,
format, and other configurations that you may reuse and
modify for your own production purposes.
$ /opt/cds-invenio/bin/inveniocfg --drop-demo-site
Optionally, also drop all the demo configuration so that
you'll end up with a completely blank CDS Invenio system.
However, you may find it more practical not to drop
the demo site configuration but to start customizing from
there.
$ firefox http://your.site.com/help/admin/howto-run
In order to start using your CDS Invenio installation, you
can start indexing, formatting and other daemons as
indicated in the "HOWTO Run" guide on the above URL. You
can also use the Admin Area web interfaces to perform
further runtime configurations such as the definition of
data collections, document types, document formats, word
indexes, etc.
Good luck, and thanks for choosing CDS Invenio.
- CDS Development Group
<cds.support@cern.ch>
<http://cdsware.cern.ch/>
