diff --git a/INSTALL b/INSTALL index e9cb698f1..53ae21aef 100644 --- a/INSTALL +++ b/INSTALL @@ -1,586 +1,594 @@ CDS Invenio INSTALLATION ======================== Revision: $Id$ About ===== This document specifies how to build, customize, and install CDS Invenio for the first time. See RELEASE-NOTES if you are upgrading from a previous CDS Invenio release. Contents ======== 0. Prerequisites 1. Quick instructions for the impatient CDS Invenio admin 2. Detailed instructions for the patient CDS Invenio admin 0. Prerequisites ================ Here is the software you need to have around before you start installing CDS Invenio: a) Unix-like operating system. The main development and production platform for CDS Invenio at CERN is Debian GNU/Linux, but we actively develop also on FreeBSD and Mac OS X. Basically any Unix system supporting the software listed below should do. Note that if you are using Debian "Sarge" GNU/Linux, you can install most of the below-mentioned prerequisites and recommendations by running: $ sudo apt-get install libapache2-mod-python2.3 \ apache2-mpm-prefork mysql-server-4.1 mysql-client-4.1 \ python2.3-mysqldb python2.3-numeric python2.3-4suite \ python2.3-xml python2.3-libxml2 python2.3-libxslt1 \ rxp wml gnuplot xpdf-utils gs-common antiword catdoc \ wv html2text ppthtml xlhtml clisp gettext You can also install the following packages: $ sudo apt-get install python2.3-psyco sbcl cmucl The last three packages are not available on all Debian "Sarge" GNU/Linux architectures (e.g. not on AMD64), but they are only recommended so you can safely continue without them. Note that the web application server should run a Message Transfer Agent (MTA) such as Postfix so that CDS Invenio can email notification alerts or registration information to the end users, contact moderators and reviewers of submitted documents, inform administrators about various runtime system information, etc. 
b) MySQL server (may be on a remote machine), and MySQL client (must be available locally too). MySQL versions 4.1 or 5.0 are recommended, because some parts of the CDS Invenio code use SQL features not present in MySQL 4.0. Please set the variable ``max_allowed_packet'' in your ``my.cnf'' init file to at least 4M. c) Apache 2 server, with support for loading DSO modules, and optionally with SSL support for HTTPS-secure user authentication. Tested mainly with version 2.0.43 and above. Apache 2.x is required for the mod_python module (see below). d) Python v2.3 or above: as well as the following Python modules: - - (mandatory) MySQLdb + - (mandatory) MySQLdb (do *not* use 1.2.1_p2; see below) - (mandatory) Numeric module (v21 and above): - (recommended) PyStemmer, for indexing and ranking: - (recommended) PyXML, for XML processing: - (recommended) PyRXP, for very fast XML MARC processing: - (recommended) libxml2-python, for XML/XLST processing: - (recommended) Gnuplot.Py, for producing graphs: - (optional) 4suite, slower alternative to PyRXP and libxml2-python: - (optional) Psyco, to speed up the code at places: - (optional) RDFLib, to use RDF ontologies and thesauri: - - (optional) mechanize, to run regression web tests: + - (optional) mechanize, to run regression web test suite: Note: If you happen to use MySQLdb 1.2.1_p2, please apply the following patch: . e) mod_python Apache module. Tested mainly with versions 3.0BETA4 and above. mod_python 3.x is required for Apache 2. Previous versions (as well as Apache 1 ones) exhibited some problems with MySQL connectivity in our experience. f) WML - Website META Language. Tested mainly with versions 2.0.8 and 2.0.9. Note that on Red Hat Linux 9 the WML 2.0.9 compiled with Perl 5.8.0 exhibits problems, so you better use downgraded/upgraded Perl for compiling WML on that platform. g) If you want to be able to extract references from PDF fulltext files, then you need to install pdftotext version 3 at least. 
h) If you want to be able to search for words in the fulltext files (i.e. to have fulltext indexing), then you need as well to install some of the following tools: - for PDF files: pdftotext or pstotext - for PostScript files: pstotext or ps2ascii - for MS Word files: antiword, catdoc, or wvText - for MS PowerPoint files: pptHtml and html2text - for MS Excel files: xlhtml and html2text i) If you have chosen to install fast XML MARC Python processors in the step d) above, then you have to install the parsers themselves: - (optional) RXP: - (optional) 4suite: j) (recommended) Gnuplot, the command-line driven interactive plotting program. It is used to display download and citation history graphs on the Detailed record pages on the web interface. Note that Gnuplot is not required, only recommended. k) (recommended) A Common Lisp implementation, such as CLISP, SBCL or CMUCL. It is used for the web server log analysing tool and the metadata checking program. Note that any of the three implementations CLISP, SBCL, or CMUCL will do. CMUCL produces fastest machine code, but it does not support UTF-8 yet. Pick up CLISP if you don't know what to do. Note that a Common Lisp implementation is not required, only recommended. l) GNU Gettext, a set of tools that makes it possible to translate the application in multiple languages. This is available by default on many systems. Note that the configure script checks whether you have all the prerequisite software installed and that it won't let you continue unless everything is in order. It also warns you if it cannot find some optional but recommended software. 1. 
Quick instructions for the impatient CDS Invenio admin ========================================================= $ cd /usr/local/src/ $ wget http://cdsware.cern.ch/download/cds-invenio-0.92.0.tar.gz $ wget http://cdsware.cern.ch/download/cds-invenio-0.92.0.tar.gz.md5 $ wget http://cdsware.cern.ch/download/cds-invenio-0.92.0.tar.gz.sig $ md5sum -v -c cds-invenio-0.92.0.tar.gz.md5 $ gpg --verify cds-invenio-0.92.0.tar.gz.sig cds-invenio-0.92.0.tar.gz $ tar xvfz cds-invenio-0.92.0.tar.gz $ cd cds-invenio-0.92.0 $ ./configure --prefix=/opt/cds-invenio \ --with-weburl=http://webserver.domain.com \ --with-sweburl=https://webserver.domain.com \ --with-dbhost=sqlserver.domain.com \ --with-dbname=cdsinvenio \ --with-dbuser=cdsinvenio \ --with-dbpass=my123pass \ --with-python=/opt/python/bin/python2.3 $ vi ./config/config.wml ## optional, but strongly recommended $ make - $ mysql -h sqlserver.domain.com -u root -p mysql - mysql> CREATE DATABASE cdsinvenio; - mysql> GRANT ALL PRIVILEGES ON cdsinvenio.* TO cdsinvenio@webserver.domain.com IDENTIFIED BY 'myp1ss'; + $ make install $ sudo vi /path/to/apache/conf/httpd.conf ## see below in part 2 $ sudo /path/to/apache/bin/apachectl graceful - $ make create-tables ## optional - $ sudo ln -s /opt/cds-invenio/lib/python/invenio \ - /usr/local/lib/python2.3/site-packages/invenio \ - ## optional - $ make install - $ make test ## optional - $ sudo chown -R www-data /opt/cds-invenio/var + $ sudo chgrp -R www-data /opt/cds-invenio + $ sudo chmod -R g+r /opt/cds-invenio + $ sudo chmod -R g+rw /opt/cds-invenio/var + $ sudo find /opt/cds-invenio -type d -exec chmod g+rxw {} \; $ make create-demo-site ## optional $ make load-demo-records ## optional + $ make test ## optional $ make regression-test ## optional $ make remove-demo-records ## optional $ make drop-demo-site ## optional $ firefox http://webserver.domain.com/admin/ ## optional 2. 
Detailed instructions for the patient CDS Invenio admin ========================================================== The CDS Invenio uses standard GNU autoconf method to build and install its files. This means that you proceed as follows: $ cd /usr/local/src/ Change to a directory where we will configure and build the CDS Invenio. (The built files will be installed into different "target" directories later.) $ wget http://cdsware.cern.ch/download/cds-invenio-0.92.0.tar.gz $ wget http://cdsware.cern.ch/download/cds-invenio-0.92.0.tar.gz.md5 $ wget http://cdsware.cern.ch/download/cds-invenio-0.92.0.tar.gz.sig Fetch CDS Invenio source tarball from the CDS Software Consortium distribution server, together with MD5 checksum and GnuPG cryptographic signature files useful for verifying the integrity of the tarball. $ md5sum -v -c cds-invenio-0.92.0.tar.gz.md5 Verify MD5 checksum. $ gpg --verify cds-invenio-0.92.0.tar.gz.sig cds-invenio-0.92.0.tar.gz Verify GnuPG cryptographic signature. Note that you may first have to import my public key into your keyring, if you haven't done that already: $ gpg --keyserver wwwkeys.pgp.net --recv-keys 0xBA5A2B67 The output of the gpg --verify command should then read: Good signature from "Tibor Simko " You can safely ignore any trusted signature certification warning that may follow after the signature has been successfully verified. $ tar xvfz cds-invenio-0.92.0.tar.gz Untar the distribution tarball. $ cd cds-invenio-0.92.0 Go to the source directory. 
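The checksum step above can also be sketched in Python, for example on a machine without an `md5sum` binary. This is only an illustration under stated assumptions: the `.md5` file is assumed to follow the usual `md5sum` layout (a hex digest followed by the file name), file names are resolved relative to the checksum file's own directory, and the helper names `md5_of_file` and `verify_md5_file` are hypothetical, not part of CDS Invenio. An MD5 match only guards against accidental corruption of the download; the GnuPG signature check is what ties the tarball to its signer.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5_file(checksum_path):
    """Check every "<digest>  <filename>" line of an md5sum-style file.

    Assumption: file names are resolved relative to the directory that
    holds the checksum file. Returns a dict mapping each file name to
    True (digest matches) or False (digest differs).
    """
    base_dir = os.path.dirname(os.path.abspath(checksum_path))
    results = {}
    with open(checksum_path, "r") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            expected, name = line.split(None, 1)
            name = name.lstrip("*")  # md5sum marks binary mode with "*"
            actual = md5_of_file(os.path.join(base_dir, name))
            results[name] = (actual == expected.lower())
    return results


# Hypothetical usage, with the tarball and its .md5 file side by side:
# verify_md5_file("cds-invenio-0.92.0.tar.gz.md5")
```

Reading the file in chunks keeps memory use flat even for large tarballs.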
$ ./configure --prefix=/opt/cds-invenio \ --with-weburl=http://webserver.domain.com \ --with-sweburl=https://webserver.domain.com \ --with-dbhost=sqlserver.domain.com \ --with-dbname=cdsinvenio \ --with-dbuser=cdsinvenio \ --with-dbpass=myp1ss \ --with-python=/opt/python/bin/python2.3 Configure essential CDS Invenio parameters, with the following signification: --prefix=/opt/cds-invenio CDS Invenio general installation directory, used to hold command-line binaries and program libraries containing the core CDS Invenio functionality, but also to store web pages, runtime log and cache information, etc. Several subdirs like `bin', `lib', and `var' will be created inside the --prefix directory to this effect. Note that the --prefix directory should be chosen outside of the Apache htdocs tree, since only one its subdirectory (prefix/var/www) is to be accessible directly via the Web (see below). --with-weburl=http://webserver.domain.com The URL denoting the home URL of your CDS Invenio installation. The files served by this URL will be located in `prefix/var/www', so later on in your Apache config file you would map `weburl' to `prefix/var/www' (see below). --with-sweburl=https://webserver.domain.com The URL denoting the HTTPS-secure equivalent of the home URL. The secure home URL will be used for personalization pages, such as user login and registration page. You must run SSL-enabled Apache in order to use this feature. If you don't run SSL-enabled Apache, then the user authentication will be done via standard HTTP protocol, user credentials travelling in clear text across the net. The --with-sweburl option is optional. --with-dbhost=sqlserver.domain.com --with-dbname=cdsinvenio --with-dbuser=cdsinvenio --with-dbpass=myp1ss The database server host, the database name, and the database user credentials. --with-python=/opt/python/bin/python2.3 Optionally, specify a path to some specific Python binary. 
This is useful if you have more than one Python installation on your system. If you don't set this option, then the first Python that will be found in your PATH will be chosen for running CDS Invenio. CDS Invenio won't install to any other directory but to the one mentioned in this configuration line. Do not use trailing slashes when specifying any of the above values. This configuration step is mandatory. Usually, you do this step only once. Note that if you prefer to build CDS Invenio out of its source tree, you may run the above configure command like this: (mkdir build && cd build && ../configure --prefix=...). $ vi ./config/config.wml ## optional, but strongly recommended Optionally, customize your CDS Invenio installation. We strongly recommend you to edit at least the top of this file where you can define some very essential CDS Invenio parameters like the name of your CDS Invenio document server (look for CDSNAME and CDSNAMEINTL) or the email address of the local CDS Invenio administrator (look for SUPPORTEMAIL and ADMINEMAIL). The latter is needed if you want to use administration modules, and you will certainly do. The rest of the "config.wml" file enables you to change the CDS Invenio web page look and feel, and otherwise to influence its behaviour and default parameters. CDS Invenio HTML pages will be pre-generated using the values set in the config.wml file. This configuration step is optional, but strongly recommended. If you change some values in config.wml, you should restart the installation from here. $ make Launch the CDS Invenio build. Since many messages are printed during the build process, you may want to run it in a fast-scrolling terminal such as rxvt or in a detached screen session. During this step all the pages and scripts will be pre-created and customized based on the config you have edited in the previous step. 
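The configure step described earlier asks that none of its values end with a trailing slash. A small pre-flight check along the following lines could catch that before running configure. It is a hypothetical helper, not something shipped with CDS Invenio; the function name `values_with_trailing_slash` is an assumption made for illustration, and the option values are the example ones used in this document.

```python
def values_with_trailing_slash(options):
    """Return the sorted names of options whose value ends with "/"."""
    return sorted(name for name, value in options.items()
                  if value.endswith("/"))


# Example values taken from the configure line in this document.
configure_options = {
    "--prefix": "/opt/cds-invenio",
    "--with-weburl": "http://webserver.domain.com",
    "--with-sweburl": "https://webserver.domain.com",
    "--with-python": "/opt/python/bin/python2.3",
}

offenders = values_with_trailing_slash(configure_options)
if offenders:
    print("Remove the trailing slash from: " + ", ".join(offenders))
else:
    print("No trailing slashes found.")
```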
- Before proceeding further with the CDS Invenio installation, we - have to do some admin-level tasks on the MySQL and Apache - servers. - - $ mysql -h sqlserver.domain.com -u root -p mysql - mysql> CREATE DATABASE cdsinvenio; - mysql> GRANT ALL PRIVILEGES ON cdsinvenio.* TO cdsinvenio@webserver.domain.com IDENTIFIED BY 'myp1ss'; - - You need to create a dedicated database on your MySQL server - that the CDS Invenio can use for its purposes. Please - contact your MySQL administrator and ask him to execute the - above commands that will create the "cdsinvenio" database, a - user called "cdsinvenio" with password "myp1ss", and that - will grant all rights on the "cdsinvenio" database to the - "cdsinvenio" user. The credential values are the ones you - have chosen in the configure line above. + $ make install - $ sudo vi /path/to/apache/conf/httpd.conf ## see below in part 2 + Install the web pages, scripts, utilities and everything + needed for runtime into the respective directories, as + specified earlier by the configure command. + + Note that the "make install" step checks for potential + problems such as the database connection rights and may ask + you to perform some more administrative steps in case it + detects a problem. Notably, it may ask you to: + + a) Set up database access permissions, e.g.: + + $ mysql -h sqlserver.domain.com -u root -p mysql + mysql> CREATE DATABASE cdsinvenio; + mysql> GRANT ALL PRIVILEGES ON cdsinvenio.* \ + TO cdsinvenio@webserver.domain.com \ + IDENTIFIED BY 'myp1ss'; + + If you are installing CDS Invenio for the first time, + you have to create a dedicated database on your MySQL + server that the CDS Invenio can use for its purposes. + Please contact your MySQL administrator and ask him + to execute the above commands that create the + "cdsinvenio" database, a user called "cdsinvenio" + with password "myp1ss", and that will grant all + rights on the "cdsinvenio" database to the + "cdsinvenio" user. 
The credential values should be + the ones you have chosen during the configure command + executed before. + + Note that the "make install" process will suggest + the exact command to use based on the values you have + used in the configure line. + + b) Set up database tables: + + $ make create-tables + + If you are installing CDS Invenio for the first time, + you have to create the database tables by launching + the above command. + + c) Set up symbolic Python "invenio" module link: + + $ sudo ln -s /opt/cds-invenio/lib/python/invenio \ + /usr/local/lib/python2.3/site-packages/invenio + + If you are installing CDS Invenio for the first time, + you have to create a symbolic link from Python's + site-packages directory that would indicate to Python + where to find CDS Invenio's Python files. + + Note that the exact symlink target location depends + on the --prefix location (prefix/lib/python/invenio) + and the exact location of the link itself depends on the + Python version you are using. + + Note that the "make install" process will suggest + the exact command to use based on the values you have + used in the configure line. + + At this point you should have successfully completed the + "make install" process. We continue by setting up the + Apache web server. + + $ sudo vi /path/to/apache/conf/httpd.conf Please ask your webserver administrator to put the following line in your "httpd.conf" configuration file: AddDefaultCharset UTF-8 This is to ensure that the browsers will get UTF-8 as the default page encoding. As mentioned above, the web pages will get installed into the `prefix/var/www' directory. 
Therefore you should specify something along the lines of: ServerSignature Off ServerTokens Prod NameVirtualHost *:80 ServerName webserver.domain.com ServerAdmin cds.support@cern.ch DocumentRoot /opt/cds-invenio/var/www Options Indexes FollowSymLinks MultiViews AllowOverride None Order allow,deny allow from all ErrorLog /opt/cds-invenio/var/log/apache.err LogLevel warn CustomLog /opt/cds-invenio/var/log/apache.log combined DirectoryIndex index.en.html index.html SetHandler python-program PythonHandler invenio.webinterface_layout PythonDebug On AddHandler python-program .py PythonHandler mod_python.publisher PythonDebug On This will tell Apache where to find the files, how to interpret .py files, which files to serve as indexes, etc. If you have configured the system to use secure URL for login (see above), then you have to specify secure site too, such as: ServerSignature Off ServerTokens Prod NameVirtualHost *:443 SSLCertificateFile /etc/apache2/ssl/apache.pem ServerName webserver.domain.com ServerAdmin cds.support@cern.ch SSLEngine on DocumentRoot /opt/cds-invenio/var/www Options Indexes FollowSymLinks MultiViews AllowOverride None Order allow,deny allow from all ErrorLog /opt/cds-invenio/var/log/apache-ssl.err LogLevel warn CustomLog /opt/cds-invenio/var/log/apache-ssl.log combined DirectoryIndex index.en.html index.html SetHandler python-program PythonHandler invenio.webinterface_layout PythonDebug On AddHandler python-program .py PythonHandler mod_python.publisher PythonDebug On $ sudo /path/to/apache/bin/apachectl graceful Please ask your webserver administrator to restart the Apache server after the above "httpd.conf" changes. - After these admin-level tasks to be performed as root, let's - now go back to finish the installation of the CDS Invenio - package. - - $ make create-tables ## optional - - If you are installing for the first time, you have to create - CDS Invenio tables in the database. 
- - Note that the `make install' process will warn you in case - the tables were not created and will ask you to run this - step manually before completing the make install process. - - $ sudo ln -s /opt/cds-invenio/lib/python/invenio \ - /usr/local/lib/python2.3/site-packages/invenio \ - ## optional - - If you are installing for the first time, you will have to - create a symbolic link from Python's site-packages directory - that would indicate to Python where to find CDS Invenio's - Python files. + $ sudo chgrp -R www-data /opt/cds-invenio + $ sudo chmod -R g+r /opt/cds-invenio + $ sudo chmod -R g+rw /opt/cds-invenio/var + $ sudo find /opt/cds-invenio -type d -exec chmod g+rxw {} \; - Note that the exact symlink target location depends on the - --prefix location (prefix/lib/python/invenio) and the exact - symlink source location depends on the Python version you - are using. (See also --with-python configuration option.) + One more superuser step, because we need to enable Apache + server to read files from the installation place and to + write some log information and to cache interesting entities + inside the "var" subdirectory of our CDS Invenio + installation directory. - Note that the `make install' process will warn you in case - the symbolic link was not created and it will indicate you - the command to use to create it manually before completing - the make install process. - - $ make install - - Install the web pages, scripts, utilities and everything - needed for runtime into the respective directories, as - specified earlier by the configure command. - - After this step, you should be able to point your browser to - the chosen URL of your local CDS Invenio installation and see it - running! - - $ make test - - Optionally, you can run the unit test suite to verify the - unit behaviour of your local CDS Invenio installation. Note - that this command should be run only after you have - installed the whole system via `make install'. 
- - $ sudo chown -R www-data /opt/cds-invenio/var - - One more superuser step, as we need to enable Apache server - to write some log information and to cache interesting - entities inside the "var" subdirectory of our CDS Invenio - general installation directory. - - Here we assume that your Apache server processes are run + Here we assumed that your Apache server processes are run under "www-data" group. Change this appropriately for your system. + After these admin-level tasks to be performed as root, let's + now go back to finish the installation of the CDS Invenio. + $ make create-demo-site ## optional This step is recommended to test your local CDS Invenio installation. It should give you our "Atlantis Institute of Science" demo installation, exactly as you see it at - . + . $ make load-demo-records ## optional Optionally, load some demo records to be able to test - indexing and searching of your local demo CDS Invenio + indexing and searching of your local CDS Invenio demo installation. + $ make test ## optional + + Optionally, you can run the unit test suite to verify the + unit behaviour of your local CDS Invenio installation. Note + that this command should be run only after you have + installed the whole system via `make install'. + $ make regression-test ## optional Optionally, you can run the full regression test suite to verify the functional behaviour of your local CDS Invenio installation. Note that this command requires to have created the demo site and loaded the demo records. Note also that running the regression test suite may alter the database content with junk data, so that rebuilding the - demo site is stongly recommended afterwards. + demo site is strongly recommended afterwards. $ make remove-demo-records ## optional Optionally, remove the demo records loaded in the previous - step but otherwise keep the demo collection, submit, format - etc configurations that you may reuse and modify for - production purposes. 
+ step, but otherwise keep the demo collection, submission, + format, and other configurations that you may reuse and + modify for your own production purposes. $ make drop-demo-site ## optional Optionally, drop also all the demo configuration so that - you'll have a blank CDS Invenio system for your production - purposes. + you'll end up with a completely blank CDS Invenio system for + your own production purposes. $ firefox http://webserver.domain.com/admin/ ## optional Optionally, do further runtime configurations of CDS Invenio, like definition of data collections, document types, document formats, word file tables, etc. This configuration step is optional but you will most - probably want to define specific data collections to your - setup, or to configure submit pages for fill your + probably want to define specific data collections suited to + your setup, or to configure submit pages to fill your collections, to define new output formats, etc. This configuration step uses MySQL configuration tables and - can be done anytime during the production of your system, - unlike the configure-time or compile-time configurations - presented above. + can be done anytime during the production of your system. Good luck, and thanks for choosing CDS Invenio. - CDS Development Group