diff --git a/INSTALL b/INSTALL index ba8613ed4..1b6700bd1 100644 --- a/INSTALL +++ b/INSTALL @@ -1,847 +1,842 @@ Invenio INSTALLATION ==================== About ===== This document specifies how to build, customize, and install Invenio v1.1.2 for the first time. See RELEASE-NOTES if you are upgrading from a previous Invenio release. Contents ======== 0. Prerequisites 1. Quick instructions for the impatient Invenio admin 2. Detailed instructions for the patient Invenio admin 0. Prerequisites ================ Here is the software you need to have around before you start installing Invenio: a) Unix-like operating system. The main development and production platforms for Invenio at CERN are GNU/Linux distributions Debian, Gentoo, Scientific Linux (aka RHEL), Ubuntu, but we also develop on Mac OS X. Basically any Unix system supporting the software listed below should do. If you are using Debian GNU/Linux ``Lenny'' or later, then you can install most of the below-mentioned prerequisites and recommendations by running: $ sudo aptitude install python-dev apache2-mpm-prefork \ mysql-server mysql-client python-mysqldb \ python-4suite-xml python-simplejson python-xml \ python-libxml2 python-libxslt1 gnuplot poppler-utils \ gs-common clisp gettext libapache2-mod-wsgi unzip \ python-dateutil python-rdflib python-pyparsing \ python-gnuplot python-magic pdftk html2text giflib-tools \ pstotext netpbm python-pypdf python-chardet python-lxml \ python-unidecode You may also want to install some of the following packages, if you have them available on your concrete architecture: $ sudo aptitude install sbcl cmucl pylint pychecker pyflakes \ python-profiler python-epydoc libapache2-mod-xsendfile \ openoffice.org python-utidylib python-beautifulsoup + (Note that if you use pip to manage your Python dependencies + instead of operating system packages, please see the section + (d) below on how to use pip instead of aptitude.) 
+ Moreover, you should install some Message Transfer Agent (MTA) such as Postfix so that Invenio can email notification alerts or registration information to the end users, contact moderators and reviewers of submitted documents, inform administrators about various runtime system information, etc: $ sudo aptitude install postfix After running the above-quoted aptitude command(s), you can proceed to configuring your MySQL server instance (max_allowed_packet in my.cnf, see item 0b below) and then to installing the Invenio software package in the section 1 below. If you are using another operating system, then please continue reading the rest of this prerequisites section, and please consult our wiki pages for any concrete hints for your specific operating system. b) MySQL server (may be on a remote machine), and MySQL client (must be available locally too). MySQL versions 4.1 or 5.0 are supported. Please set the variable "max_allowed_packet" in your "my.cnf" init file to at least 4M. (For sites such as INSPIRE, having 1M records with 10M citer-citee pairs in its citation map, you may need to increase max_allowed_packet to 1G.) You may perhaps also want to run your MySQL server natively in UTF-8 mode by setting "default-character-set=utf8" in various parts of your "my.cnf" file, such as in the "[mysql]" part and elsewhere; but this is not really required. c) Apache 2 server, with support for loading DSO modules, and optionally with SSL support for HTTPS-secure user authentication, and mod_xsendfile for off-loading file downloads away from Invenio processes to Apache. 
- d) Python v2.4 or above: + d) Python v2.6 or above: as well as the following Python modules: - (mandatory) MySQLdb (version >= 1.2.1_p2; see below) - (mandatory) Pyparsing, for document parsing - (recommended) python-dateutil, for complex date processing: - (recommended) PyXML, for XML processing: - (recommended) PyRXP, for very fast XML MARC processing: - (recommended) lxml, for XML/XSLT processing: - (recommended) libxml2-python, for XML/XSLT processing: - - (recommended) simplejson, for AJAX apps: - - Note that if you are using Python-2.6, you don't need to - install simplejson, because the module is already included - in the main Python distribution. - (recommended) Gnuplot.Py, for producing graphs: - (recommended) Snowball Stemmer, for stemming: - (recommended) py-editdist, for record merging: - (recommended) numpy, for citerank methods: - (recommended) magic, for full-text file handling: - (optional) chardet, for character encoding detection: - (optional) 4suite, slower alternative to PyRXP and libxml2-python: - (optional) feedparser, for web journal creation: - (optional) RDFLib, to use RDF ontologies and thesauri: - (optional) mechanize, to run regression web test suite: - (optional) python-mock, mocking library for the test suite: - - (optional) hashlib, needed only for Python-2.4 and only - if you would like to use AWS connectivity: - - (optional) utidylib, for HTML washing: - (optional) Beautiful Soup, for HTML washing: - (optional) Python Twitter (and its dependencies) if you want to use the Twitter Fetcher bibtasklet: - (optional) Python OpenID if you want to enable OpenID support for authentication: - (optional) Python Rauth if you want to enable OAuth 1.0/2.0 support for authentication (depends on Python-2.6 or later): - (optional) unidecode, for ASCII representation of Unicode text: - Note: MySQLdb version 1.2.1_p2 or higher is recommended. If - you are using an older version of MySQLdb, you may get - into problems with character encoding.
+ Note that if you are using pip to install and manage your + Python dependencies, then you can run: + + $ sudo pip install -r requirements.txt + $ sudo pip install -r requirements-extras.txt + + to install all mandatory, recommended, and optional packages + mentioned above. e) mod_wsgi Apache module. Versions 3.x and above are recommended. - Note: if you are using Python 2.4 or earlier, then you should - also install the wsgiref Python module, available from: - (As of Python 2.5 - this module is included in standard Python - distribution.) - f) If you want to be able to extract references from PDF fulltext files, then you need to install pdftotext version 3 at least. g) If you want to be able to search for words in the fulltext files (i.e. to have fulltext indexing) or to stamp submitted files, then you need as well to install some of the following tools: - for Microsoft Office/OpenOffice.org document conversion: OpenOffice.org - for PDF file stamping: pdftk, pdf2ps - for PDF files: pdftotext or pstotext - for PostScript files: pstotext or ps2ascii - for DjVu creation, elaboration: DjVuLibre - to perform OCR: OCRopus (tested only with release 0.3.1) - to perform different image elaborations: ImageMagick - to generate PDF after OCR: netpbm, ReportLab and pyPdf or pyPdf2 h) If you have chosen to install fast XML MARC Python processors in the step d) above, then you have to install the parsers themselves: - (optional) 4suite: i) (recommended) Gnuplot, the command-line driven interactive plotting program. It is used to display download and citation history graphs on the Detailed record pages on the web interface. Note that Gnuplot must be compiled with PNG output support, that is, with the GD library. Note also that Gnuplot is not required, only recommended. j) (recommended) A Common Lisp implementation, such as CLISP, SBCL or CMUCL. It is used for the web server log analysing tool and the metadata checking program.
Note that any of the three implementations CLISP, SBCL, or CMUCL will do. CMUCL produces fastest machine code, but it does not support UTF-8 yet. Pick up CLISP if you don't know what to do. Note that a Common Lisp implementation is not required, only recommended. k) GNU gettext, a set of tools that makes it possible to translate the application in multiple languages. This is available by default on many systems. l) (recommended) xlwt 0.7.2, Library to create spreadsheet files compatible with MS Excel 97/2000/XP/2003 XLS files, on any platform, with Python 2.3 to 2.6 m) (recommended) matplotlib 1.0.0 is a python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. matplotlib can be used in python scripts, the python and ipython shell (ala MATLAB® or Mathematica®), web application servers, and six graphical user interface toolkits. It is used to generate pie graphs in the custom summary query (WebStat) n) (optional) FFmpeg, an open-source collection of tools and libraries to convert video and audio files. It makes use of both internal as well as external libraries to generate videos for the web, such as Theora, WebM and H.264 out of almost any thinkable video input. FFmpeg is needed to run video related modules and submission workflows in Invenio. The minimal configuration of ffmpeg for the Invenio demo site requires a number of external libraries. It is highly recommended to remove all installed versions and packages that are coming with various Linux distributions and install the latest versions from sources. Additionally, you will need the Mediainfo Library for multimedia metadata handling.
Minimum libraries for the demo site: - the ffmpeg multimedia encoder tools - a library for jpeg images needed for thumbnail extraction - a library for the ogg container format, needed for Vorbis and Theora - the OGG Vorbis audio codec library - the OGG Theora video codec library - the WebM video codec library - the mediainfo library for multimedia metadata Recommended for H.264 video (!be aware of licensing issues!): - a library for H.264 video encoding - a library for Advanced Audio Coding - a library for MP3 encoding Note that the configure script checks whether you have all the prerequisite software installed and that it won't let you continue unless everything is in order. It also warns you if it cannot find some optional but recommended software. 1. Quick instructions for the impatient Invenio admin ========================================================= 1a. Installation ---------------- $ cd $HOME/src/ $ wget http://invenio-software.org/download/invenio-1.1.2.tar.gz $ wget http://invenio-software.org/download/invenio-1.1.2.tar.gz.md5 $ wget http://invenio-software.org/download/invenio-1.1.2.tar.gz.sig $ md5sum -c invenio-1.1.2.tar.gz.md5 $ gpg --verify invenio-1.1.2.tar.gz.sig invenio-1.1.2.tar.gz $ tar xvfz invenio-1.1.2.tar.gz $ cd invenio-1.1.2 $ ./configure $ make $ make install $ make install-mathjax-plugin ## optional $ make install-jquery-plugins ## optional $ make install-ckeditor-plugin ## optional $ make install-pdfa-helper-files ## optional $ make install-mediaelement ## optional $ make install-solrutils ## optional $ make install-js-test-driver ## optional 1b.
Configuration ----------------- $ sudo chown -R www-data.www-data /opt/invenio $ sudo -u www-data emacs /opt/invenio/etc/invenio-local.conf $ sudo -u www-data /opt/invenio/bin/inveniocfg --update-all $ sudo -u www-data /opt/invenio/bin/inveniocfg --create-tables $ sudo -u www-data /opt/invenio/bin/inveniocfg --load-bibfield-conf $ sudo -u www-data /opt/invenio/bin/inveniocfg --load-webstat-conf $ sudo -u www-data /opt/invenio/bin/inveniocfg --create-apache-conf $ sudo /etc/init.d/apache2 restart $ sudo -u www-data /opt/invenio/bin/inveniocfg --check-openoffice $ sudo -u www-data /opt/invenio/bin/inveniocfg --create-demo-site $ sudo -u www-data /opt/invenio/bin/inveniocfg --load-demo-records $ sudo -u www-data /opt/invenio/bin/inveniocfg --run-unit-tests $ sudo -u www-data /opt/invenio/bin/inveniocfg --run-regression-tests $ sudo -u www-data /opt/invenio/bin/inveniocfg --run-web-tests $ sudo -u www-data /opt/invenio/bin/inveniocfg --remove-demo-records $ sudo -u www-data /opt/invenio/bin/inveniocfg --drop-demo-site $ firefox http://your.site.com/help/admin/howto-run 2. Detailed instructions for the patient Invenio admin ========================================================== 2a. Installation ---------------- The Invenio uses standard GNU autoconf method to build and install its files. This means that you proceed as follows: $ cd $HOME/src/ Change to a directory where we will build the Invenio sources. (The built files will be installed into different "target" directories later.) $ wget http://invenio-software.org/download/invenio-1.1.2.tar.gz $ wget http://invenio-software.org/download/invenio-1.1.2.tar.gz.md5 $ wget http://invenio-software.org/download/invenio-1.1.2.tar.gz.sig Fetch Invenio source tarball from the distribution server, together with MD5 checksum and GnuPG cryptographic signature files useful for verifying the integrity of the tarball. $ md5sum -c invenio-1.1.2.tar.gz.md5 Verify MD5 checksum. 
$ gpg --verify invenio-1.1.2.tar.gz.sig invenio-1.1.2.tar.gz Verify GnuPG cryptographic signature. Note that you may first have to import my public key into your keyring, if you haven't done that already: $ gpg --keyserver wwwkeys.eu.pgp.net --recv-keys 0xBA5A2B67 The output of the gpg --verify command should then read: Good signature from "Tibor Simko " You can safely ignore any trusted signature certification warning that may follow after the signature has been successfully verified. $ tar xvfz invenio-1.1.2.tar.gz Untar the distribution tarball. $ cd invenio-1.1.2 Go to the source directory. $ ./configure Configure Invenio software for building on this specific platform. You can use the following optional parameters: --prefix=/opt/invenio Optionally, specify the Invenio general installation directory (default is /opt/invenio). It will contain command-line binaries and program libraries containing the core Invenio functionality, but also store web pages, runtime log and cache information, document data files, etc. Several subdirs like `bin', `etc', `lib', or `var' will be created inside the prefix directory to this effect. Note that the prefix directory should be chosen outside of the Apache htdocs tree, since only one its subdirectory (prefix/var/www) is to be accessible directly via the Web (see below). Note that Invenio won't install to any other directory but to the prefix mentioned in this configuration line. - --with-python=/opt/python/bin/python2.4 + --with-python=/opt/python/bin/python2.7 Optionally, specify a path to some specific Python binary. This is useful if you have more than one Python installation on your system. If you don't set this option, then the first Python that will be found in your PATH will be chosen for running Invenio. --with-mysql=/opt/mysql/bin/mysql Optionally, specify a path to some specific MySQL client binary. This is useful if you have more than one MySQL installation on your system. 
If you don't set this option, then the first MySQL client executable that will be found in your PATH will be chosen for running Invenio. --with-clisp=/opt/clisp/bin/clisp Optionally, specify a path to CLISP executable. This is useful if you have more than one CLISP installation on your system. If you don't set this option, then the first executable that will be found in your PATH will be chosen for running Invenio. --with-cmucl=/opt/cmucl/bin/lisp Optionally, specify a path to CMUCL executable. This is useful if you have more than one CMUCL installation on your system. If you don't set this option, then the first executable that will be found in your PATH will be chosen for running Invenio. --with-sbcl=/opt/sbcl/bin/sbcl Optionally, specify a path to SBCL executable. This is useful if you have more than one SBCL installation on your system. If you don't set this option, then the first executable that will be found in your PATH will be chosen for running Invenio. --with-openoffice-python Optionally, specify the path to the Python interpreter embedded with OpenOffice.org. This is normally not contained in the normal path. If you don't specify this it won't be possible to use OpenOffice.org to convert from and to Microsoft Office and OpenOffice.org documents. This configuration step is mandatory. Usually, you do this step only once. (Note that if you are building Invenio not from a released tarball, but from the Git sources, then you have to generate the configure file via autotools: $ sudo aptitude install automake1.9 autoconf $ aclocal-1.9 $ automake-1.9 -a $ autoconf after which you proceed with the usual configure command.) $ make Launch the Invenio build. Since many messages are printed during the build process, you may want to run it in a fast-scrolling terminal such as rxvt or in a detached screen session. During this step all the pages and scripts will be pre-created and customized based on the config you have edited in the previous step. 
Note that on systems such as FreeBSD or Mac OS X you have to use GNU make ("gmake") instead of "make". $ make install Install the web pages, scripts, utilities and everything needed for Invenio runtime into respective installation directories, as specified earlier by the configure command. Note that if you are installing Invenio for the first time, you will be asked to create symbolic link(s) from Python's site-packages system-wide directory(ies) to the installation location. This is in order to instruct Python where to find Invenio's Python files. You will be hinted as to the exact command to use based on the parameters you have used in the configure command. $ make install-mathjax-plugin ## optional This will automatically download and install in the proper place MathJax, a JavaScript library to render LaTeX formulas in the client browser. Note that in order to enable the rendering you will have to set the variable CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS in invenio-local.conf to a suitable list of output format codes. For example: CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS = hd,hb $ make install-jquery-plugins ## optional This will automatically download and install in the proper place jQuery and related plugins. They are used for AJAX applications such as the record editor. Note that `unzip' is needed when installing jquery plugins. $ make install-ckeditor-plugin ## optional This will automatically download and install in the proper place CKeditor, a WYSIWYG Javascript-based editor (e.g. for the WebComment module). Note that in order to enable the editor you have to set the CFG_WEBCOMMENT_USE_RICH_EDITOR to True. $ make install-pdfa-helper-files ## optional This will automatically download and install in the proper place the helper files needed to create PDF/A files out of existing PDF files. $ make install-mediaelement ## optional This will automatically download and install the MediaElementJS HTML5 video player that is needed for videos on the DEMO site. 
$ make install-solrutils ## optional This will automatically download and install a Solr instance which can be used for full-text searching. See CFG_SOLR_URL variable in the invenio.conf. Note that the admin later has to take care of running init.d scripts which would start the Solr instance automatically. $ make install-js-test-driver ## optional This will automatically download and install JsTestDriver which is needed to run JS unit tests. Recommended for developers. 2b. Configuration ----------------- Once the basic software installation is done, we proceed to configuring your Invenio system. $ sudo chown -R www-data.www-data /opt/invenio For the sake of simplicity, let us assume that your Invenio installation will run under the `www-data' user process identity. The above command changes ownership of installed files to www-data, so that we shall run everything under this user identity from now on. For production purposes, you would typically enable Apache server to read all files from the installation place but to write only to the `var' subdirectory of your installation place. You could achieve this by configuring Unix directory group permissions, for example. $ sudo -u www-data emacs /opt/invenio/etc/invenio-local.conf Customize your Invenio installation. Please read the 'invenio.conf' file located in the same directory that contains the vanilla default configuration parameters of your Invenio installation. If you want to customize some of these parameters, you should create a file named 'invenio-local.conf' in the same directory where 'invenio.conf' lives and you should write there only the customizations that you want to be different from the vanilla defaults. 
Here is a realistic, minimalist, yet production-ready example of what you would typically put there: $ cat /opt/invenio/etc/invenio-local.conf [Invenio] CFG_SITE_NAME = John Doe's Document Server CFG_SITE_NAME_INTL_fr = Serveur des Documents de John Doe CFG_SITE_URL = http://your.site.com CFG_SITE_SECURE_URL = https://your.site.com CFG_SITE_ADMIN_EMAIL = john.doe@your.site.com CFG_SITE_SUPPORT_EMAIL = john.doe@your.site.com CFG_WEBALERT_ALERT_ENGINE_EMAIL = john.doe@your.site.com CFG_WEBCOMMENT_ALERT_ENGINE_EMAIL = john.doe@your.site.com CFG_WEBCOMMENT_DEFAULT_MODERATOR = john.doe@your.site.com CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL = john.doe@your.site.com CFG_BIBCATALOG_SYSTEM_EMAIL_ADDRESS = john.doe@your.site.com CFG_DATABASE_HOST = localhost CFG_DATABASE_NAME = invenio CFG_DATABASE_USER = invenio CFG_DATABASE_PASS = my123p$ss CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE = 1 You should override at least the parameters mentioned above in order to define some very essential runtime parameters such as the name of your document server (CFG_SITE_NAME and CFG_SITE_NAME_INTL_*), the visible URL of your document server (CFG_SITE_URL and CFG_SITE_SECURE_URL), the email address of the local Invenio administrator, comment moderator, and alert engine (CFG_SITE_SUPPORT_EMAIL, CFG_SITE_ADMIN_EMAIL, etc), and last but not least your database credentials (CFG_DATABASE_*). If this is a first installation of Invenio it is recommended you set the CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE variable to 1. If this is instead an upgrade from an existing installation don't add it until you have run: $ bibdocfile --fix-bibdocfsinfo-cache . The Invenio system will then read both the default invenio.conf file and your customized invenio-local.conf file and it will override any default options with the ones you have specified in your local file. This cascading of configuration parameters will ease your future upgrades.
If you want to have multiple Invenio instances for distributed video encoding, you need to share the same configuration among them and make some of the folders of the Invenio installation available for all nodes. Configure the allowed tasks for every node: CFG_BIBSCHED_NODE_TASKS = { "hostname_machine1" : ["bibindex", "bibupload", "bibreformat","webcoll", "bibtaskex", "bibrank", "oaiharvest", "oairepositoryupdater", "inveniogc", "webstatadmin", "bibclassify", "bibexport", "dbdump", "batchuploader", "bibauthorid", "bibtasklet"], "hostname_machine2" : ['bibencode',] } Share the following directories among Invenio instances: /var/tmp-shared hosts video uploads in a temporary form /var/tmp-shared/bibencode/jobs hosts new job files for the video encoding daemon /var/tmp-shared/bibencode/jobs/done hosts job files that have been processed by the daemon /var/data/files hosts fulltext and media files associated to records /var/data/submit hosts files created during submissions $ sudo -u www-data /opt/invenio/bin/inveniocfg --update-all Make the rest of the Invenio system aware of your invenio-local.conf changes. This step is mandatory each time you edit your conf files. $ sudo -u www-data /opt/invenio/bin/inveniocfg --create-tables If you are installing Invenio for the first time, you have to create database tables. Note that this step checks for potential problems such as the database connection rights and may ask you to perform some more administrative steps in case it detects a problem. Notably, it may ask you to set up database access permissions, based on your configure values. If you are installing Invenio for the first time, you have to create a dedicated database on your MySQL server that the Invenio can use for its purposes. Please contact your MySQL administrator and ask him to execute the commands this step proposes you. At this point you should now have successfully completed the "make install" process. We continue by setting up the Apache web server.
$ sudo -u www-data /opt/invenio/bin/inveniocfg --load-bibfield-conf Load the configuration file of the BibField module. It will create `bibfield_config.py' file. (FIXME: When BibField becomes essential part of Invenio, this step should be later automatised so that people do not have to run it manually.) $ sudo -u www-data /opt/invenio/bin/inveniocfg --load-webstat-conf Load the configuration file of webstat module. It will create the tables in the database for register customevents, such as basket hits. $ sudo -u www-data /opt/invenio/bin/inveniocfg --create-apache-conf Running this command will generate Apache virtual host configurations matching your installation. You will be instructed to check created files (usually they are located under /opt/invenio/etc/apache/) and edit your httpd.conf to activate Invenio virtual hosts. If you are using Debian GNU/Linux ``Lenny'' or later, then you can do the following to create your SSL certificate and to activate your Invenio vhosts: ## make SSL certificate: $ sudo aptitude install ssl-cert $ sudo mkdir /etc/apache2/ssl $ sudo /usr/sbin/make-ssl-cert /usr/share/ssl-cert/ssleay.cnf \ /etc/apache2/ssl/apache.pem ## add Invenio web sites: $ sudo ln -s /opt/invenio/etc/apache/invenio-apache-vhost.conf \ /etc/apache2/sites-available/invenio $ sudo ln -s /opt/invenio/etc/apache/invenio-apache-vhost-ssl.conf \ /etc/apache2/sites-available/invenio-ssl ## disable Debian's default web site: $ sudo /usr/sbin/a2dissite default ## enable Invenio web sites: $ sudo /usr/sbin/a2ensite invenio $ sudo /usr/sbin/a2ensite invenio-ssl ## enable SSL module: $ sudo /usr/sbin/a2enmod ssl ## if you are using xsendfile module, enable it too: $ sudo /usr/sbin/a2enmod xsendfile If you are using another operating system, you should do the equivalent, for example edit your system-wide httpd.conf and put the following include statements: Include /opt/invenio/etc/apache/invenio-apache-vhost.conf Include 
/opt/invenio/etc/apache/invenio-apache-vhost-ssl.conf Note that you may need to adapt generated vhost file snippets to match your concrete operating system specifics. For example, the generated configuration snippet will preload Invenio WSGI daemon application upon Apache start up for faster site response. The generated configuration assumes that you are using mod_wsgi version 3 or later. If you are using the old legacy mod_wsgi version 2, then you would need to comment out the WSGIImportScript directive from the generated snippet, or else move the WSGI daemon setup to the top level, outside of the VirtualHost section. Note also that you may want to tweak the generated Apache vhost snippet for performance reasons, especially with respect to WSGIDaemonProcess parameters. For example, you can increase the number of processes from the default value `processes=5' if you have lots of RAM and if many concurrent users may access your site in parallel. However, note that you must use `threads=1' there, because Invenio WSGI daemon processes are not fully thread safe yet. This may change in the future. $ sudo /etc/init.d/apache2 restart Please ask your webserver administrator to restart the Apache server after the above "httpd.conf" changes. $ sudo -u www-data /opt/invenio/bin/inveniocfg --check-openoffice If you plan to support MS Office or Open Document Format files in your installation, you should check whether LibreOffice or OpenOffice.org is well integrated with Invenio by running the above command. You may be asked to create a temporary directory for converting office files with special ownership (typically as user nobody) and permissions. Note that you can do this step later. $ sudo -u www-data /opt/invenio/bin/inveniocfg --create-demo-site This step is recommended to test your local Invenio installation. It should give you our "Atlantis Institute of Science" demo installation, exactly as you see it at . 
$ sudo -u www-data /opt/invenio/bin/inveniocfg --load-demo-records Optionally, load some demo records to be able to test indexing and searching of your local Invenio demo installation. $ sudo -u www-data /opt/invenio/bin/inveniocfg --run-unit-tests Optionally, you can run the unit test suite to verify the unit behaviour of your local Invenio installation. Note that this command should be run only after you have installed the whole system via `make install'. $ sudo -u www-data /opt/invenio/bin/inveniocfg --run-regression-tests Optionally, you can run the full regression test suite to verify the functional behaviour of your local Invenio installation. Note that this command requires to have created the demo site and loaded the demo records. Note also that running the regression test suite may alter the database content with junk data, so that rebuilding the demo site is strongly recommended afterwards. $ sudo -u www-data /opt/invenio/bin/inveniocfg --run-web-tests Optionally, you can run additional automated web tests running in a real browser. This requires to have Firefox with the Selenium IDE extension installed. $ sudo -u www-data /opt/invenio/bin/inveniocfg --remove-demo-records Optionally, remove the demo records loaded in the previous step, but keeping otherwise the demo collection, submission, format, and other configurations that you may reuse and modify for your own production purposes. $ sudo -u www-data /opt/invenio/bin/inveniocfg --drop-demo-site Optionally, drop also all the demo configuration so that you'll end up with a completely blank Invenio system. However, you may want to find it more practical not to drop the demo site configuration but to start customizing from there. $ firefox http://your.site.com/help/admin/howto-run In order to start using your Invenio installation, you can start indexing, formatting and other daemons as indicated in the "HOWTO Run" guide on the above URL. 
You can also use the Admin Area web interfaces to perform further runtime configurations such as the definition of data collections, document types, document formats, word indexes, etc. $ sudo ln -s /opt/invenio/etc/bash_completion.d/inveniocfg \ /etc/bash_completion.d/inveniocfg Optionally, if you are using Bash shell completion, then you may want to create the above symlink in order to configure completion for the inveniocfg command. Good luck, and thanks for choosing Invenio. - Invenio Development Team diff --git a/configure-tests.py b/configure-tests.py index d54d9ab45..69528d861 100644 --- a/configure-tests.py +++ b/configure-tests.py @@ -1,514 +1,514 @@ ## This file is part of Invenio. -## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 CERN. +## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN. ## ## Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """ Test the suitability of Python core and the availability of various Python modules for running Invenio. Warn the user if there are eventual troubles. Exit status: 0 if okay, 1 if not okay. Useful for running from configure.ac. 
""" ## minimally recommended/required versions: -cfg_min_python_version = "2.4" +cfg_min_python_version = "2.6" cfg_max_python_version = "2.9.9999" cfg_min_mysqldb_version = "1.2.1_p2" ## 0) import modules needed for this testing: import string import sys import getpass import subprocess import re error_messages = [] warning_messages = [] def wait_for_user(msg): """Print MSG and prompt user for confirmation.""" try: raw_input(msg) except KeyboardInterrupt: print "\n\nInstallation aborted." sys.exit(1) except EOFError: print " (continuing in batch mode)" return ## 1) check Python version: if sys.version < cfg_min_python_version: error_messages.append( """ ******************************************************* ** ERROR: TOO OLD PYTHON DETECTED: %s ******************************************************* ** You seem to be using a too old version of Python. ** ** You must use at least Python %s. ** ** ** ** Note that if you have more than one Python ** ** installed on your system, you can specify the ** ** --with-python configuration option to choose ** ** a specific (e.g. non system wide) Python binary. ** ** ** ** Please upgrade your Python before continuing. ** ******************************************************* """ % (string.replace(sys.version, "\n", ""), cfg_min_python_version) ) if sys.version > cfg_max_python_version: error_messages.append( """ ******************************************************* ** ERROR: TOO NEW PYTHON DETECTED: %s ******************************************************* ** You seem to be using a too new version of Python. ** ** You must use at most Python %s. ** ** ** ** Perhaps you have downloaded and are installing an ** ** old Invenio version? Please look for more recent ** ** Invenio version or please contact the development ** ** team at about this ** ** problem. ** ** ** ** Installation aborted. 
** ******************************************************* """ % (string.replace(sys.version, "\n", ""), cfg_max_python_version) ) ## 2) check for required modules: try: import MySQLdb import base64 import cPickle import cStringIO import cgi import copy import fileinput import getopt import sys if sys.hexversion < 0x2060000: import md5 else: import hashlib import marshal import os import pyparsing import signal import tempfile import time import traceback import unicodedata import urllib import zlib import wsgiref except ImportError, msg: error_messages.append(""" ************************************************* ** IMPORT ERROR %s ************************************************* ** Perhaps you forgot to install some of the ** ** prerequisite Python modules? Please look ** ** at our INSTALL file for more details and ** ** fix the problem before continuing! ** ************************************************* """ % msg ) ## 3) check for recommended modules: try: import rdflib except ImportError, msg: warning_messages.append( """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that rdflib is needed only if you plan ** ** to work with the automatic classification of ** ** documents based on RDF-based taxonomies. ** ** ** ** You can safely continue installing Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your Invenio installation is put ** ** into production.) ** ***************************************************** """ % msg ) try: import pyRXP except ImportError, msg: warning_messages.append(""" ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that PyRXP is not really required but ** ** we recommend it for fast XML MARC parsing. ** ** ** ** You can safely continue installing Invenio ** ** now, and add this module anytime later. (I.e. 
** ** even after your Invenio installation is put ** ** into production.) ** ***************************************************** """ % msg ) try: import dateutil except ImportError, msg: warning_messages.append(""" ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that dateutil is not really required but ** ** we recommend it for user-friendly date ** ** parsing. ** ** ** ** You can safely continue installing Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your Invenio installation is put ** ** into production.) ** ***************************************************** """ % msg ) try: import libxml2 except ImportError, msg: warning_messages.append(""" ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that libxml2 is not really required but ** ** we recommend it for XML metadata conversions ** ** and for fast XML parsing. ** ** ** ** You can safely continue installing Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your Invenio installation is put ** ** into production.) ** ***************************************************** """ % msg ) try: import libxslt except ImportError, msg: warning_messages.append( """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that libxslt is not really required but ** ** we recommend it for XML metadata conversions. ** ** ** ** You can safely continue installing Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your Invenio installation is put ** ** into production.) 
** ***************************************************** """ % msg ) try: import Gnuplot except ImportError, msg: warning_messages.append( """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that Gnuplot.py is not really required but ** ** we recommend it in order to have nice download ** ** and citation history graphs on Detailed record ** ** pages. ** ** ** ** You can safely continue installing Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your Invenio installation is put ** ** into production.) ** ***************************************************** """ % msg ) try: import rauth except ImportError, msg: warning_messages.append( """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that python-rauth is not really required ** ** but we recommend it in order to enable oauth ** ** based authentication. ** ** ** ** You can safely continue installing Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your Invenio installation is put ** ** into production.) ** ***************************************************** """ % msg ) try: import openid except ImportError, msg: warning_messages.append( """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that python-openid is not really required ** ** but we recommend it in order to enable OpenID ** ** based authentication. ** ** ** ** You can safely continue installing Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your Invenio installation is put ** ** into production.) 
** ***************************************************** """ % msg ) try: import magic if not hasattr(magic, "open"): raise StandardError except ImportError, msg: warning_messages.append( """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that magic module is not really required ** ** but we recommend it in order to have detailed ** ** content information about fulltext files. ** ** ** ** You can safely continue installing Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your Invenio installation is put ** ** into production.) ** ***************************************************** """ % msg ) except StandardError: warning_messages.append( """ ***************************************************** ** IMPORT WARNING python-magic ***************************************************** ** The python-magic package you installed is not ** ** the one supported by Invenio. Please refer to ** ** the INSTALL file for more details. ** ** ** ** You can safely continue installing Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your Invenio installation is put ** ** into production.) ** ***************************************************** """ ) try: import reportlab except ImportError, msg: warning_messages.append( """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that reportlab module is not really ** ** required, but we recommend it you want to ** ** enrich PDF with OCR information. ** ** ** ** You can safely continue installing Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your Invenio installation is put ** ** into production.) 
** ***************************************************** """ % msg ) try: try: import PyPDF2 except ImportError: import pyPdf except ImportError, msg: warning_messages.append( """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that pyPdf or pyPdf2 module is not really ** ** required, but we recommend it you want to ** ** enrich PDF with OCR information. ** ** ** ** You can safely continue installing Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your Invenio installation is put ** ** into production.) ** ***************************************************** """ % msg ) ## 4) check for versions of some important modules: if MySQLdb.__version__ < cfg_min_mysqldb_version: error_messages.append( """ ***************************************************** ** ERROR: PYTHON MODULE MYSQLDB %s DETECTED ***************************************************** ** You have to upgrade your MySQLdb to at least ** ** version %s. You must fix this problem ** ** before continuing. Please see the INSTALL file ** ** for more details. ** ***************************************************** """ % (MySQLdb.__version__, cfg_min_mysqldb_version) ) try: import Stemmer try: from Stemmer import algorithms except ImportError, msg: error_messages.append( """ ***************************************************** ** ERROR: STEMMER MODULE PROBLEM %s ***************************************************** ** Perhaps you are using an old Stemmer version? ** ** You must either remove your old Stemmer or else ** ** upgrade to Snowball Stemmer ** ** before continuing. Please see the INSTALL file ** ** for more details. 
** ***************************************************** """ % (msg) ) except ImportError: pass # no prob, Stemmer is optional ## 5) check for Python.h (needed for intbitset): try: from distutils.sysconfig import get_python_inc path_to_python_h = get_python_inc() + os.sep + 'Python.h' if not os.path.exists(path_to_python_h): raise StandardError, "Cannot find %s" % path_to_python_h except StandardError, msg: error_messages.append( """ ***************************************************** ** ERROR: PYTHON HEADER FILE ERROR %s ***************************************************** ** You do not seem to have Python developer files ** ** installed (such as Python.h). Some operating ** ** systems provide these in a separate Python ** ** package called python-dev or python-devel. ** ** You must install such a package before ** ** continuing the installation process. ** ***************************************************** """ % (msg) ) ## 6) Check if ffmpeg is installed and if so, with the minimum configuration for bibencode try: try: process = subprocess.Popen('ffprobe', stderr=subprocess.PIPE, stdout=subprocess.PIPE) except OSError: raise StandardError, "FFMPEG/FFPROBE does not seem to be installed!" 
returncode = process.wait() output = process.communicate()[1] RE_CONFIGURATION = re.compile("(--enable-[a-z0-9\-]*)") CONFIGURATION_REQUIRED = ( '--enable-gpl', '--enable-version3', '--enable-nonfree', '--enable-libtheora', '--enable-libvorbis', '--enable-libvpx', '--enable-libopenjpeg' ) options = RE_CONFIGURATION.findall(output) if sys.version_info < (2, 6): import sets s = sets.Set(CONFIGURATION_REQUIRED) if not s.issubset(options): raise StandardError, options.difference(s) else: if not set(CONFIGURATION_REQUIRED).issubset(options): raise StandardError, set(CONFIGURATION_REQUIRED).difference(options) except StandardError, msg: warning_messages.append( """ ***************************************************** ** WARNING: FFMPEG CONFIGURATION MISSING %s ***************************************************** ** You do not seem to have FFmpeg configured with ** ** the minimum video codecs to run the demo site. ** ** Please install the necessary libraries and ** ** re-install FFmpeg according to the Invenio ** ** installation manual (INSTALL). ** ***************************************************** """ % (msg) ) if warning_messages: print """ ****************************************************** ** WARNING MESSAGES ** ****************************************************** """ for warning in warning_messages: print warning if error_messages: print """ ****************************************************** ** ERROR MESSAGES ** ****************************************************** """ for error in error_messages: print error if warning_messages and error_messages: print """ There were %(n_err)s error(s) found that you need to solve. Please see above, solve them, and re-run configure. Note that there are also %(n_wrn)s warnings you may want to look into. Aborting the installation. """ % {'n_wrn': len(warning_messages), 'n_err': len(error_messages)} sys.exit(1) elif error_messages: print """ There were %(n_err)s error(s) found that you need to solve. 
     Please see above, solve them, and re-run configure.
     Aborting the installation.
     """ % {'n_err': len(error_messages)}
     sys.exit(1)
 elif warning_messages:
     print """
     There were %(n_wrn)s warnings found that you may want to
     look into, solve, and re-run configure before you
     continue the installation.  However, you can also continue
     the installation now and solve these issues later, if you wish.
     """ % {'n_wrn': len(warning_messages)}
     wait_for_user("Press ENTER to continue the installation...")
diff --git a/requirements-extras.txt b/requirements-extras.txt
new file mode 100644
index 000000000..4f81718b3
--- /dev/null
+++ b/requirements-extras.txt
@@ -0,0 +1,16 @@
+# More requirements files are needed, since e.g gnuplot-py import numpy in its setup.py,
+# which means it has to be installed in a second step.
+gnuplot-py==1.8
+
+# Following packages are optional (if you do development you probably want to install them):
+pylint
+http://sourceforge.net/projects/pychecker/files/pychecker/0.8.19/pychecker-0.8.19.tar.gz/download
+pep8
+selenium
+winpdb
+mock
+ipython
+cython
+nose
+nosexcover
+flake8
diff --git a/requirements.txt b/requirements.txt
new file mode 100644
index 000000000..a200dbd1f
--- /dev/null
+++ b/requirements.txt
@@ -0,0 +1,27 @@
+# Invenio requirements.
+MySQL-python==1.2.4
+rdflib==2.4.2
+reportlab==2.5
+python-dateutil<=1.9999
+python-magic==0.4.2
+http://www.reportlab.com/ftp/pyRXP-1.16-daily-unix.tar.gz
+numpy==1.7.0
+lxml==3.1.2
+mechanize==0.2.5
+python-Levenshtein==0.10.2
+pyPdf==1.13
+PyStemmer==1.3.0
+https://py-editdist.googlecode.com/files/py-editdist-0.3.tar.gz
+feedparser==5.1.3
+BeautifulSoup==3.2.1
+beautifulsoup4==4.1.3
+python-twitter==0.8.7
+msgpack-python==0.3.0
+pyparsing==1.5.6
+requests
+PyPDF2
+rauth
+unidecode
+python-openid
+qrcode
+PIL
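
Note on the version checks in configure-tests.py: the script compares
version strings directly (sys.version < cfg_min_python_version, and
MySQLdb.__version__ < cfg_min_mysqldb_version).  String comparison is
lexicographic, so a two-digit component such as a hypothetical "2.10"
sorts before "2.6".  A minimal sketch of a tuple-based comparison is
given below.  The helper names version_tuple and
python_version_in_range are hypothetical and not part of this patch,
and dropping suffixes such as "_p2" is a simplifying assumption.

```python
import re
import sys


def version_tuple(version_string):
    """Return the leading numeric components of a version string.

    Hypothetical helper, not part of configure-tests.py.  Parsing stops
    at the first component that carries a non-numeric suffix, so
    "1.2.1_p2" becomes (1, 2, 1).  Ignoring the suffix is an assumption
    made for this sketch only.
    """
    numbers = []
    for part in version_string.split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
        if match.end() != len(part):
            # Trailing text such as "_p2" or " (default, ...)": stop here.
            break
    return tuple(numbers)


def python_version_in_range(min_version, max_version, current=None):
    """Check min_version <= current <= max_version using tuples.

    current defaults to the running interpreter's (major, minor, micro).
    """
    if current is None:
        current = tuple(sys.version_info[:3])
    return version_tuple(min_version) <= current <= version_tuple(max_version)
```

With the limits from this patch, python_version_in_range("2.6",
"2.9.9999") would accept a 2.7.x interpreter and reject 2.4.x or 3.x,
without depending on how the version strings happen to sort as text.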