diff --git a/INSTALL b/INSTALL index 7d0f08e49..fcf4f12eb 100644 --- a/INSTALL +++ b/INSTALL @@ -1,600 +1,614 @@ CDS Invenio v0.99.1 INSTALL =========================== About ===== This document specifies how to build, customize, and install CDS Invenio v0.99.1 for the first time. See RELEASE-NOTES if you are upgrading from a previous CDS Invenio release. Contents ======== 0. Prerequisites 1. Quick instructions for the impatient CDS Invenio admin 2. Detailed instructions for the patient CDS Invenio admin 0. Prerequisites ================ Here is the software you need to have around before you start installing CDS Invenio: a) Unix-like operating system. The main development and production platforms for CDS Invenio at CERN are GNU/Linux distributions SLC (RHEL), Debian, and Gentoo, but we also develop on FreeBSD and Mac OS X. Basically any Unix system supporting the software listed below should do. - Note that if you are using Debian "Sarge" GNU/Linux, you can + Note that if you are using Debian "Lenny" GNU/Linux, you can install most of the below-mentioned prerequisites and recommendations by running: - $ sudo apt-get install libapache2-mod-python2.3 python2.3-dev \ - apache2-mpm-prefork mysql-server-4.1 mysql-client-4.1 \ - python2.3-mysqldb python2.3-4suite python-simplejson \ - python2.3-xml python2.3-libxml2 python2.3-libxslt1 \ + $ sudo apt-get install python-dev apache2-mpm-prefork mysql-server \ + mysql-client python-mysqldb python-4suite python-simplejson \ + python-xml python-libxml2 python-libxslt1 \ rxp gnuplot xpdf-utils gs-common antiword catdoc \ - wv html2text ppthtml xlhtml clisp gettext + wv html2text ppthtml xlhtml clisp gettext libapache2-mod-wsgi You can also install the following packages: - $ sudo apt-get install python2.3-psyco sbcl cmucl + $ sudo apt-get install python-psyco sbcl cmucl The last three packages are not available on all Debian - "Sarge" GNU/Linux architectures (e.g. not on AMD64); you can + "Lenny" GNU/Linux architectures (e.g. 
not on AMD64); you can safely continue without them. Note that you can consult CDS Invenio wiki pages at <https://twiki.cern.ch/twiki/bin/view/CDS/Invenio> for more system-specific notes. Note that the web application server should run a Message Transfer Agent (MTA) such as Postfix so that CDS Invenio can email notification alerts or registration information to the end users, contact moderators and reviewers of submitted documents, inform administrators about various runtime system information, etc. b) MySQL server (may be on a remote machine), and MySQL client (must be available locally too). MySQL versions 4.1 or 5.0 are supported. Please set the variable "max_allowed_packet" in your "my.cnf" init file to at least 4M. You may also want to run your MySQL server natively in UTF-8 mode by setting "default-character-set=utf8" in various parts of your "my.cnf" file, such as in the "[mysql]" part and elsewhere. <http://mysql.com/> c) Apache 2 server, with support for loading DSO modules, and optionally with SSL support for HTTPS-secure user - authentication. Tested mainly with version 2.0.43 and above. - Apache 2.x is required for the mod_python module (see below). + authentication. <http://httpd.apache.org/> - d) Python v2.3 or above: + d) Python v2.4 or above: <http://python.org/> as well as the following Python modules: - (mandatory) MySQLdb (version >= 1.2.1_p2; see below) <http://sourceforge.net/projects/mysql-python> - (recommended) PyXML, for XML processing: <http://pyxml.sourceforge.net/topics/download.html> - (recommended) PyRXP, for very fast XML MARC processing: <http://www.reportlab.org/pyrxp.html> - (recommended) libxml2-python, for XML/XLST processing: <ftp://xmlsoft.org/libxml2/python/> - (recommended) simplejson, for AJAX apps: <http://undefined.org/python/#simplejson> Note that if you are using Python-2.6, you don't need to install simplejson, because the module is already included in the main Python distribution. 
- (recommended) Gnuplot.Py, for producing graphs: <http://gnuplot-py.sourceforge.net/> - (recommended) Snowball Stemmer, for stemming: <http://snowball.tartarus.org/wrappers/PyStemmer-1.0.1.tar.gz> - (recommended) py-editdist, for record merging: <http://www.mindrot.org/projects/py-editdist/> - (optional) 4suite, slower alternative to PyRXP and libxml2-python: <http://4suite.org/> - (optional) feedparser, for web journal creation: <http://feedparser.org/> - (optional) Psyco, if you are running on a 32-bit OS: <http://psyco.sourceforge.net/> - (optional) RDFLib, to use RDF ontologies and thesauri: <http://rdflib.net/> - (optional) mechanize, to run regression web test suite: <http://wwwsearch.sourceforge.net/mechanize/> Note: MySQLdb version 1.2.1_p2 or higher is recommended. If you are using an older version of MySQLdb, you may get into problems with character encoding. - e) mod_python Apache module. Minimal required version is - 3.3.1. - <http://www.modpython.org/> + e) mod_wsgi Apache module. + <http://code.google.com/p/modwsgi/> + + Note: for the time being, the WSGI daemon must be run with + threads=1, because Invenio is not fully thread safe yet. + This will come later. The Apache configuration example + snippets (created below) will use threads=1. + + Note: if you are using Python 2.4 or earlier, then you should + also install the wsgiref Python module, available from: + <http://pypi.python.org/pypi/wsgiref/> (As of Python 2.5 + this module is included in standard Python + distribution.) f) If you want to be able to extract references from PDF fulltext files, then you need to install pdftotext version 3 at least. <http://www.foolabs.com/xpdf/home.html> g) If you want to be able to search for words in the fulltext files (i.e. 
to have fulltext indexing) or to stamp submitted files, then you need as well to install some of the following tools: - for PDF file stamping: pdftk, pdf2ps <http://www.accesspdf.com/pdftk/> <http://www.cs.wisc.edu/~ghost/doc/AFPL/> - for PDF files: pdftotext or pstotext <http://www.foolabs.com/xpdf/home.html> <http://www.cs.wisc.edu/~ghost/doc/AFPL/> - for PostScript files: pstotext or ps2ascii <http://www.cs.wisc.edu/~ghost/doc/AFPL/> - for MS Word files: antiword, catdoc, or wvText <http://www.winfield.demon.nl/index.html> <http://www.ice.ru/~vitus/catdoc/index.html> <http://sourceforge.net/projects/wvware> - for MS PowerPoint files: pptHtml and html2text <http://packages.debian.org/stable/utils/ppthtml> <http://userpage.fu-berlin.de/~mbayer/tools/html2text.html> - for MS Excel files: xlhtml and html2text <http://chicago.sourceforge.net/xlhtml/> <http://userpage.fu-berlin.de/~mbayer/tools/html2text.html> h) If you have chosen to install fast XML MARC Python processors in the step d) above, then you have to install the parsers themselves: - (optional) RXP: <http://www.cogsci.ed.ac.uk/~richard/rxp.html> - (optional) 4suite: <http://4suite.org/> i) (recommended) Gnuplot, the command-line driven interactive plotting program. It is used to display download and citation history graphs on the Detailed record pages on the web interface. Note that Gnuplot must be compiled with PNG output support, that is, with the GD library. Note also that Gnuplot is not required, only recommended. <http://www.gnuplot.info/> j) (recommended) A Common Lisp implementation, such as CLISP, SBCL or CMUCL. It is used for the web server log analysing tool and the metadata checking program. Note that any of the three implementations CLISP, SBCL, or CMUCL will do. CMUCL produces fastest machine code, but it does not support UTF-8 yet. Pick up CLISP if you don't know what to do. Note that a Common Lisp implementation is not required, only recommended. 
<http://clisp.cons.org/> <http://www.cons.org/cmucl/> <http://sbcl.sourceforge.net/> k) GNU gettext, a set of tools that makes it possible to - translate the application in multiple languages. + translate the application in multiple languages. <http://www.gnu.org/software/gettext/> - This is available by default on many systems. + This is available by default on many systems. Note that the configure script checks whether you have all the prerequisite software installed and that it won't let you continue unless everything is in order. It also warns you if it cannot find some optional but recommended software. 1. Quick instructions for the impatient CDS Invenio admin ========================================================= 1a. Installation ---------------- $ cd /usr/local/src/ $ wget http://cdsware.cern.ch/download/cds-invenio-0.99.1.tar.gz $ wget http://cdsware.cern.ch/download/cds-invenio-0.99.1.tar.gz.md5 $ wget http://cdsware.cern.ch/download/cds-invenio-0.99.1.tar.gz.sig $ md5sum -v -c cds-invenio-0.99.1.tar.gz.md5 $ gpg --verify cds-invenio-0.99.1.tar.gz.sig cds-invenio-0.99.1.tar.gz $ tar xvfz cds-invenio-0.99.1.tar.gz $ cd cds-invenio-0.99.1 $ ./configure $ make $ make install $ make install-jsmath-plugin ## optional $ make install-jquery-plugins ## optional $ make install-fckeditor-plugin ## optional 1b. 
Configuration ----------------- $ emacs /opt/cds-invenio/etc/invenio.conf $ emacs /opt/cds-invenio/etc/invenio-local.conf $ /opt/cds-invenio/bin/inveniocfg --update-all $ /opt/cds-invenio/bin/inveniocfg --create-tables $ /opt/cds-invenio/bin/inveniocfg --load-webstat-conf $ /opt/cds-invenio/bin/inveniocfg --create-apache-conf $ sudo /path/to/apache/bin/apachectl graceful $ sudo chgrp -R www-data /opt/cds-invenio $ sudo chmod -R g+r /opt/cds-invenio $ sudo chmod -R g+rw /opt/cds-invenio/var $ sudo find /opt/cds-invenio -type d -exec chmod g+rxw {} \; $ /opt/cds-invenio/bin/inveniocfg --create-demo-site $ /opt/cds-invenio/bin/inveniocfg --load-demo-records $ /opt/cds-invenio/bin/inveniocfg --run-unit-tests $ /opt/cds-invenio/bin/inveniocfg --run-regression-tests $ /opt/cds-invenio/bin/inveniocfg --run-web-tests $ /opt/cds-invenio/bin/inveniocfg --remove-demo-records $ /opt/cds-invenio/bin/inveniocfg --drop-demo-site $ firefox http://your.site.com/help/admin/howto-run 2. Detailed instructions for the patient CDS Invenio admin ========================================================== 2a. Installation ---------------- CDS Invenio uses the standard GNU autoconf method to build and install its files. This means that you proceed as follows: $ cd /usr/local/src/ Change to a directory where we will configure and build CDS Invenio. (The built files will be installed into different "target" directories later.) $ wget http://cdsware.cern.ch/download/cds-invenio-0.99.1.tar.gz $ wget http://cdsware.cern.ch/download/cds-invenio-0.99.1.tar.gz.md5 $ wget http://cdsware.cern.ch/download/cds-invenio-0.99.1.tar.gz.sig Fetch the CDS Invenio source tarball from the CDS Software Consortium distribution server, together with MD5 checksum and GnuPG cryptographic signature files useful for verifying the integrity of the tarball. $ md5sum -v -c cds-invenio-0.99.1.tar.gz.md5 Verify the MD5 checksum.
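The checksum mechanics can be tried out on a stand-in file (demo.tar.gz below is hypothetical, not the Invenio tarball). Note that recent GNU coreutils versions of md5sum no longer accept the -v flag, so plain "md5sum -c" may be needed on newer systems:

```shell
# Create a stand-in file and verify it the same way as the tarball above.
printf 'stand-in tarball contents' > demo.tar.gz
md5sum demo.tar.gz > demo.tar.gz.md5    # record the checksum
md5sum -c demo.tar.gz.md5               # reports "demo.tar.gz: OK" on success
```

If the file were corrupted in transit, the last command would instead report a FAILED check and exit with a non-zero status.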
$ gpg --verify cds-invenio-0.99.1.tar.gz.sig cds-invenio-0.99.1.tar.gz Verify the GnuPG cryptographic signature. Note that you may first have to import my public key into your keyring, if you haven't done that already: $ gpg --keyserver wwwkeys.eu.pgp.net --recv-keys 0xBA5A2B67 The output of the gpg --verify command should then read: Good signature from "Tibor Simko <tibor@simko.info>" You can safely ignore any trusted signature certification warning that may follow after the signature has been successfully verified. $ tar xvfz cds-invenio-0.99.1.tar.gz Untar the distribution tarball. $ cd cds-invenio-0.99.1 Go to the source directory. $ ./configure Configure CDS Invenio software for building on this specific platform. You can use the following optional parameters: --prefix=/opt/cds-invenio Optionally, specify the CDS Invenio general installation directory (default is /opt/cds-invenio). It will contain command-line binaries and program libraries containing the core CDS Invenio functionality, but also store web pages, runtime log and cache information, document data files, etc. Several subdirs like `bin', `etc', `lib', or `var' will be created inside the prefix directory to this effect. Note that the prefix directory should be chosen outside of the Apache htdocs tree, since only one of its subdirectories (prefix/var/www) is to be accessible directly via the Web (see below). Note that CDS Invenio won't install to any other directory but to the prefix mentioned in this configuration line. - --with-python=/opt/python/bin/python2.3 + --with-python=/opt/python/bin/python2.4 Optionally, specify a path to some specific Python binary. This is useful if you have more than one Python installation on your system. If you don't set this option, then the first Python that will be found in your PATH will be chosen for running CDS Invenio. --with-mysql=/opt/mysql/bin/mysql Optionally, specify a path to some specific MySQL client binary.
This is useful if you have more than one MySQL installation on your system. If you don't set this option, then the first MySQL client executable that will be found in your PATH will be chosen for running CDS Invenio. --with-clisp=/opt/clisp/bin/clisp Optionally, specify a path to CLISP executable. This is useful if you have more than one CLISP installation on your system. If you don't set this option, then the first executable that will be found in your PATH will be chosen for running CDS Invenio. --with-cmucl=/opt/cmucl/bin/lisp Optionally, specify a path to CMUCL executable. This is useful if you have more than one CMUCL installation on your system. If you don't set this option, then the first executable that will be found in your PATH will be chosen for running CDS Invenio. --with-sbcl=/opt/sbcl/bin/sbcl Optionally, specify a path to SBCL executable. This is useful if you have more than one SBCL installation on your system. If you don't set this option, then the first executable that will be found in your PATH will be chosen for running CDS Invenio. This configuration step is mandatory. Usually, you do this step only once. (Note that if you prefer to build CDS Invenio out of its source tree, you may run the above configure command like this: mkdir build && cd build && ../configure --prefix=... FIXME: this is not working right now as per the introduction of intbitset_setup.py.) $ make Launch the CDS Invenio build. Since many messages are printed during the build process, you may want to run it in a fast-scrolling terminal such as rxvt or in a detached screen session. During this step all the pages and scripts will be pre-created and customized based on the config you have edited in the previous step. Note that on systems such as FreeBSD or Mac OS X you have to use GNU make ("gmake") instead of "make". 
$ make install Install the web pages, scripts, utilities and everything needed for CDS Invenio runtime into respective installation directories, as specified earlier by the configure command. Note that if you are installing CDS Invenio for the first time, you will be asked to create symbolic link(s) from Python's site-packages system-wide directory(ies) to the installation location. This is in order to instruct Python where to find CDS Invenio's Python files. You will be hinted as to the exact command to use based on the parameters you have used in the configure command. $ make install-jsmath-plugin ## optional This will automatically download and install in the proper place jsMath, a Javascript library to render LaTeX formulas in the client browser. Check that the plugin files have proper permissions, i.e. that Apache can read them. E.g.: $ sudo chgrp -R www-data /opt/cds-invenio/var/www/ $ sudo chmod -R g+r /opt/cds-invenio/var/www/ Note that in order to enable the rendering you will have to set the variable CFG_WEBSEARCH_USE_JSMATH_FOR_FORMATS in invenio.conf to a suitable list of output format codes. For example: CFG_WEBSEARCH_USE_JSMATH_FOR_FORMATS = hd,hb $ make install-jquery-plugins ## optional This will automatically download and install in the proper place jQuery and related plugins. They are used for AJAX applications such as the record editor. Check that the plugin files have proper permissions, i.e. that Apache can read them. E.g.: $ sudo chgrp -R www-data /opt/cds-invenio/var/www/ $ sudo chmod -R g+r /opt/cds-invenio/var/www/ $ make install-fckeditor-plugin ## optional This will automatically download and install in the proper place FCKeditor, a WYSIWYG Javascript-based editor (e.g. for the WebComment module). Check that the plugin files have proper permissions, i.e. that Apache can read them. 
E.g.: $ sudo chgrp -R www-data /opt/cds-invenio/var/www/ $ sudo chmod -R g+r /opt/cds-invenio/var/www/ Note that in order to enable the editor you have to set CFG_WEBCOMMENT_USE_FCKEDITOR to True. 2b. Configuration ----------------- Once the basic software installation is done, we proceed to configuring your Invenio system. $ emacs /opt/cds-invenio/etc/invenio.conf $ emacs /opt/cds-invenio/etc/invenio-local.conf Customize your CDS Invenio installation. The 'invenio.conf' file contains the vanilla default configuration parameters of a CDS Invenio installation, as coming from the distribution. You could in principle go ahead and change the values according to your local needs. However, you can also create a file named 'invenio-local.conf' in the same directory where 'invenio.conf' lives and put there only the localizations you need to have different from the default ones. For example: $ cat /opt/cds-invenio/etc/invenio-local.conf [Invenio] CFG_SITE_URL = http://your.site.com CFG_SITE_SECURE_URL = https://your.site.com CFG_SITE_ADMIN_EMAIL = john.doe@your.site.com CFG_SITE_SUPPORT_EMAIL = john.doe@your.site.com The Invenio system will then read both the default invenio.conf file and your customized invenio-local.conf file and it will override any default options with the ones you have set in your local file. This cascading of configuration parameters will ease your future upgrades. You should override at least the parameters from the top of the invenio.conf file in order to define some very essential runtime parameters such as the visible URL of your document server (look for CFG_SITE_URL and CFG_SITE_SECURE_URL), the database credentials (look for CFG_DATABASE_*), the name of your document server (look for CFG_SITE_NAME and CFG_SITE_NAME_INTL_*), or the email address of the local CDS Invenio administrator (look for CFG_SITE_SUPPORT_EMAIL and CFG_SITE_ADMIN_EMAIL).
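The cascading of the two configuration files can be pictured as a simple dictionary override (an illustrative sketch of the idea only, not Invenio's actual config reader; the values are the hypothetical ones from the example above):

```python
# Vanilla defaults, as shipped in invenio.conf:
invenio_conf = {
    "CFG_SITE_URL": "http://localhost",
    "CFG_SITE_ADMIN_EMAIL": "root@localhost",
}

# Local customizations, as put into invenio-local.conf:
invenio_local_conf = {
    "CFG_SITE_URL": "http://your.site.com",
}

# Invenio reads both files; local values override the defaults,
# while untouched parameters keep their distribution defaults:
config = dict(invenio_conf)
config.update(invenio_local_conf)

print(config["CFG_SITE_URL"])          # http://your.site.com
print(config["CFG_SITE_ADMIN_EMAIL"])  # root@localhost
```

This is why keeping your changes in invenio-local.conf eases upgrades: a new release can replace invenio.conf wholesale without clobbering your local values.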
$ /opt/cds-invenio/bin/inveniocfg --update-all Make the rest of the Invenio system aware of your invenio.conf changes. This step is mandatory each time you edit your conf files. $ /opt/cds-invenio/bin/inveniocfg --create-tables If you are installing CDS Invenio for the first time, you have to create database tables. Note that this step checks for potential problems such as the database connection rights and may ask you to perform some more administrative steps in case it detects a problem. Notably, it may ask you to set up database access permissions, based on your configure values. If you are installing CDS Invenio for the first time, you have to create a dedicated database on your MySQL server that CDS Invenio can use for its purposes. Please contact your MySQL administrator and ask them to execute the commands this step proposes. At this point you should now have successfully completed the "make install" process. We continue by setting up the Apache web server. $ /opt/cds-invenio/bin/inveniocfg --load-webstat-conf Load the configuration file of the webstat module. It will create the database tables needed for registering custom events, such as basket hits. $ /opt/cds-invenio/bin/inveniocfg --create-apache-conf Running this command will generate Apache virtual host configurations matching your installation. You will be instructed to check the created files (usually they are located under /opt/cds-invenio/etc/apache/) and edit your httpd.conf to add the following include statements: Include /opt/cds-invenio/etc/apache/invenio-apache-vhost.conf Include /opt/cds-invenio/etc/apache/invenio-apache-vhost-ssl.conf + Note that you may want to tweak the generated example + configurations, especially with respect to the + WSGIDaemonProcess parameters. E.g. increase the `processes' + parameter if you have lots of RAM and many concurrent users + accessing your site in parallel.
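For reference, the WSGI part of a generated vhost file looks roughly like this (an illustrative sketch only; the exact directives and values come from your configure options, and the script path shown here is hypothetical). Note threads=1, per the thread-safety caveat in the prerequisites:

```apache
WSGIDaemonProcess invenio user=www-data processes=5 threads=1
WSGIScriptAlias / /opt/cds-invenio/var/www-wsgi/invenio.wsgi
WSGIProcessGroup invenio
```

Increasing `processes' (rather than `threads') is the way to scale concurrency until Invenio becomes fully thread safe.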
+ $ sudo /path/to/apache/bin/apachectl graceful Please ask your webserver administrator to restart the Apache server after the above "httpd.conf" changes. $ sudo chgrp -R www-data /opt/cds-invenio $ sudo chmod -R g+r /opt/cds-invenio $ sudo chmod -R g+rw /opt/cds-invenio/var $ sudo find /opt/cds-invenio -type d -exec chmod g+rxw {} \; One more superuser step, because we need to enable the Apache server to read files from the installation place, to write some log information, and to cache interesting entities inside the "var" subdirectory of our CDS Invenio installation directory. Here we assumed that your Apache server processes are run under the "www-data" group. Change this appropriately for your system. Moreover, note that if you are using SELinux extensions (e.g. on Fedora Core 6), you may have to check and enable the write access of the Apache user there too. After these admin-level tasks performed as root, let's now go back to finish the installation of CDS Invenio. $ /opt/cds-invenio/bin/inveniocfg --create-demo-site This step is recommended to test your local CDS Invenio installation. It should give you our "Atlantis Institute of Science" demo installation, exactly as you see it at <http://invenio-demo.cern.ch/>. $ /opt/cds-invenio/bin/inveniocfg --load-demo-records Optionally, load some demo records to be able to test indexing and searching of your local CDS Invenio demo installation. $ /opt/cds-invenio/bin/inveniocfg --run-unit-tests Optionally, you can run the unit test suite to verify the unit behaviour of your local CDS Invenio installation. Note that this command should be run only after you have installed the whole system via `make install'. $ /opt/cds-invenio/bin/inveniocfg --run-regression-tests Optionally, you can run the full regression test suite to verify the functional behaviour of your local CDS Invenio installation. Note that this command requires the demo site to have been created and the demo records loaded.
Note also that running the regression test suite may alter the database content with junk data, so rebuilding the demo site is strongly recommended afterwards. $ /opt/cds-invenio/bin/inveniocfg --run-web-tests Optionally, you can run additional automated web tests running in a real browser. This requires Firefox with the Selenium IDE extension installed. <http://en.www.mozilla.com/en/firefox/> <http://selenium-ide.openqa.org/> $ /opt/cds-invenio/bin/inveniocfg --remove-demo-records Optionally, remove the demo records loaded in the previous step, while otherwise keeping the demo collection, submission, format, and other configurations that you may reuse and modify for your own production purposes. $ /opt/cds-invenio/bin/inveniocfg --drop-demo-site Optionally, also drop all the demo configuration so that you'll end up with a completely blank CDS Invenio system. However, you may find it more practical not to drop the demo site configuration but to start customizing from there. $ firefox http://your.site.com/help/admin/howto-run In order to start using your CDS Invenio installation, you can start indexing, formatting and other daemons as indicated in the "HOWTO Run" guide at the above URL. You can also use the Admin Area web interfaces to perform further runtime configurations such as the definition of data collections, document types, document formats, word indexes, etc. Good luck, and thanks for choosing CDS Invenio. - CDS Development Group <cds.support@cern.ch> <http://cdsware.cern.ch/> diff --git a/configure-tests.py b/configure-tests.py index afd3d416c..930eae1a7 100644 --- a/configure-tests.py +++ b/configure-tests.py @@ -1,314 +1,295 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN.
## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """ Test the suitability of Python core and the availability of various Python modules for running CDS Invenio. Warn the user about any potential problems. Exit status: 0 if okay, 1 if not okay. Useful for running from configure.ac. """ ## minimally recommended/required versions: -cfg_min_python_version = "2.3" +cfg_min_python_version = "2.4" cfg_min_mysqldb_version = "1.2.1_p2" -cfg_min_mod_python_version = "3.3.1" ## 0) import modules needed for this testing: import string import sys import getpass def wait_for_user(msg): """Print MSG and prompt user for confirmation.""" try: raw_input(msg) except KeyboardInterrupt: print "\n\nInstallation aborted." sys.exit(1) except EOFError: print " (continuing in batch mode)" return ## 1) check Python version: if sys.version < cfg_min_python_version: print """ ******************************************************* ** ERROR: OLD PYTHON DETECTED: %s ******************************************************* ** You seem to be using an old version of Python. ** ** You must use at least Python %s. ** ** ** ** Note that if you have more than one Python ** ** installed on your system, you can specify the ** ** --with-python configuration option to choose ** ** a specific (e.g. non system wide) Python binary.
** ** ** ** Please upgrade your Python before continuing. ** ******************************************************* """ % (string.replace(sys.version, "\n", ""), cfg_min_python_version) sys.exit(1) ## 2) check for required modules: try: import MySQLdb - import mod_python import base64 import cPickle import cStringIO import cgi import copy import fileinput import getopt import sys if sys.hexversion < 0x2060000: import md5 else: import hashlib import marshal import os import signal import tempfile import time import traceback import unicodedata import urllib import zlib + import wsgiref except ImportError, msg: print """ ************************************************* ** IMPORT ERROR %s ************************************************* ** Perhaps you forgot to install some of the ** ** prerequisite Python modules? Please look ** ** at our INSTALL file for more details and ** ** fix the problem before continuing! ** ************************************************* """ % msg sys.exit(1) ## 3) check for recommended modules: try: if (2**31 - 1) == sys.maxint: # check for Psyco since we seem to run in 32-bit environment import psyco else: # no need to advise on Psyco on 64-bit systems pass except ImportError, msg: print """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that Psyco is not really required but we ** ** recommend it for faster CDS Invenio operation ** ** if you are running in 32-bit operating system. ** ** ** ** You can safely continue installing CDS Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your CDS Invenio installation is put ** ** into production.) 
** ***************************************************** """ % msg wait_for_user("Press ENTER to continue the installation...") try: import rdflib except ImportError, msg: print """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that rdflib is needed only if you plan ** ** to work with the automatic classification of ** ** documents based on RDF-based taxonomies. ** ** ** ** You can safely continue installing CDS Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your CDS Invenio installation is put ** ** into production.) ** ***************************************************** """ % msg wait_for_user("Press ENTER to continue the installation...") try: import pyRXP except ImportError, msg: print """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that PyRXP is not really required but ** ** we recommend it for fast XML MARC parsing. ** ** ** ** You can safely continue installing CDS Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your CDS Invenio installation is put ** ** into production.) ** ***************************************************** """ % msg wait_for_user("Press ENTER to continue the installation...") try: import libxml2 except ImportError, msg: print """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that libxml2 is not really required but ** ** we recommend it for XML metadata conversions ** ** and for fast XML parsing. ** ** ** ** You can safely continue installing CDS Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your CDS Invenio installation is put ** ** into production.) 
** ***************************************************** """ % msg wait_for_user("Press ENTER to continue the installation...") try: import libxslt except ImportError, msg: print """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that libxslt is not really required but ** ** we recommend it for XML metadata conversions. ** ** ** ** You can safely continue installing CDS Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your CDS Invenio installation is put ** ** into production.) ** ***************************************************** """ % msg wait_for_user("Press ENTER to continue the installation...") try: import Gnuplot except ImportError, msg: print """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that Gnuplot.py is not really required but ** ** we recommend it in order to have nice download ** ** and citation history graphs on Detailed record ** ** pages. ** ** ** ** You can safely continue installing CDS Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your CDS Invenio installation is put ** ** into production.) ** ***************************************************** """ % msg wait_for_user("Press ENTER to continue the installation...") try: import magic except ImportError, msg: print """ ***************************************************** ** IMPORT WARNING %s ***************************************************** ** Note that magic module is not really required ** ** but we recommend it in order to have detailed ** ** content information about fulltext files. ** ** ** ** You can safely continue installing CDS Invenio ** ** now, and add this module anytime later. (I.e. ** ** even after your CDS Invenio installation is put ** ** into production.) 
** ***************************************************** """ % msg wait_for_user("Press ENTER to continue the installation...") ## 4) check for versions of some important modules: if MySQLdb.__version__ < cfg_min_mysqldb_version: print """ ***************************************************** ** ERROR: PYTHON MODULE MYSQLDB %s DETECTED ***************************************************** ** You have to upgrade your MySQLdb to at least ** ** version %s. You must fix this problem ** ** before continuing. Please see the INSTALL file ** ** for more details. ** ***************************************************** """ % (MySQLdb.__version__, cfg_min_mysqldb_version) sys.exit(1) -try: - current_mod_python_version = mod_python.version -except AttributeError: - # old versions did not have mod_python.version - current_mod_python_version = ' ' # space is less then 3 -if current_mod_python_version < cfg_min_mod_python_version: - print """ - ***************************************************** - ** ERROR: MOD_PYTHON OLD VERSION DETECTED: %s - ***************************************************** - ** You have to upgrade your mod_python to at least ** - ** version %s. You must fix this problem ** - ** before continuing. Please see the INSTALL file ** - ** for more details. ** - ***************************************************** - """ % (current_mod_python_version, cfg_min_mod_python_version) - sys.exit(1) - try: import Stemmer try: from Stemmer import algorithms except ImportError, msg: print """ ***************************************************** ** ERROR: STEMMER MODULE PROBLEM %s ***************************************************** ** Perhaps you are using an old Stemmer version? ** ** You must either remove your old Stemmer or else ** ** upgrade to Snowball Stemmer ** <http://snowball.tartarus.org/wrappers/PyStemmer-1.0.1.tar.gz> ** before continuing. Please see the INSTALL file ** ** for more details. 
** ***************************************************** """ % (msg) sys.exit(1) except ImportError: pass # no prob, Stemmer is optional ## 5) check for Python.h (needed for intbitset): try: from distutils.sysconfig import get_python_inc path_to_python_h = get_python_inc() + os.sep + 'Python.h' if not os.path.exists(path_to_python_h): raise StandardError, "Cannot find %s" % path_to_python_h except StandardError, msg: print """ ***************************************************** ** ERROR: PYTHON HEADER FILE ERROR %s ***************************************************** ** You do not seem to have Python developer files ** ** installed (such as Python.h). Some operating ** ** systems provide these in a separate Python ** ** package called python-dev or python-devel. ** ** You must install such a package before ** ** continuing the installation process. ** ***************************************************** """ % (msg) sys.exit(1) diff --git a/modules/bibharvest/lib/oai_repository_webinterface.py b/modules/bibharvest/lib/oai_repository_webinterface.py index a45330adb..531a13928 100644 --- a/modules/bibharvest/lib/oai_repository_webinterface.py +++ b/modules/bibharvest/lib/oai_repository_webinterface.py @@ -1,154 +1,154 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. 
## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """CDS Invenio OAI provider interface, compliant with OAI-PMH/2.0""" __revision__ = "$Id$" import os import urllib import time -from mod_python import apache +from invenio import webinterface_handler_wsgi_utils as apache from invenio import oai_repository_server from invenio.config import CFG_CACHEDIR, CFG_OAI_SLEEP from invenio.webinterface_handler import wash_urlargd, WebInterfaceDirectory class WebInterfaceOAIProviderPages(WebInterfaceDirectory): """Defines the set of /oai2d OAI provider pages.""" _exports = [''] def __call__(self, req, form): "OAI repository interface" # Clean input arguments. The protocol specifies that an error # has to be returned if the same argument is specified several # times. Eg: # oai2d?verb=ListIdentifiers&metadataPrefix=marcxml&metadataPrefix=marcxml # So keep the arguments as list for now so that check_argd can # return an error if needed (check_argd also transforms these # lists into strings) argd = wash_urlargd(form, {'verb': (list, []), 'metadataPrefix': (list, []), 'from': (list, []), 'until': (list, []), 'set': (list, []), 'identifier': (list, []), 'resumptionToken': (list, []), }) ## wash_urlargd(..) function cleaned everything, but also added ## unwanted parameters. Remove them now for param in argd.keys(): if not param in form and param != 'verb': del argd[param] ## wash_urlargd(..) function also removed unknown parameters ## that we would like to keep in order to send back an error ## as required by the protocol. But we do not need that value, ## so set it to empty string. for param in form.keys(): if param not in argd.keys(): argd[param] = '' ## But still remove 'ln' parameter that was automatically added. 
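The comments above explain why the handler keeps the request arguments as lists: OAI-PMH requires a badArgument error when the same argument is repeated (e.g. `metadataPrefix` given twice), so the duplicates must survive washing long enough for `check_argd` to see them. A minimal, self-contained sketch of that duplicate-argument detection (the function name here is illustrative, not Invenio's API):

```python
from urllib.parse import parse_qs

# OAI-PMH request arguments; repeating any of them in one request must
# trigger a badArgument error according to the protocol.
OAI_ARGS = ('verb', 'metadataPrefix', 'from', 'until', 'set',
            'identifier', 'resumptionToken')

def find_repeated_args(query):
    """Return the OAI-PMH arguments that appear more than once in the
    raw query string, i.e. those that must be reported as badArgument."""
    parsed = parse_qs(query, keep_blank_values=True)
    return sorted(arg for arg, values in parsed.items()
                  if arg in OAI_ARGS and len(values) > 1)
```

For the example from the comment above, `find_repeated_args("verb=ListIdentifiers&metadataPrefix=marcxml&metadataPrefix=marcxml")` reports `metadataPrefix` as duplicated.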
if argd.has_key('ln'): del argd['ln'] ## check request for OAI compliancy ## also transform all the list arguments into string oai_error = oai_repository_server.check_argd(argd) ## check availability (OAI requests for Identify, ListSets and ## ListMetadataFormats are served immediately, otherwise we ## shall wait for CFG_OAI_SLEEP seconds between requests): if os.path.exists("%s/RTdata/RTdata" % CFG_CACHEDIR) and (argd['verb'] not in ["Identify", "ListMetadataFormats", "ListSets"]): time_gap = int(time.time() - os.path.getmtime("%s/RTdata/RTdata" % CFG_CACHEDIR)) if(time_gap < CFG_OAI_SLEEP): - req.err_headers_out["Status-Code"] = "503" - req.err_headers_out["Retry-After"] = "%d" % (CFG_OAI_SLEEP - time_gap) + req.headers_out["Status-Code"] = "503" + req.headers_out["Retry-After"] = "%d" % (CFG_OAI_SLEEP - time_gap) req.status = apache.HTTP_SERVICE_UNAVAILABLE return "Retry after %d seconds" % (CFG_OAI_SLEEP - time_gap) command = "touch %s/RTdata/RTdata" % CFG_CACHEDIR os.system(command) ## construct args (argd string equivalent) for the ## oai_repository_server business logic (later it may be good if it ## takes argd directly): args = urllib.urlencode(argd) ## create OAI response req.content_type = "text/xml" req.send_http_header() if oai_error == "": ## OAI Identify if argd['verb'] == "Identify": req.write(oai_repository_server.oaiidentify(args, script_url=req.uri)) ## OAI ListSets elif argd['verb'] == "ListSets": req.write(oai_repository_server.oailistsets(args)) ## OAI ListIdentifiers elif argd['verb'] == "ListIdentifiers": req.write(oai_repository_server.oailistidentifiers(args)) ## OAI ListRecords elif argd['verb'] == "ListRecords": req.write(oai_repository_server.oailistrecords(args)) ## OAI GetRecord elif argd['verb'] == "GetRecord": req.write(oai_repository_server.oaigetrecord(args)) ## OAI ListMetadataFormats elif argd['verb'] == "ListMetadataFormats": req.write(oai_repository_server.oailistmetadataformats(args)) ## Unknown verb else: 
req.write(oai_repository_server.oai_error("badVerb","Illegal OAI verb")) ## OAI error else: req.write(oai_repository_server.oai_header(args,"")) req.write(oai_error) req.write(oai_repository_server.oai_footer("")) return "\n" ## Return the same page whether we ask for /oai2d?verb or /oai2d/?verb index = __call__ diff --git a/modules/bibrank/lib/bibrankadminlib.py b/modules/bibrank/lib/bibrankadminlib.py index 2d8a46f9f..ef645f061 100644 --- a/modules/bibrank/lib/bibrankadminlib.py +++ b/modules/bibrank/lib/bibrankadminlib.py @@ -1,1040 +1,1037 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
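The hunks in this patch repeatedly swap `from mod_python import apache` for the WSGI compatibility module bound to the same `apache` name; existing code keeps working because it only relies on the HTTP status constants that name exposes. A hypothetical sketch of that pattern, combined with the /oai2d throttling decision seen earlier (the shim class and `throttle` function are invented for illustration, not Invenio code):

```python
class _ApacheShim:
    """Stand-in for the constants the aliased `apache` module provides,
    whether it comes from mod_python or from the WSGI shim."""
    OK = 0
    HTTP_SERVICE_UNAVAILABLE = 503

apache = _ApacheShim()

def throttle(elapsed_since_last, sleep_between):
    """Decide the OAI rate-limiting response: if the previous harvesting
    request came less than `sleep_between` seconds ago, answer 503 with a
    Retry-After delay; otherwise let the request through."""
    gap = sleep_between - int(elapsed_since_last)
    if gap > 0:
        return apache.HTTP_SERVICE_UNAVAILABLE, gap
    return apache.OK, 0
```

With `CFG_OAI_SLEEP` at 10 seconds, a request arriving 3 seconds after the previous one yields `(503, 7)`, matching the 503/Retry-After flow seen in the handler above.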
"""CDS Invenio BibRank Administrator Interface.""" __revision__ = "$Id$" import cgi import re import os import ConfigParser from zlib import compress,decompress import marshal -try: - from mod_python import apache -except ImportError: - pass +from invenio import webinterface_handler_wsgi_utils as apache from invenio.config import \ CFG_SITE_LANG, \ CFG_ETCDIR, \ CFG_VERSION, \ CFG_SITE_URL import invenio.access_control_engine as acce from invenio.messages import language_list_long from invenio.dbquery import run_sql from invenio.webpage import page, pageheaderonly, pagefooteronly from invenio.webuser import getUid, get_email def getnavtrail(previous = ''): navtrail = """<a class="navtrail" href="%s/help/admin">Admin Area</a> """ % (CFG_SITE_URL,) navtrail = navtrail + previous return navtrail def check_user(req, role, adminarea=2, authorized=0): (auth_code, auth_message) = is_adminuser(req, role) if not authorized and auth_code != 0: return ("false", auth_message) return ("", auth_message) def is_adminuser(req, role): """check if user is a registered administrator. 
""" return acce.acc_authorize_action(req, role) def perform_index(ln=CFG_SITE_LANG): """create the bibrank main area menu page.""" header = ['Code', 'Translations', 'Collections', 'Rank method'] rnk_list = get_def_name('', "rnkMETHOD") actions = [] for (rnkID, name) in rnk_list: actions.append([name]) for col in [(('Modify', 'modifytranslations'),), (('Modify', 'modifycollection'),), (('Show Details', 'showrankdetails'), ('Modify', 'modifyrank'), ('Delete', 'deleterank'))]: actions[-1].append('<a href="%s/admin/bibrank/bibrankadmin.py/%s?rnkID=%s&ln=%s">%s</a>' % (CFG_SITE_URL, col[0][1], rnkID, ln, col[0][0])) for (str, function) in col[1:]: actions[-1][-1] += ' / <a href="%s/admin/bibrank/bibrankadmin.py/%s?rnkID=%s&ln=%s">%s</a>' % (CFG_SITE_URL, function, rnkID, ln, str) output = """ <a href="%s/admin/bibrank/bibrankadmin.py/addrankarea?ln=%s">Add new rank method</a><br /><br /> """ % (CFG_SITE_URL, ln) output += tupletotable(header=header, tuple=actions) return addadminbox("""Overview of rank methods <small>[<a title="See guide" href="%s/help/admin/bibrank-admin-guide#mi">?</a>]</small>""" % CFG_SITE_URL, datalist=[output, '']) def perform_modifycollection(rnkID='', ln=CFG_SITE_LANG, func='', colID='', confirm=0): """Modify which collections the rank method is visible to""" output = "" subtitle = "" if rnkID: rnkNAME = get_def_name(rnkID, "rnkMETHOD")[0][1] if func in ["0", 0] and confirm in ["1", 1]: finresult = attach_col_rnk(rnkID, colID) elif func in ["1", 1] and confirm in ["1", 1]: finresult = detach_col_rnk(rnkID, colID) if colID: colNAME = get_def_name(colID, "collection")[0][1] subtitle = """Step 1 - Select collection to enable/disable rank method '%s' for""" % rnkNAME output = """ <dl> <dt>The rank method is currently enabled for these collections:</dt> <dd> """ col_list = get_rnk_col(rnkID, ln) if not col_list: output += """No collections""" else: for (id, name) in col_list: output += """%s, """ % name output += """</dd> </dl> """ col_list = 
get_def_name('', "collection") col_rnk = dict(get_rnk_col(rnkID)) col_list = filter(lambda x: not col_rnk.has_key(x[0]), col_list) if col_list: text = """ <span class="adminlabel">Enable for:</span> <select name="colID" class="admin_w200"> <option value="">- select collection -</option> """ for (id, name) in col_list: text += """<option value="%s" %s>%s</option>""" % (id, (func in ["0", 0] and confirm in ["0", 0] and colID and int(colID) == int(id)) and 'selected="selected"' or '' , name) text += """</select>""" output += createhiddenform(action="modifycollection", text=text, button="Enable", rnkID=rnkID, ln=ln, func=0, confirm=1) if confirm in ["0", 0] and func in ["0", 0] and colID: subtitle = "Step 2 - Confirm to enable rank method for the chosen collection" text = "<b><p>Please confirm to enable rank method '%s' for the collection '%s'</p></b>" % (rnkNAME, colNAME) output += createhiddenform(action="modifycollection", text=text, button="Confirm", rnkID=rnkID, ln=ln, colID=colID, func=0, confirm=1) elif confirm in ["1", 1] and func in ["0", 0] and colID: subtitle = "Step 3 - Result" output += write_outcome(finresult) elif confirm not in ["0", 0] and func in ["0", 0]: output += """<b><span class="info">Please select a collection.</span></b>""" col_list = get_rnk_col(rnkID, ln) if col_list: text = """ <span class="adminlabel">Disable for:</span> <select name="colID" class="admin_w200"> <option value="">- select collection -</option> """ for (id, name) in col_list: text += """<option value="%s" %s>%s</option>""" % (id, (func in ["1", 1] and confirm in ["0", 0] and colID and int(colID) == int(id)) and 'selected="selected"' or '' , name) text += """</select>""" output += createhiddenform(action="modifycollection", text=text, button="Disable", rnkID=rnkID, ln=ln, func=1, confirm=1) if confirm in ["1", 1] and func in ["1", 1] and colID: subtitle = "Step 3 - Result" output += write_outcome(finresult) elif confirm not in ["0", 0] and func in ["1", 1]: output += 
"""<b><span class="info">Please select a collection.</span></b>""" body = [output] return addadminbox(subtitle + """ <small>[<a title="See guide" href="%s/help/admin/bibrank-admin-guide#mc">?</a>]</small>""" % CFG_SITE_URL, body) def perform_modifytranslations(rnkID, ln, sel_type, trans, confirm, callback='yes'): """Modify the translations of a rank method""" output = '' subtitle = '' langs = get_languages() langs.sort() if confirm in ["2", 2] and rnkID: finresult = modify_translations(rnkID, langs, sel_type, trans, "rnkMETHOD") rnk_name = get_def_name(rnkID, "rnkMETHOD")[0][1] rnk_dict = dict(get_i8n_name('', ln, get_rnk_nametypes()[0][0], "rnkMETHOD")) if rnkID and rnk_dict.has_key(int(rnkID)): rnkID = int(rnkID) subtitle = """<a name="3">3. Modify translations for rank method '%s'</a>""" % rnk_name if type(trans) is str: trans = [trans] if sel_type == '': sel_type = get_rnk_nametypes()[0][0] header = ['Language', 'Translation'] actions = [] text = """ <span class="adminlabel">Name type</span> <select name="sel_type" class="admin_w200"> """ types = get_rnk_nametypes() if len(types) > 1: for (key, value) in types: text += """<option value="%s" %s>%s""" % (key, key == sel_type and 'selected="selected"' or '', value) trans_names = get_name(rnkID, ln, key, "rnkMETHOD") if trans_names and trans_names[0][0]: text += ": %s" % trans_names[0][0] text += "</option>" text += """</select>""" output += createhiddenform(action="modifytranslations", text=text, button="Select", rnkID=rnkID, ln=ln, confirm=0) if confirm in [-1, "-1", 0, "0"]: trans = [] for key, value in langs: try: trans_names = get_name(rnkID, key, sel_type, "rnkMETHOD") trans.append(trans_names[0][0]) except StandardError, e: trans.append('') for nr in range(0,len(langs)): actions.append(["%s %s" % (langs[nr][1], (langs[nr][0]==CFG_SITE_LANG and '<small>(def)</small>' or ''))]) actions[-1].append('<input type="text" name="trans" size="30" value="%s"/>' % trans[nr]) text = tupletotable(header=header, 
tuple=actions) output += createhiddenform(action="modifytranslations", text=text, button="Modify", rnkID=rnkID, sel_type=sel_type, ln=ln, confirm=2) if sel_type and len(trans) and confirm in ["2", 2]: output += write_outcome(finresult) body = [output] return addadminbox(subtitle + """ <small>[<a title="See guide" href="%s/help/admin/bibrank-admin-guide#mt">?</a>]</small>""" % CFG_SITE_URL, body) def perform_addrankarea(rnkcode='', ln=CFG_SITE_LANG, template='', confirm=-1): """form to add a new rank method with these values:""" subtitle = 'Step 1 - Create new rank method' output = """ <dl> <dt>BibRank code:</dt> <dd>A unique code that identifies a rank method, is used when running the bibrank daemon and used to name the configuration file for the method. <br />The template files includes the necessary parameters for the chosen rank method, and only needs to be edited with the correct tags and paths. <br />For more information, please go to the <a title="See guide" href="%s/help/admin/bibrank-admin-guide">BibRank guide</a> and read the section about adding a rank method</dd> </dl> """ % CFG_SITE_URL text = """ <span class="adminlabel">BibRank code</span> <input class="admin_wvar" type="text" name="rnkcode" value="%s" /> """ % (rnkcode) text += """<br /> <span class="adminlabel">Cfg template</span> <select name="template" class="admin_w200"> <option value="">No template</option> """ templates = get_templates() for templ in templates: text += """<option value="%s" %s>%s</option>""" % (templ, template == templ and 'selected="selected"' or '', templ[9:len(templ)-4]) text += """</select>""" output += createhiddenform(action="addrankarea", text=text, button="Add rank method", ln=ln, confirm=1) if rnkcode: if confirm in ["0", 0]: subtitle = 'Step 2 - Confirm addition of rank method' text = """<b>Add rank method with BibRank code: '%s'.</b>""" % (rnkcode) if template: text += """<br /><b>Using configuration template: '%s'.</b>""" % (template) else: text += """<br 
/><b>Create empty configuration file.</b>""" output += createhiddenform(action="addrankarea", text=text, rnkcode=rnkcode, button="Confirm", template=template, confirm=1) elif confirm in ["1", 1]: rnkID = add_rnk(rnkcode) subtitle = "Step 3 - Result" if rnkID[0] == 1: rnkID = rnkID[1] text = """<b><span class="info">Added new rank method with BibRank code '%s'</span></b>""" % rnkcode try: if template: infile = open("%s/bibrank/%s" % (CFG_ETCDIR, template), 'r') indata = infile.readlines() infile.close() else: indata = () file = open("%s/bibrank/%s.cfg" % (CFG_ETCDIR, get_rnk_code(rnkID)[0][0]), 'w') for line in indata: file.write(line) file.close() if template: text += """<b><span class="info"><br />Configuration file created using '%s' as template.</span></b>""" % template else: text += """<b><span class="info"><br />Empty configuration file created.</span></b>""" except StandardError, e: text += """<b><span class="info"><br />Sorry, could not create configuration file: '%s/bibrank/%s.cfg', either because it already exists, or not enough rights to create file. <br />Please create the file in the path given.</span></b>""" % (CFG_ETCDIR, get_rnk_code(rnkID)[0][0]) else: text = """<b><span class="info">Sorry, could not add rank method, rank method with the same BibRank code probably exists.</span></b>""" output += text elif not rnkcode and confirm not in [-1, "-1"]: output += """<b><span class="info">Sorry, could not add rank method, not enough data submitted.</span></b>""" body = [output] return addadminbox(subtitle + """ <small>[<a title="See guide" href="%s/help/admin/bibrank-admin-guide#ar">?</a>]</small>""" % CFG_SITE_URL, body) def perform_modifyrank(rnkID, rnkcode='', ln=CFG_SITE_LANG, template='', cfgfile='', confirm=0): """form to modify a rank method rnkID - id of the rank method """ if not rnkID: return "No ranking method selected." if not get_rnk_code(rnkID): return "Ranking method %s does not seem to exist." 
% str(rnkID) subtitle = 'Step 1 - Please modify the wanted values below' if not rnkcode: oldcode = get_rnk_code(rnkID)[0] else: oldcode = rnkcode output = """ <dl> <dd>When changing the BibRank code of a rank method, you must also change any scheduled tasks using the old value. <br />For more information, please go to the <a title="See guide" href="%s/help/admin/bibrank-admin-guide">BibRank guide</a> and read the section about modifying a rank method's BibRank code.</dd> </dl> """ % CFG_SITE_URL text = """ <span class="adminlabel">BibRank code</span> <input class="admin_wvar" type="text" name="rnkcode" value="%s" /> <br /> """ % (oldcode) try: text += """<span class="adminlabel">Cfg file</span>""" textarea = "" if cfgfile: textarea +=cfgfile else: file = open("%s/bibrank/%s.cfg" % (CFG_ETCDIR, get_rnk_code(rnkID)[0][0])) for line in file.readlines(): textarea += line text += """<textarea class="admin_wvar" name="cfgfile" rows="15" cols="70">""" + textarea + """</textarea>""" except StandardError, e: text += """<b><span class="info">Cannot load file, either it does not exist, or not enough rights to read it: '%s/bibrank/%s.cfg'<br />Please create the file in the path given.</span></b>""" % (CFG_ETCDIR, get_rnk_code(rnkID)[0][0]) output += createhiddenform(action="modifyrank", text=text, rnkID=rnkID, button="Modify", confirm=1) if rnkcode and confirm in ["1", 1] and get_rnk_code(rnkID)[0][0] != rnkcode: oldcode = get_rnk_code(rnkID)[0][0] result = modify_rnk(rnkID, rnkcode) subtitle = "Step 3 - Result" if result: text = """<b><span class="info">Rank method modified.</span></b>""" try: file = open("%s/bibrank/%s.cfg" % (CFG_ETCDIR, oldcode), 'r') file2 = open("%s/bibrank/%s.cfg" % (CFG_ETCDIR, rnkcode), 'w') lines = file.readlines() for line in lines: file2.write(line) file.close() file2.close() os.remove("%s/bibrank/%s.cfg" % (CFG_ETCDIR, oldcode)) except StandardError, e: text = """<b><span class="info">Sorry, could not change name of cfg file, must be done 
manually: '%s/bibrank/%s.cfg'</span></b>""" % (CFG_ETCDIR, oldcode) else: text = """<b><span class="info">Sorry, could not modify rank method.</span></b>""" output += text if cfgfile and confirm in ["1", 1]: try: file = open("%s/bibrank/%s.cfg" % (CFG_ETCDIR, get_rnk_code(rnkID)[0][0]), 'w') file.write(cfgfile) file.close() text = """<b><span class="info"><br />Configuration file modified: '%s/bibrank/%s.cfg'</span></b>""" % (CFG_ETCDIR, get_rnk_code(rnkID)[0][0]) except StandardError, e: text = """<b><span class="info"><br />Sorry, could not modify configuration file, please check for rights to do so: '%s/bibrank/%s.cfg'<br />Please modify the file manually.</span></b>""" % (CFG_ETCDIR, get_rnk_code(rnkID)[0][0]) output += text finoutput = addadminbox(subtitle + """ <small>[<a title="See guide" href="%s/help/admin/bibrank-admin-guide#mr">?</a>]</small>""" % CFG_SITE_URL, [output]) output = "" text = """ <span class="adminlabel">Select</span> <select name="template" class="admin_w200"> <option value="">- select template -</option> """ templates = get_templates() for templ in templates: text += """<option value="%s" %s>%s</option>""" % (templ, template == templ and 'selected="selected"' or '', templ[9:len(templ)-4]) text += """</select><br />""" output += createhiddenform(action="modifyrank", text=text, rnkID=rnkID, button="Show template", confirm=0) try: if template: textarea = "" text = """<span class="adminlabel">Content:</span>""" file = open("%s/bibrank/%s" % (CFG_ETCDIR, template), 'r') lines = file.readlines() for line in lines: textarea += line file.close() text += """<textarea class="admin_wvar" readonly="true" rows="15" cols="70">""" + textarea + """</textarea>""" output += text except StandardError, e: output += """Cannot load file, either it does not exist, or not enough rights to read it: '%s/bibrank/%s'""" % (CFG_ETCDIR, template) finoutput += addadminbox("View templates", [output]) return finoutput def perform_deleterank(rnkID, ln=CFG_SITE_LANG, 
confirm=0): """form to delete a rank method """ subtitle ='' output = """ <span class="warning"> <dl> <dt><strong>WARNING:</strong></dt> <dd><strong>When deleting a rank method, you also delete all data related to the rank method, like translations, which collections it was attached to and the data necessary to rank the search results. Any scheduled tasks using the deleted rank method will also stop working. <br /><br />For more information, please go to the <a title="See guide" href="%s/help/admin/bibrank-admin-guide">BibRank guide</a> and read the section regarding deleting a rank method.</strong></dd> </dl> </span> """ % CFG_SITE_URL if rnkID: if confirm in ["0", 0]: rnkNAME = get_def_name(rnkID, "rnkMETHOD")[0][1] subtitle = 'Step 1 - Confirm deletion' text = """Delete rank method '%s'.""" % (rnkNAME) output += createhiddenform(action="deleterank", text=text, button="Confirm", rnkID=rnkID, confirm=1) elif confirm in ["1", 1]: try: rnkNAME = get_def_name(rnkID, "rnkMETHOD")[0][1] rnkcode = get_rnk_code(rnkID)[0][0] table = "" try: config = ConfigParser.ConfigParser() config.readfp(open("%s/bibrank/%s.cfg" % (CFG_ETCDIR, rnkcode), 'r')) table = config.get(config.get('rank_method', "function"), "table") except Exception: pass result = delete_rnk(rnkID, table) subtitle = "Step 2 - Result" if result: text = """<b><span class="info">Rank method deleted</span></b>""" try: os.remove("%s/bibrank/%s.cfg" % (CFG_ETCDIR, rnkcode)) text += """<br /><b><span class="info">Configuration file deleted: '%s/bibrank/%s.cfg'.</span></b>""" % (CFG_ETCDIR, rnkcode) except StandardError, e: text += """<br /><b><span class="info">Sorry, could not delete configuration file: '%s/bibrank/%s.cfg'.</span><br />Please delete the file manually.</span></b>""" % (CFG_ETCDIR, rnkcode) else: text = """<b><span class="info">Sorry, could not delete rank method</span></b>""" except StandardError, e: text = """<b><span class="info">Sorry, could not delete rank method, most likely already
deleted</span></b>""" output = text body = [output] return addadminbox(subtitle + """ <small>[<a title="See guide" href="%s/help/admin/bibrank-admin-guide#dr">?</a>]</small>""" % CFG_SITE_URL, body) def perform_showrankdetails(rnkID, ln=CFG_SITE_LANG): """Returns details about the rank method given by rnkID""" if not rnkID: return "No ranking method selected." if not get_rnk_code(rnkID): return "Ranking method %s does not seem to exist." % str(rnkID) subtitle = """Overview <a href="%s/admin/bibrank/bibrankadmin.py/modifyrank?rnkID=%s&ln=%s">[Modify]</a>""" % (CFG_SITE_URL, rnkID, ln) text = """ BibRank code: %s<br /> Last updated by BibRank: """ % (get_rnk_code(rnkID)[0][0]) if get_rnk(rnkID)[0][2]: text += "%s<br />" % get_rnk(rnkID)[0][2] else: text += "Not yet run.<br />" output = addadminbox(subtitle, [text]) subtitle = """Rank method statistics""" text = "" try: text = "Not yet implemented" except StandardError, e: text = "BibRank not yet run, cannot show statistics for method" output += addadminbox(subtitle, [text]) subtitle = """Attached to collections <a href="%s/admin/bibrank/bibrankadmin.py/modifycollection?rnkID=%s&ln=%s">[Modify]</a>""" % (CFG_SITE_URL, rnkID, ln) text = "" col = get_rnk_col(rnkID, ln) for key, value in col: text+= "%s<br />" % value if not col: text +="No collections" output += addadminbox(subtitle, [text]) subtitle = """Translations <a href="%s/admin/bibrank/bibrankadmin.py/modifytranslations?rnkID=%s&ln=%s">[Modify]</a>""" % (CFG_SITE_URL, rnkID, ln) prev_lang = '' trans = get_translations(rnkID) types = get_rnk_nametypes() types = dict(map(lambda x: (x[0], x[1]), types)) text = "" languages = dict(get_languages()) if trans: for lang, type, name in trans: if lang and languages.has_key(lang) and type and name: if prev_lang != lang: prev_lang = lang text += """%s: <br />""" % (languages[lang]) if types.has_key(type): text+= """<span style="margin-left: 10px">'%s'</span><span class="note">(%s)</span><br />""" % (name, types[type]) else: 
text = """No translations exists""" output += addadminbox(subtitle, [text]) subtitle = """Configuration file: '%s/bibrank/%s.cfg' <a href="%s/admin/bibrank/bibrankadmin.py/modifyrank?rnkID=%s&ln=%s">[Modify]</a>""" % (CFG_ETCDIR, get_rnk_code(rnkID)[0][0], CFG_SITE_URL, rnkID, ln) text = "" try: file = open("%s/bibrank/%s.cfg" % (CFG_ETCDIR, get_rnk_code(rnkID)[0][0])) text += """<pre>""" for line in file.readlines(): text += line text += """</pre>""" except StandardError, e: text = """Cannot load file, either it does not exist, or not enough rights to read it.""" output += addadminbox(subtitle, [text]) return output def compare_on_val(second, first): return cmp(second[1], first[1]) def get_rnk_code(rnkID): """Returns the name from rnkMETHOD based on argument rnkID - id from rnkMETHOD""" try: res = run_sql("SELECT name FROM rnkMETHOD where id=%s" % (rnkID)) return res except StandardError, e: return () def get_rnk(rnkID=''): """Return one or all rank methods rnkID - return the rank method given, or all if not given""" try: if rnkID: res = run_sql("SELECT id,name,DATE_FORMAT(last_updated, '%%Y-%%m-%%d %%H:%%i:%%s') from rnkMETHOD WHERE id=%s" % rnkID) else: res = run_sql("SELECT id,name,DATE_FORMAT(last_updated, '%%Y-%%m-%%d %%H:%%i:%%s') from rnkMETHOD") return res except StandardError, e: return () def get_translations(rnkID): """Returns the translations in rnkMETHODNAME for a rankmethod rnkID - the id of the rankmethod from rnkMETHOD """ try: res = run_sql("SELECT ln, type, value FROM rnkMETHODNAME where id_rnkMETHOD=%s ORDER BY ln,type" % (rnkID)) return res except StandardError, e: return () def get_rnk_nametypes(): """Return a list of the various translationnames for the rank methods""" type = [] type.append(('ln', 'Long name')) #type.append(('sn', 'Short name')) return type def get_col_nametypes(): """Return a list of the various translationnames for the rank methods""" type = [] type.append(('ln', 'Long name')) return type def get_rnk_col(rnkID, 
ln=CFG_SITE_LANG): """ Returns a list of the collections the given rank method is attached to rnkID - id from rnkMETHOD""" try: res1 = dict(run_sql("SELECT id_collection, '' FROM collection_rnkMETHOD WHERE id_rnkMETHOD=%s" % rnkID)) res2 = get_def_name('', "collection") result = filter(lambda x: res1.has_key(x[0]), res2) return result except StandardError, e: return () def get_templates(): """Read CFG_ETCDIR/bibrank and returns a list of all files with 'template' """ templates = [] files = os.listdir(CFG_ETCDIR + "/bibrank/") for file in files: if str.find(file,"template_") != -1: templates.append(file) return templates def attach_col_rnk(rnkID, colID): """attach rank method to collection rnkID - id from rnkMETHOD table colID - id of collection, as in collection table """ try: res = run_sql("INSERT INTO collection_rnkMETHOD(id_collection, id_rnkMETHOD) values (%s,%s)" % (colID, rnkID)) return (1, "") except StandardError, e: return (0, e) def detach_col_rnk(rnkID, colID): """detach rank method from collection rnkID - id from rnkMETHOD table colID - id of collection, as in collection table """ try: res = run_sql("DELETE FROM collection_rnkMETHOD WHERE id_collection=%s AND id_rnkMETHOD=%s" % (colID, rnkID)) return (1, "") except StandardError, e: return (0, e) def delete_rnk(rnkID, table=""): """Deletes all data for the given rank method rnkID - delete all data in the tables associated with ranking and this id """ try: res = run_sql("DELETE FROM rnkMETHOD WHERE id=%s" % rnkID) res = run_sql("DELETE FROM rnkMETHODNAME WHERE id_rnkMETHOD=%s" % rnkID) res = run_sql("DELETE FROM collection_rnkMETHOD WHERE id_rnkMETHOD=%s" % rnkID) res = run_sql("DELETE FROM rnkMETHODDATA WHERE id_rnkMETHOD=%s" % rnkID) if table: res = run_sql("truncate %s" % table) res = run_sql("truncate %sR" % table[:-1]) return (1, "") except StandardError, e: return (0, e) def modify_rnk(rnkID, rnkcode): """change the code for the rank method given rnkID - change in rnkMETHOD where id is like this 
rnkcode - new value for field 'name' in rnkMETHOD """ try: res = run_sql("UPDATE rnkMETHOD set name=%s WHERE id=%s", (rnkcode, rnkID)) return (1, "") except StandardError, e: return (0, e) def add_rnk(rnkcode): """Adds a new rank method to rnkMETHOD rnkcode - the "code" for the rank method, to be used by bibrank daemon """ try: res = run_sql("INSERT INTO rnkMETHOD (name) VALUES (%s)", (rnkcode,)) res = run_sql("SELECT id FROM rnkMETHOD WHERE name=%s", (rnkcode,)) if res: return (1, res[0][0]) else: raise StandardError except StandardError, e: return (0, e) def addadminbox(header='', datalist=[], cls="admin_wvar"): """used to create table around main data on a page, row based. header - header on top of the table datalist - list of the data to be added row by row cls - possible to select which CSS class to format the look of the table.""" if len(datalist) == 1: per = '100' else: per = '75' output = '<table class="%s" ' % (cls, ) + 'width="95%">\n' output += """ <thead> <tr> <th class="adminheaderleft" colspan="%s">%s</th> </tr> </thead> <tbody> """ % (len(datalist), header) output += ' <tr>\n' output += """ <td style="vertical-align: top; margin-top: 5px; width: %s;"> %s </td> """ % (per+'%', datalist[0]) if len(datalist) > 1: output += """ <td style="vertical-align: top; margin-top: 5px; width: %s;"> %s </td> """ % ('25%', datalist[1]) output += ' </tr>\n' output += """ </tbody> </table> """ return output def tupletotable(header=[], tuple=[], start='', end='', extracolumn=''): """create html table for a tuple. header - optional header for the columns tuple - create table of this start - text to be added in the beginning, most likely beginning of a form end - text to be added in the end, most likely end of a form. extracolumn - mainly used to put in a button.
""" # study first row in tuple for alignment align = [] try: firstrow = tuple[0] if type(firstrow) in [int, long]: align = ['admintdright'] elif type(firstrow) in [str, dict]: align = ['admintdleft'] else: for item in firstrow: if type(item) is int: align.append('admintdright') else: align.append('admintdleft') except IndexError: firstrow = [] tblstr = '' for h in header + ['']: tblstr += ' <th class="adminheader">%s</th>\n' % (h, ) if tblstr: tblstr = ' <tr>\n%s\n </tr>\n' % (tblstr, ) tblstr = start + '<table class="admin_wvar_nomargin">\n' + tblstr # extra column try: extra = '<tr>' if type(firstrow) not in [int, long, str, dict]: # for data in firstrow: extra += '<td class="%s">%s</td>\n' % ('admintd', data) for i in range(len(firstrow)): extra += '<td class="%s">%s</td>\n' % (align[i], firstrow[i]) else: extra += ' <td class="%s">%s</td>\n' % (align[0], firstrow) extra += '<td rowspan="%s" style="vertical-align: top">\n%s\n</td>\n</tr>\n' % (len(tuple), extracolumn) except IndexError: extra = '' tblstr += extra # for i in range(1, len(tuple)): for row in tuple[1:]: tblstr += ' <tr>\n' # row = tuple[i] if type(row) not in [int, long, str, dict]: # for data in row: tblstr += '<td class="admintd">%s</td>\n' % (data,) for i in range(len(row)): tblstr += '<td class="%s">%s</td>\n' % (align[i], row[i]) else: tblstr += ' <td class="%s">%s</td>\n' % (align[0], row) tblstr += ' </tr> \n' tblstr += '</table> \n ' tblstr += end return tblstr def tupletotable_onlyselected(header=[], tuple=[], selected=[], start='', end='', extracolumn=''): """create html table for a tuple. 
header      - optional header for the columns
    tuple       - create table of this
    selected    - indexes of selected rows in the tuple
    start       - put this in the beginning
    end         - put this in the end
    extracolumn - mainly used to put in a button"""
    tuple2 = []
    for index in selected:
        tuple2.append(tuple[int(index) - 1])
    return tupletotable(header=header, tuple=tuple2, start=start,
                        end=end, extracolumn=extracolumn)

def addcheckboxes(datalist=[], name='authids', startindex=1, checked=[]):
    """Adds checkboxes in front of the listdata.

    datalist   - add checkboxes in front of this list
    name       - name of all the checkboxes, values will be associated with this name
    startindex - usually 1 because of the header
    checked    - values of checkboxes to be pre-checked
    """
    if not type(checked) is list:
        checked = [checked]
    for row in datalist:
        if 1 or row[0] not in [-1, "-1", 0, "0"]:  # always box, check another place
            chkstr = str(startindex) in checked and 'checked="checked"' or ''
            row.insert(0, '<input type="checkbox" name="%s" value="%s" %s />' % (name, startindex, chkstr))
        else:
            row.insert(0, '')
        startindex += 1
    return datalist

def createhiddenform(action="", text="", button="confirm", cnfrm='', **hidden):
    """Create a form with hidden values and a submit button.

    action   - name of the action to perform on submit
    text     - additional text, can also be used to add non-hidden input
    button   - value/caption on the submit button
    cnfrm    - if given, a checkbox must be checked to confirm
    **hidden - dictionary with name=value pairs for hidden input
    """
    output = '<form action="%s" method="post">\n' % (action, )
    output += '<table>\n<tr><td style="vertical-align: top">'
    output += text
    if cnfrm:
        output += ' <input type="checkbox" name="confirm" value="1"/>'
    for key in hidden.keys():
        if type(hidden[key]) is list:
            for value in hidden[key]:
                output += ' <input type="hidden" name="%s" value="%s"/>\n' % (key, value)
        else:
            output += ' <input type="hidden" name="%s" value="%s"/>\n' % (key, hidden[key])
    output += '</td><td
style="vertical-align: bottom">'
    output += ' <input class="adminbutton" type="submit" value="%s"/>\n' % (button, )
    output += '</td></tr></table>'
    output += '</form>\n'
    return output

def get_languages():
    languages = []
    for (lang, lang_namelong) in language_list_long():
        languages.append((lang, lang_namelong))
    languages.sort()
    return languages

def get_def_name(ID, table):
    """Returns a list of (id, name) tuples from the given table,
    either for the given ID only or for all rows.

    ID    - id of the row wanted, or empty for all rows
    table - the table to read the names from"""
    name = "name"
    if table[-1:].isupper():
        name = "NAME"
    try:
        if ID:
            res = run_sql("SELECT id,name FROM %s where id=%s" % (table, ID))
        else:
            res = run_sql("SELECT id,name FROM %s" % table)
        res = list(res)
        res.sort(compare_on_val)
        return res
    except StandardError, e:
        return []

def get_i8n_name(ID, ln, rtype, table):
    """Returns a list of the names, either with the name in the current
    language, the default language, or just the name from the given table.

    ln    - a language supported by CDS Invenio
    rtype - the type of value wanted, like 'ln', 'sn'"""
    name = "name"
    if table[-1:].isupper():
        name = "NAME"
    try:
        res = ""
        if ID:
            res = run_sql("SELECT id_%s,value FROM %s%s where type='%s' and ln='%s' and id_%s=%s" % (table, table, name, rtype, ln, table, ID))
        else:
            res = run_sql("SELECT id_%s,value FROM %s%s where type='%s' and ln='%s'" % (table, table, name, rtype, ln))
        if ln != CFG_SITE_LANG:
            if ID:
                res1 = run_sql("SELECT id_%s,value FROM %s%s WHERE ln='%s' and type='%s' and id_%s=%s" % (table, table, name, CFG_SITE_LANG, rtype, table, ID))
            else:
                res1 = run_sql("SELECT id_%s,value FROM %s%s WHERE ln='%s' and type='%s'" % (table, table, name, CFG_SITE_LANG, rtype))
            res2 = dict(res)
            result = filter(lambda x: not res2.has_key(x[0]), res1)
            res = res + result
        if ID:
            res1 = run_sql("SELECT id,name FROM %s where id=%s" % (table, ID))
        else:
            res1 = run_sql("SELECT id,name FROM %s" % table)
        res2 = dict(res)
        result = filter(lambda x: not res2.has_key(x[0]), res1)
        res = res + result
        res = list(res)
        res.sort(compare_on_val)
        return res
    except StandardError, e:
        raise StandardError

def get_name(ID, ln, rtype, table):
    """Returns the value from the table <table>name based on the arguments.

    ID    - id
    ln    - a language supported by CDS Invenio
    rtype - the type of value wanted, like 'ln', 'sn'
    table - table name"""
    name = "name"
    if table[-1:].isupper():
        name = "NAME"
    try:
        res = run_sql("SELECT value FROM %s%s WHERE type='%s' and ln='%s' and id_%s=%s" % (table, name, rtype, ln, table, ID))
        return res
    except StandardError, e:
        return ()

def modify_translations(ID, langs, sel_type, trans, table):
    """Add or modify translations in the tables given by table.

    ID       - the id of the row to be changed
    sel_type - the name type
    langs    - the languages
    trans    - the translations, in the same order as in langs
    table    - the table"""
    name = "name"
    if table[-1:].isupper():
        name = "NAME"
    try:
        for nr in range(0, len(langs)):
            res = run_sql("SELECT value FROM %s%s WHERE id_%s=%%s AND type=%%s AND ln=%%s" % (table, name, table),
                          (ID, sel_type, langs[nr][0]))
            if res:
                if trans[nr]:
                    res = run_sql("UPDATE %s%s SET value=%%s WHERE id_%s=%%s AND type=%%s AND ln=%%s" % (table, name, table),
                                  (trans[nr], ID, sel_type, langs[nr][0]))
                else:
                    res = run_sql("DELETE FROM %s%s WHERE id_%s=%%s AND type=%%s AND ln=%%s" % (table, name, table),
                                  (ID, sel_type, langs[nr][0]))
            else:
                if trans[nr]:
                    res = run_sql("INSERT INTO %s%s (id_%s, type, ln, value) VALUES (%%s,%%s,%%s,%%s)" % (table, name, table),
                                  (ID, sel_type, langs[nr][0], trans[nr]))
        return (1, "")
    except StandardError, e:
        return (0, e)

def write_outcome(res):
    """Write the outcome of an update of some settings.

    Parameter 'res' is a tuple (int, str), where 'int' is 0 when there
    is an error to display, and 1 when everything went fine. 'str' is a
    message displayed when there is an error.
""" if res and res[0] == 1: return """<b><span class="info">Operation successfully completed.</span></b>""" elif res: return """<b><span class="info">Operation failed. Reason:</span></b><br />%s""" % res[1] diff --git a/modules/miscutil/lib/errorlib.py b/modules/miscutil/lib/errorlib.py index ebe459b64..ece38ef32 100644 --- a/modules/miscutil/lib/errorlib.py +++ b/modules/miscutil/lib/errorlib.py @@ -1,522 +1,522 @@ # -*- coding: utf-8 -*- ## ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
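The first hunk's tupletotable() derives per-column alignment from the first row: numeric cells get the right-aligned admintdright class, everything else gets admintdleft, and that alignment list is reused for every following row. A standalone sketch of the rule (the helper names here are illustrative, not part of Invenio):

```python
def column_alignments(row):
    # Mirror tupletotable(): integers align right, other types align left.
    return ['admintdright' if isinstance(cell, int) else 'admintdleft'
            for cell in row]

def render_row(row):
    # Render one <tr>, applying the alignment class chosen per cell.
    cells = ''.join('<td class="%s">%s</td>' % (cls, cell)
                    for cls, cell in zip(column_alignments(row), row))
    return '<tr>%s</tr>' % cells
```

Computing the alignment once from the first row keeps mixed-type columns visually stable across the whole table.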
"""
Error handling library
"""

__revision__ = "$Id$"

import traceback
import os
import sys
import time
from cStringIO import StringIO

from invenio.config import CFG_SITE_LANG, CFG_LOGDIR, CFG_WEBALERT_ALERT_ENGINE_EMAIL, CFG_SITE_ADMIN_EMAIL, CFG_SITE_SUPPORT_EMAIL, CFG_SITE_NAME, CFG_SITE_URL, CFG_VERSION, CFG_CERN_SITE, CFG_SITE_EMERGENCY_PHONE_NUMBERS
from invenio.miscutil_config import CFG_MISCUTIL_ERROR_MESSAGES
from invenio.urlutils import wash_url_argument
from invenio.messages import wash_language, gettext_set_language
from invenio.dateutils import convert_datestruct_to_datetext

def get_client_info(req):
    """
    Returns a dictionary with client information
    @param req: mod_python request
    """
    try:
        return {
            'host' : req.hostname,
            'url' : req.unparsed_uri,
            'time' : convert_datestruct_to_datetext(time.localtime()),
            'browser' : req.headers_in.has_key('User-Agent') and req.headers_in['User-Agent'] or "N/A",
            'client_ip' : req.remote_ip
        }
    except:
        return {}

def get_pretty_wide_client_info(req):
    """Return in a pretty way all the available information about the
    current user/client"""
    if req:
        from invenio.webuser import collect_user_info
        user_info = collect_user_info(req)
        keys = user_info.keys()
        keys.sort()
        max_key = max([len(key) for key in keys])
        ret = ""
        fmt = "%% %is: %%s\n" % max_key
        for key in keys:
            if key in ('uri', 'referer'):
                ret += fmt % (key, "<%s>" % user_info[key])
            else:
                ret += fmt % (key, user_info[key])
        if ret.endswith('\n'):
            return ret[:-1]
        else:
            return ret
    else:
        return "No client information available"

def get_tracestack():
    """
    If an exception has been caught, return the system tracestack, or
    else return the tracestack of what is currently in the stack
    """
    if traceback.format_tb(sys.exc_info()[2]):
        delimiter = "\n"
        tracestack_pretty = "Traceback: \n%s" % delimiter.join(traceback.format_tb(sys.exc_info()[2]))
    else:
        tracestack = traceback.extract_stack()[:-1]  # force traceback except for this call
        tracestack_pretty = "%sForced traceback (most recent call last)" % (' ' * 4, )
        for trace_tuple in tracestack:
            tracestack_pretty += """
  File "%(file)s", line %(line)s, in %(function)s
        %(text)s""" % {
                'file' : trace_tuple[0],
                'line' : trace_tuple[1],
                'function' : trace_tuple[2],
                'text' : trace_tuple[3] is not None and str(trace_tuple[3]) or ""
            }
    return tracestack_pretty

def send_sms(phone_number, msg):
    """Send MSG as an SMS to the given phone number.

    Note: this function is just an example and works only at CERN;
    it should be reimplemented for your own institution.
    """
    if not CFG_CERN_SITE:
        raise NotImplementedError, "Implement this function with your own method"
    if phone_number[0] == '+':
        phone_number = '00' + phone_number[1:]
    if phone_number[0] != '0':
        phone_number = '00' + phone_number
    from invenio.mailutils import send_email
    return send_email(CFG_SITE_SUPPORT_EMAIL, phone_number + '@sms.switch.ch', '', msg, header='', footer='')

def register_emergency(msg, send_sms_function=send_sms):
    """Launch an emergency: send an SMS message to each phone number in
    CFG_SITE_EMERGENCY_PHONE_NUMBERS."""
    for phone_number in CFG_SITE_EMERGENCY_PHONE_NUMBERS:
        send_sms_function(phone_number, msg)

def register_exception(force_stack=False, stream='error', req=None, prefix='', suffix='', alert_admin=False, subject=''):
    """
    Log error exception to invenio.err and warning exception to
    invenio.log. Errors will be logged together with client information
    (if req is given).

    Note: For sanity reasons, dynamic params such as PREFIX, SUFFIX and
    local stack variables are checked for length, and only the first 500
    chars of their values are printed.
@param force_stack: when True the stack is always printed, while when
        False the stack is printed only when the Exception type does not
        contain the word Invenio
    @param stream: 'error' or 'warning'
    @param req: mod_python request
    @param prefix: a message to be printed before the exception in the log
    @param suffix: a message to be printed after the exception in the log
    @param alert_admin: whether to send the exception to the administrator via email
    @param subject: overrides the email subject
    @return: 1 if successfully wrote to stream, 0 if not
    """

    def _truncate_dynamic_string(val, maxlength=500):
        """
        Return at most MAXLENGTH characters of VAL. Useful for
        sanitizing dynamic variable values in the output.
        """
        out = str(val)
        if len(out) > maxlength:
            out = out[:maxlength] + ' [...]'
        return out

    def _get_filename_and_line(exc_info):
        """
        Return the filename and the line where the exception happened.
        """
        tb = exc_info[2]
        exception_info = traceback.extract_tb(tb, 1)[0]
        filename = os.path.basename(exception_info[0])
        line_no = exception_info[1]
        return filename, line_no

    try:
        ## Let's extract exception information
        exc_info = sys.exc_info()
        if exc_info[0]:
            ## We found an exception.
            ## We want to extract the name of the Exception
            exc_name = exc_info[0].__name__
            exc_value = str(exc_info[1])
            ## Let's record when and where and what
            www_data = "%(time)s -> %(name)s: %(value)s" % {
                'time' : time.strftime("%Y-%m-%d %H:%M:%S"),
                'name' : exc_name,
                'value' : exc_value
            }
            ## Let's retrieve contextual user related info, if any
            try:
                client_data = get_pretty_wide_client_info(req)
            except Exception, e:
                client_data = "Error in retrieving contextual information: %s" % e
            ## Let's extract the traceback:
            if not exc_name.startswith('Invenio') or force_stack:
                ## We put a large traceback only if requested
                ## or the Exception is not an Invenio one.
tracestack = traceback.extract_stack()[-5:-2] tracestack_data = "Forced traceback (most recent call last):" for trace_tuple in tracestack: tracestack_data += """ File "%(file)s", line %(line)s, in %(function)s %(text)s""" % \ { 'file' : trace_tuple[0], 'line' : trace_tuple[1], 'function' : trace_tuple[2], 'text' : trace_tuple[3] is not None and str(trace_tuple[3]) or "" } else: tracestack_data = "" ## Let's get the exception (and the traceback): exception_data = StringIO() traceback.print_exception(exc_info[0], exc_info[1], exc_info[2], None, exception_data) exception_data = exception_data.getvalue() if exception_data.endswith('\n'): exception_data = exception_data[:-1] ## Let's get the values of local variables on the stack: localvars_data = "\nLocal variables:\n" localvars = sys.exc_info()[2].tb_frame.f_locals.keys() localvars.sort() for localvar in localvars: localvars_data += " %s = %s\n" % \ (localvar, _truncate_dynamic_string(repr(sys.exc_info()[2].tb_frame.f_locals[localvar]))) if localvars_data.endswith('\n'): localvars_data = localvars_data[:-1] ## Okay, start printing: log_stream = StringIO() email_stream = StringIO() print >> email_stream, '\n', ## If a prefix was requested let's print it if prefix: prefix = _truncate_dynamic_string(prefix) print >> log_stream, prefix + '\n' print >> email_stream, prefix + '\n' print >> email_stream, "The following problem occurred on <%s> (CDS Invenio %s)" % (CFG_SITE_URL, CFG_VERSION) print >> email_stream, "\n>>> Registered exception\n" print >> log_stream, www_data print >> email_stream, www_data print >> email_stream, "\n>>> User details\n" print >> log_stream, client_data print >> email_stream, client_data print >> email_stream, "\n>>> Traceback details\n" if tracestack_data: print >> log_stream, tracestack_data print >> email_stream, tracestack_data print >> log_stream, exception_data print >> email_stream, exception_data print >> log_stream, localvars_data print >> email_stream, localvars_data ## If a suffix was 
requested let's print it if suffix: suffix = _truncate_dynamic_string(suffix) print >> log_stream, suffix print >> email_stream, suffix log_text = log_stream.getvalue() email_text = email_stream.getvalue() if email_text.endswith('\n'): email_text = email_text[:-1] ## Preparing the exception dump stream = stream=='error' and 'err' or 'log' ## We now have the whole trace written_to_log = False try: ## Let's try to write into the log. open(os.path.join(CFG_LOGDIR, 'invenio.' + stream), 'a').write(log_text) written_to_log = True finally: if alert_admin or not written_to_log: ## If requested or if it's impossible to write in the log from invenio.mailutils import send_email if not subject: filename, line_no = _get_filename_and_line(exc_info) subject = 'Exception (%s:%s)' % (filename, line_no) subject = '%s at %s' % (subject, CFG_SITE_URL) send_email(CFG_SITE_ADMIN_EMAIL, CFG_SITE_ADMIN_EMAIL, subject=subject, content=email_text) return 1 else: return 0 except Exception, e: print >> sys.stderr, "Error in registering exception to '%s': '%s'" % (CFG_LOGDIR + '/invenio.' 
+ stream, e) return 0 def register_errors(errors_or_warnings_list, stream, req=None): """ log errors to invenio.err and warnings to invenio.log errors will be logged with client information (if req is given) and a tracestack warnings will be logged with just the warning message @param errors_or_warnings_list: list of tuples (err_name, err_msg) err_name = ERR_ + %(module_directory_name)s + _ + %(error_name)s #ALL CAPS err_name must be stored in file: module_directory_name + _config.py as the key for dict with name: CFG_ + %(module_directory_name)s + _ERROR_MESSAGES @param stream: 'error' or 'warning' @param req: mod_python request @return: tuple integer 1 if successfully wrote to stream, integer 0 if not will append another error to errors_list if unsuccessful """ client_info_dict = "" if stream == "error": # call the stack trace now tracestack_pretty = get_tracestack() # if req is given, get client info if req: client_info_dict = get_client_info(req) if client_info_dict: client_info = \ '''URL: http://%(host)s%(url)s Browser: %(browser)s Client: %(client_ip)s''' % client_info_dict else: client_info = "No client information available" else: client_info = "No client information available" # check arguments errors_or_warnings_list = wash_url_argument(errors_or_warnings_list, 'list') stream = wash_url_argument(stream, 'str') for etuple in errors_or_warnings_list: etuple = wash_url_argument(etuple, 'tuple') # check stream arg for presence of [error,warning]; when none, add error and default to warning if stream == 'error': stream = 'err' elif stream == 'warning': stream = 'log' else: stream = 'log' error = 'ERR_MISCUTIL_BAD_FILE_ARGUMENT_PASSED' errors_or_warnings_list.append((error, eval(CFG_MISCUTIL_ERROR_MESSAGES[error])% stream)) # update log_errors stream_location = os.path.join(CFG_LOGDIR, 'invenio.' 
+ stream)
    errors = ''
    for etuple in errors_or_warnings_list:
        try:
            errors += "%s%s : %s \n " % (' ' * 4 * 7 + ' ', etuple[0], etuple[1])
        except:
            errors += "%s%s \n " % (' ' * 4 * 7 + ' ', etuple)
    if errors:
        errors = errors[(4 * 7 + 1):-3]  # get rid of beginning spaces and last '\n'
    msg = """
%(time)s --> %(errors)s%(error_file)s""" % {
        'time' : client_info_dict and client_info_dict['time'] or time.strftime("%Y-%m-%d %H:%M:%S"),
        'errors' : errors,
        'error_file' : stream == 'err' and "\n%s%s\n%s\n" % (' ' * 4, client_info, tracestack_pretty) or ""
    }
    try:
        stream_to_write = open(stream_location, 'a+')
        stream_to_write.writelines(msg)
        stream_to_write.close()
        return_value = 1
    except:
        error = 'ERR_MISCUTIL_WRITE_FAILED'
        errors_or_warnings_list.append((error, CFG_MISCUTIL_ERROR_MESSAGES[error] % stream_location))
        return_value = 0
    return return_value

def get_msg_associated_to_code(err_code, stream='error'):
    """
    Returns the message string associated with a given code.

    @param err_code: error or warning code
    @param stream: 'error' or 'warning'
    @return: tuple (err_code, formatted_message)
    """
    err_code = wash_url_argument(err_code, 'str')
    stream = wash_url_argument(stream, 'str')
    try:
        module_directory_name = err_code.split('_')[1].lower()
        module_config = module_directory_name + '_config'
        module_dict_name = "CFG_" + module_directory_name.upper() + "_%s_MESSAGES" % stream.upper()
        module = __import__(module_config, globals(), locals(), [module_dict_name])
        module_dict = getattr(module, module_dict_name)
        err_msg = module_dict[err_code]
    except ImportError:
        error = 'ERR_MISCUTIL_IMPORT_ERROR'
        err_msg = CFG_MISCUTIL_ERROR_MESSAGES[error] % (err_code, module_config)
        err_code = error
    except AttributeError:
        error = 'ERR_MISCUTIL_NO_DICT'
        err_msg = CFG_MISCUTIL_ERROR_MESSAGES[error] % (err_code, module_config, module_dict_name)
        err_code = error
    except KeyError:
        error = 'ERR_MISCUTIL_NO_MESSAGE_IN_DICT'
        err_msg = CFG_MISCUTIL_ERROR_MESSAGES[error] % (err_code, module_config + '.' + module_dict_name)
        err_code = error
    except:
        error = 'ERR_MISCUTIL_UNDEFINED_ERROR'
        err_msg = CFG_MISCUTIL_ERROR_MESSAGES[error] % err_code
        err_code = error
    return (err_code, err_msg)

def get_msgs_for_code_list(code_list, stream='error', ln=CFG_SITE_LANG):
    """
    @param code_list: list of tuples [(err_name, arg1, ..., argN), ...]

    err_name = ERR_ + %(module_directory_name)s + _ + %(error_name)s #ALL CAPS
    err_name must be stored in the file module_directory_name + _config.py
    as the key for the dict with name CFG_ + %(module_directory_name)s + _ERROR_MESSAGES.
    For warnings, the same applies, except that err_name can begin with
    either 'ERR' or 'WRN' and the dict name ends with _warning_messages.

    @param stream: 'error' or 'warning'
    @return: list of tuples of length 2 [('ERR_...', err_msg), ...]
             if code_list is empty, will return None.
             if there are errors retrieving error messages, will append an error to the list
    """
    ln = wash_language(ln)
    _ = gettext_set_language(ln)
    out = []
    if code_list is None:  # note: the original tested type(code_list) is None, which is never true
        return None
    code_list = wash_url_argument(code_list, 'list')
    stream = wash_url_argument(stream, 'str')
    for code_tuple in code_list:
        if not(type(code_tuple) is tuple):
            code_tuple = (code_tuple,)
        nb_tuple_args = len(code_tuple) - 1
        err_code = code_tuple[0]
        if stream == 'error' and not err_code.startswith('ERR'):
            error = 'ERR_MISCUTIL_NO_ERROR_MESSAGE'
            out.append((error, eval(CFG_MISCUTIL_ERROR_MESSAGES[error])))
            continue
        elif stream == 'warning' and not (err_code.startswith('ERR') or err_code.startswith('WRN')):
            error = 'ERR_MISCUTIL_NO_WARNING_MESSAGE'
            out.append((error, eval(CFG_MISCUTIL_ERROR_MESSAGES[error])))
            continue
        (new_err_code, err_msg) = get_msg_associated_to_code(err_code, stream)
        if err_msg[:2] == '_(' and err_msg[-1] == ')':
            # err_msg is internationalized
            err_msg = eval(err_msg)
        nb_msg_args = err_msg.count('%') - err_msg.count('%%')
        parsing_error = ""
        if new_err_code != err_code or nb_msg_args == 0:
            # undefined_error or immediately displayable error
            out.append((new_err_code, err_msg))
            continue
        try:
if nb_msg_args == nb_tuple_args: err_msg = err_msg % code_tuple[1:] elif nb_msg_args < nb_tuple_args: err_msg = err_msg % code_tuple[1:nb_msg_args+1] parsing_error = 'ERR_MISCUTIL_TOO_MANY_ARGUMENT' parsing_error_message = eval(CFG_MISCUTIL_ERROR_MESSAGES[parsing_error]) parsing_error_message %= code_tuple[0] elif nb_msg_args > nb_tuple_args: code_tuple = list(code_tuple) for dummy in range(nb_msg_args - nb_tuple_args): code_tuple.append('???') code_tuple = tuple(code_tuple) err_msg = err_msg % code_tuple[1:] parsing_error = 'ERR_MISCUTIL_TOO_FEW_ARGUMENT' parsing_error_message = eval(CFG_MISCUTIL_ERROR_MESSAGES[parsing_error]) parsing_error_message %= code_tuple[0] except: parsing_error = 'ERR_MISCUTIL_BAD_ARGUMENT_TYPE' parsing_error_message = eval(CFG_MISCUTIL_ERROR_MESSAGES[parsing_error]) parsing_error_message %= code_tuple[0] out.append((err_code, err_msg)) if parsing_error: out.append((parsing_error, parsing_error_message)) if not(out): out = None return out def send_error_report_to_admin(header, url, time_msg, browser, client, error, sys_error, traceback_msg): """ Sends an email to the admin with client info and tracestack """ from_addr = '%s Alert Engine <%s>' % (CFG_SITE_NAME, CFG_WEBALERT_ALERT_ENGINE_EMAIL) to_addr = CFG_SITE_ADMIN_EMAIL body = """ The following error was seen by a user and sent to you. 
%(contact)s %(header)s %(url)s %(time)s %(browser)s %(client)s %(error)s %(sys_error)s %(traceback)s Please see the %(logdir)s/invenio.err for traceback details.""" % \ { 'header' : header, 'url' : url, 'time' : time_msg, 'browser' : browser, 'client' : client, 'error' : error, 'sys_error' : sys_error, 'traceback' : traceback_msg, 'logdir' : CFG_LOGDIR, 'contact' : "Please contact %s quoting the following information:" % (CFG_SITE_SUPPORT_EMAIL,) } from invenio.mailutils import send_email send_email(from_addr, to_addr, subject="Error notification", content=body) diff --git a/modules/miscutil/lib/inveniocfg.py b/modules/miscutil/lib/inveniocfg.py index 5045bc08c..468cab156 100644 --- a/modules/miscutil/lib/inveniocfg.py +++ b/modules/miscutil/lib/inveniocfg.py @@ -1,1085 +1,1092 @@ # -*- coding: utf-8 -*- ## ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """ Invenio configuration and administration CLI tool. 
Usage: inveniocfg [options]

General options:
   -h, --help               print this help
   -V, --version            print version number

Options to finish your installation:
   --create-apache-conf     create Apache configuration files
   --create-tables          create DB tables for Invenio
   --load-webstat-conf      load the WebStat configuration
   --drop-tables            drop DB tables of Invenio

Options to set up and test a demo site:
   --create-demo-site       create demo site
   --load-demo-records      load demo records
   --remove-demo-records    remove demo records, keeping demo site
   --drop-demo-site         drop demo site configurations too
   --run-unit-tests         run unit test suite (needs demo site)
   --run-regression-tests   run regression test suite (needs demo site)
   --run-web-tests          run web tests in a browser (needs demo site, Firefox, Selenium IDE)

Options to update config files in situ:
   --update-all             perform all the update options
   --update-config-py       update config.py file from invenio.conf file
   --update-dbquery-py      update dbquery.py with DB credentials from invenio.conf
   --update-dbexec          update dbexec with DB credentials from invenio.conf
   --update-bibconvert-tpl  update bibconvert templates with CFG_SITE_URL from invenio.conf
   --update-web-tests       update web test cases with CFG_SITE_URL from invenio.conf

Options to update DB tables:
   --reset-all              perform all the reset options
   --reset-sitename         reset tables to take account of new CFG_SITE_NAME*
   --reset-siteadminemail   reset tables to take account of new CFG_SITE_ADMIN_EMAIL
   --reset-fieldnames       reset tables to take account of new I18N names from PO files
   --reset-recstruct-cache  reset record structure cache according to CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE

Options to help the work:
   --list                   print names and values of all options from conf files
   --get <some-opt>         get value of a given option from conf files
   --conf-dir </some/path>  path to directory where invenio*.conf files are [optional]
   --detect-system-details  print system details such as Apache/Python/MySQL versions
"""

__revision__ = "$Id$"

from ConfigParser import ConfigParser
import os
import re
import shutil
import socket
import sys
import zlib
import marshal

def print_usage():
    """Print help."""
    print __doc__

def print_version():
    """Print version information."""
    print __revision__

def convert_conf_option(option_name, option_value):
    """
    Convert a conf option into a Python config.py line, converting
    values to ints or strings as appropriate.
    """
    ## 1) convert option name to uppercase:
    option_name = option_name.upper()
    ## 2) convert option value to int or string:
    if option_name in ['CFG_BIBUPLOAD_REFERENCE_TAG',
                       'CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG',
                       'CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG',
                       'CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG',
                       'CFG_BIBUPLOAD_STRONG_TAGS',
                       'CFG_SITE_EMERGENCY_PHONE_NUMBERS']:
        # some options are supposed to be strings even when they look
        # numeric
        option_value = '"' + option_value + '"'
    else:
        try:
            option_value = int(option_value)
        except ValueError:
            option_value = '"' + option_value + '"'
    ## 3a) special cases: regexps
    if option_name in ['CFG_BIBINDEX_CHARS_ALPHANUMERIC_SEPARATORS',
                       'CFG_BIBINDEX_CHARS_PUNCTUATION']:
        option_value = 'r"[' + option_value[1:-1] + ']"'
    ## 3b) special cases: True, False, None
    if option_value in ['"True"', '"False"', '"None"']:
        option_value = option_value[1:-1]
    ## 3c) special cases: dicts
    if option_name in ['CFG_WEBSEARCH_FIELDS_CONVERT', ]:
        option_value = option_value[1:-1]
    ## 3d) special cases: comma-separated lists
    if option_name in ['CFG_SITE_LANGS',
                       'CFG_WEBSUBMIT_ADDITIONAL_KNOWN_FILE_EXTENSIONS',
                       'CFG_WEBSEARCH_USE_JSMATH_FOR_FORMATS',
                       'CFG_BIBUPLOAD_STRONG_TAGS',
                       'CFG_BIBSCHED_GC_TASKS_TO_REMOVE',
                       'CFG_BIBSCHED_GC_TASKS_TO_ARCHIVE',
                       'CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS',
                       'CFG_BIBUPLOAD_CONTROLLED_PROVENANCE_TAGS',
                       'CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES',
                       'CFG_SITE_EMERGENCY_PHONE_NUMBERS']:
        out = "["
        for elem in option_value[1:-1].split(","):
            if elem:
                if option_name in ['CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES']:
                    # 3d1) integer values
                    out += "%i, " % int(elem)
                else:
                    # 3d2)
string values out += "'%s', " % elem out += "]" option_value = out ## 3e) special cases: multiline if option_name == 'CFG_OAI_IDENTIFY_DESCRIPTION': # make triple quotes option_value = '""' + option_value + '""' ## 3f) ignore some options: if option_name.startswith('CFG_SITE_NAME_INTL'): # treated elsewhere return ## 4) finally, return output line: return '%s = %s' % (option_name, option_value) def cli_cmd_update_config_py(conf): """ Update new config.py from conf options, keeping previous config.py in a backup copy. """ print ">>> Going to update config.py..." ## location where config.py is: configpyfile = conf.get("Invenio", "CFG_PYLIBDIR") + \ os.sep + 'invenio' + os.sep + 'config.py' ## backup current config.py file: if os.path.exists(configpyfile): shutil.copy(configpyfile, configpyfile + '.OLD') ## here we go: fdesc = open(configpyfile, 'w') ## generate preamble: fdesc.write("# -*- coding: utf-8 -*-\n") fdesc.write("# DO NOT EDIT THIS FILE! IT WAS AUTOMATICALLY GENERATED\n") fdesc.write("# FROM INVENIO.CONF BY EXECUTING:\n") fdesc.write("# " + " ".join(sys.argv) + "\n") ## special treatment for CFG_SITE_NAME_INTL options: fdesc.write("CFG_SITE_NAME_INTL = {}\n") for lang in conf.get("Invenio", "CFG_SITE_LANGS").split(","): fdesc.write("CFG_SITE_NAME_INTL['%s'] = \"%s\"\n" % (lang, conf.get("Invenio", "CFG_SITE_NAME_INTL_" + lang))) ## special treatment for CFG_SITE_SECURE_URL that may be empty, in ## which case it should be put equal to CFG_SITE_URL: if not conf.get("Invenio", "CFG_SITE_SECURE_URL"): conf.set("Invenio", "CFG_SITE_SECURE_URL", conf.get("Invenio", "CFG_SITE_URL")) ## process all the options normally: sections = conf.sections() sections.sort() for section in sections: options = conf.options(section) options.sort() for option in options: if not option.startswith('CFG_DATABASE_'): # put all options except for db credentials into config.py line_out = convert_conf_option(option, conf.get(section, option)) if line_out: fdesc.write(line_out + "\n") ## 
FIXME: special treatment for experimental variables ## CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES and CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE ## (not offering them in invenio.conf since they will be refactored) fdesc.write("CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE = 0\n") fdesc.write("CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES = [0, 1,]\n") ## generate postamble: fdesc.write("") fdesc.write("# END OF GENERATED FILE") ## we are done: fdesc.close() print "You may want to restart Apache now." print ">>> config.py updated successfully." def cli_cmd_update_dbquery_py(conf): """ Update lib/dbquery.py file with DB parameters read from conf file. Note: this edits dbquery.py in situ, taking a backup first. Use only when you know what you are doing. """ print ">>> Going to update dbquery.py..." ## location where dbquery.py is: dbquerypyfile = conf.get("Invenio", "CFG_PYLIBDIR") + \ os.sep + 'invenio' + os.sep + 'dbquery.py' ## backup current dbquery.py file: if os.path.exists(dbquerypyfile): shutil.copy(dbquerypyfile, dbquerypyfile + '.OLD') ## replace db parameters: out = '' for line in open(dbquerypyfile, 'r').readlines(): match = re.search(r'^CFG_DATABASE_(HOST|PORT|NAME|USER|PASS)(\s*=\s*)\'.*\'$', line) if match: dbparam = 'CFG_DATABASE_' + match.group(1) out += "%s%s'%s'\n" % (dbparam, match.group(2), conf.get('Invenio', dbparam)) else: out += line fdesc = open(dbquerypyfile, 'w') fdesc.write(out) fdesc.close() print "You may want to restart Apache now." print ">>> dbquery.py updated successfully." def cli_cmd_update_dbexec(conf): """ Update bin/dbexec file with DB parameters read from conf file. Note: this edits dbexec in situ, taking a backup first. Use only when you know what you are doing. """ print ">>> Going to update dbexec..." 
    ## location where dbexec is:
    dbexecfile = conf.get("Invenio", "CFG_BINDIR") + \
                 os.sep + 'dbexec'
    ## backup current dbexec file:
    if os.path.exists(dbexecfile):
        shutil.copy(dbexecfile, dbexecfile + '.OLD')
    ## replace db parameters:
    out = ''
    for line in open(dbexecfile, 'r').readlines():
        match = re.search(r'^CFG_DATABASE_(HOST|PORT|NAME|USER|PASS)(\s*=\s*)\'.*\'$', line)
        if match:
            dbparam = 'CFG_DATABASE_' + match.group(1)
            out += "%s%s'%s'\n" % (dbparam, match.group(2),
                                   conf.get("Invenio", dbparam))
        else:
            out += line
    fdesc = open(dbexecfile, 'w')
    fdesc.write(out)
    fdesc.close()
    print ">>> dbexec updated successfully."

def cli_cmd_update_bibconvert_tpl(conf):
    """
    Update bibconvert/config/*.tpl files looking for 856
    http://.../record/ lines, replacing URL with CFG_SITE_URL taken
    from conf file.  Note: this edits tpl files in situ, taking a
    backup first.  Use only when you know what you are doing.
    """
    print ">>> Going to update bibconvert templates..."
    ## location where bibconvert/config/*.tpl are:
    tpldir = conf.get("Invenio", 'CFG_ETCDIR') + \
             os.sep + 'bibconvert' + os.sep + 'config'
    ## find all *.tpl files:
    for tplfilename in os.listdir(tpldir):
        if tplfilename.endswith(".tpl"):
            ## change tpl file:
            tplfile = tpldir + os.sep + tplfilename
            shutil.copy(tplfile, tplfile + '.OLD')
            out = ''
            for line in open(tplfile, 'r').readlines():
                match = re.search(r'^(.*)http://.*?/record/(.*)$', line)
                if match:
                    out += "%s%s/record/%s\n" % (match.group(1),
                                                 conf.get("Invenio", 'CFG_SITE_URL'),
                                                 match.group(2))
                else:
                    out += line
            fdesc = open(tplfile, 'w')
            fdesc.write(out)
            fdesc.close()
    print ">>> bibconvert templates updated successfully."

def cli_cmd_update_web_tests(conf):
    """
    Update web test cases lib/webtest/test_*.html looking for
    <td>http://.+?[</] strings and replacing them with CFG_SITE_URL
    taken from conf file.  Note: this edits test files in situ, taking
    a backup first.  Use only when you know what you are doing.
    """
    print ">>> Going to update web tests..."
## location where test_*.html files are: testdir = conf.get("Invenio", 'CFG_PREFIX') + os.sep + \ 'lib' + os.sep + 'webtest' + os.sep + 'invenio' ## find all test_*.html files: for testfilename in os.listdir(testdir): if testfilename.startswith("test_") and \ testfilename.endswith(".html"): ## change test file: testfile = testdir + os.sep + testfilename shutil.copy(testfile, testfile + '.OLD') out = '' for line in open(testfile, 'r').readlines(): match = re.search(r'^(.*<td>)http://.+?([</].*)$', line) if match: out += "%s%s%s\n" % (match.group(1), conf.get("Invenio", 'CFG_SITE_URL'), match.group(2)) else: match = re.search(r'^(.*<td>)/opt/cds-invenio(.*)$', line) if match: out += "%s%s%s\n" % (match.group(1), conf.get("Invenio", 'CFG_PREFIX'), match.group(2)) else: out += line fdesc = open(testfile, 'w') fdesc.write(out) fdesc.close() print ">>> web tests updated successfully." def cli_cmd_reset_sitename(conf): """ Reset collection-related tables with new CFG_SITE_NAME and CFG_SITE_NAME_INTL* read from conf files. """ print ">>> Going to reset CFG_SITE_NAME and CFG_SITE_NAME_INTL..." from invenio.dbquery import run_sql, IntegrityError # reset CFG_SITE_NAME: sitename = conf.get("Invenio", "CFG_SITE_NAME") try: run_sql("""INSERT INTO collection (id, name, dbquery, reclist) VALUES (1,%s,NULL,NULL)""", (sitename,)) except IntegrityError: run_sql("""UPDATE collection SET name=%s WHERE id=1""", (sitename,)) # reset CFG_SITE_NAME_INTL: for lang in conf.get("Invenio", "CFG_SITE_LANGS").split(","): sitename_lang = conf.get("Invenio", "CFG_SITE_NAME_INTL_" + lang) try: run_sql("""INSERT INTO collectionname (id_collection, ln, type, value) VALUES (%s,%s,%s,%s)""", (1, lang, 'ln', sitename_lang)) except IntegrityError: run_sql("""UPDATE collectionname SET value=%s WHERE ln=%s AND id_collection=1 AND type='ln'""", (sitename_lang, lang)) print "You may want to restart Apache now." print ">>> CFG_SITE_NAME and CFG_SITE_NAME_INTL* reset successfully." 
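cli_cmd_reset_sitename relies on an "INSERT, and on IntegrityError fall back to UPDATE" upsert idiom. A minimal standalone sketch of the same idiom, using sqlite3 in place of MySQL/run_sql and a reduced `collection` table (the site names are illustrative):

```python
import sqlite3

def upsert_collection_name(cur, name):
    """INSERT the site name as collection #1; if the row already exists
    (IntegrityError on the primary key), UPDATE it instead -- the same
    idiom cli_cmd_reset_sitename uses via run_sql()."""
    try:
        cur.execute("INSERT INTO collection (id, name) VALUES (1, ?)", (name,))
    except sqlite3.IntegrityError:
        cur.execute("UPDATE collection SET name=? WHERE id=1", (name,))

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE collection (id INTEGER PRIMARY KEY, name TEXT)")
upsert_collection_name(cur, "Atlantis Institute")      # first call inserts
upsert_collection_name(cur, "Atlantis Institute v2")   # second call updates in place
```

This keeps the reset commands idempotent: rerunning `inveniocfg --reset-sitename` never duplicates row 1, it only refreshes its value.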
def cli_cmd_reset_recstruct_cache(conf): """If CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE is changed, this function will adapt the database to either store or not store the recstruct format.""" from invenio.intbitset import intbitset from invenio.dbquery import run_sql from invenio.search_engine import get_record from invenio.bibsched import server_pid, pidfile enable_recstruct_cache = conf.get("Invenio", "CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE") enable_recstruct_cache = enable_recstruct_cache in ('True', '1') pid = server_pid(ping_the_process=False) if pid: print >> sys.stderr, "ERROR: bibsched seems to run with pid %d, according to %s." % (pid, pidfile) print >> sys.stderr, " Please stop bibsched before running this procedure." sys.exit(1) if enable_recstruct_cache: print ">>> Searching records which need recstruct cache resetting; this may take a while..." all_recids = intbitset(run_sql("SELECT id FROM bibrec")) good_recids = intbitset(run_sql("SELECT bibrec.id FROM bibrec JOIN bibfmt ON bibrec.id = bibfmt.id_bibrec WHERE format='recstruct' AND modification_date < last_updated")) recids = all_recids - good_recids print ">>> Generating recstruct cache..." tot = len(recids) count = 0 for recid in recids: value = zlib.compress(marshal.dumps(get_record(recid))) run_sql("INSERT INTO bibfmt(id_bibrec, format, last_updated, value) VALUES(%s, 'recstruct', NOW(), %s)", (recid, value)) count += 1 if count % 1000 == 0: print " ... done records %s/%s" % (count, tot) if count % 1000 != 0: print " ... done records %s/%s" % (count, tot) print ">>> recstruct cache generated successfully." else: print ">>> Cleaning recstruct cache..." run_sql("DELETE FROM bibfmt WHERE format='recstruct'") def cli_cmd_reset_siteadminemail(conf): """ Reset user-related tables with new CFG_SITE_ADMIN_EMAIL read from conf files. """ print ">>> Going to reset CFG_SITE_ADMIN_EMAIL..." 
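cli_cmd_reset_recstruct_cache stores each record in bibfmt as the zlib-compressed marshal dump of the record structure. The serialization round-trip it depends on can be checked in isolation; the record dictionary below is a made-up stand-in for `get_record(recid)` output, not real Invenio data.

```python
import marshal
import zlib

# Made-up record structure standing in for get_record(recid) output
# (tag -> list of (subfields, ind1, ind2, controlfield value, field position)).
record = {'001': [([], ' ', ' ', '42', 1)],
          '245': [([('a', 'On the electrodynamics of moving bodies')], ' ', ' ', '', 2)]}

# What the recstruct cache writes into bibfmt.value:
blob = zlib.compress(marshal.dumps(record))

# What a reader of the cache does to get the record structure back:
restored = marshal.loads(zlib.decompress(blob))
```

marshal is fast and compact for plain dict/list/tuple/str structures like these, which is why it is an acceptable cache format here even though it is not a general-purpose interchange format.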
from invenio.dbquery import run_sql siteadminemail = conf.get("Invenio", "CFG_SITE_ADMIN_EMAIL") run_sql("DELETE FROM user WHERE id=1") run_sql("""INSERT INTO user (id, email, password, note, nickname) VALUES (1, %s, AES_ENCRYPT(email, ''), 1, 'admin')""", (siteadminemail,)) print "You may want to restart Apache now." print ">>> CFG_SITE_ADMIN_EMAIL reset successfully." def cli_cmd_reset_fieldnames(conf): """ Reset I18N field names such as author, title, etc and other I18N ranking method names such as word similarity. Their translations are taken from the PO files. """ print ">>> Going to reset I18N field names..." from invenio.messages import gettext_set_language, language_list_long from invenio.dbquery import run_sql, IntegrityError ## get field id and name list: field_id_name_list = run_sql("SELECT id, name FROM field") ## get rankmethod id and name list: rankmethod_id_name_list = run_sql("SELECT id, name FROM rnkMETHOD") ## update names for every language: for lang, dummy in language_list_long(): _ = gettext_set_language(lang) ## this list is put here in order for PO system to pick names ## suitable for translation field_name_names = {"any field": _("any field"), "title": _("title"), "author": _("author"), "abstract": _("abstract"), "keyword": _("keyword"), "report number": _("report number"), "subject": _("subject"), "reference": _("reference"), "fulltext": _("fulltext"), "collection": _("collection"), "division": _("division"), "year": _("year"), "journal": _("journal"), "experiment": _("experiment"), "record ID": _("record ID")} ## update I18N names for every language: for (field_id, field_name) in field_id_name_list: if field_name_names.has_key(field_name): try: run_sql("""INSERT INTO fieldname (id_field,ln,type,value) VALUES (%s,%s,%s,%s)""", (field_id, lang, 'ln', field_name_names[field_name])) except IntegrityError: run_sql("""UPDATE fieldname SET value=%s WHERE id_field=%s AND ln=%s AND type=%s""", (field_name_names[field_name], field_id, lang, 'ln',)) 
## ditto for rank methods: rankmethod_name_names = {"wrd": _("word similarity"), "demo_jif": _("journal impact factor"), "citation": _("times cited"),} for (rankmethod_id, rankmethod_name) in rankmethod_id_name_list: try: run_sql("""INSERT INTO rnkMETHODNAME (id_rnkMETHOD,ln,type,value) VALUES (%s,%s,%s,%s)""", (rankmethod_id, lang, 'ln', rankmethod_name_names[rankmethod_name])) except IntegrityError: run_sql("""UPDATE rnkMETHODNAME SET value=%s WHERE id_rnkMETHOD=%s AND ln=%s AND type=%s""", (rankmethod_name_names[rankmethod_name], rankmethod_id, lang, 'ln',)) print ">>> I18N field names reset successfully." def test_db_connection(): """ Test DB connection, and if fails, advise user how to set it up. Useful to be called during table creation. """ print "Testing DB connection...", from invenio.textutils import wrap_text_in_a_box from invenio.dbquery import run_sql, Error ## first, test connection to the DB server: try: run_sql("SHOW TABLES") except Error, err: from invenio.dbquery import CFG_DATABASE_HOST, CFG_DATABASE_PORT, \ CFG_DATABASE_NAME, CFG_DATABASE_USER, CFG_DATABASE_PASS print wrap_text_in_a_box("""\ DATABASE CONNECTIVITY ERROR %(errno)d: %(errmsg)s.\n Perhaps you need to set up database and connection rights? If yes, then please login as MySQL admin user and run the following commands now: $ mysql -h %(dbhost)s -P %(dbport)s -u root -p mysql mysql> CREATE DATABASE %(dbname)s DEFAULT CHARACTER SET utf8; mysql> GRANT ALL PRIVILEGES ON %(dbname)s.* TO %(dbuser)s@%(webhost)s IDENTIFIED BY '%(dbpass)s'; mysql> QUIT The values printed above were detected from your configuration. If they are not right, then please edit your invenio.conf file and rerun 'inveniocfg --update-all' first. 
If the problem is of different nature, then please inspect the above error message and fix the problem before continuing.""" % \ {'errno': err.args[0], 'errmsg': err.args[1], 'dbname': CFG_DATABASE_NAME, 'dbhost': CFG_DATABASE_HOST, 'dbport': CFG_DATABASE_PORT, 'dbuser': CFG_DATABASE_USER, 'dbpass': CFG_DATABASE_PASS, 'webhost': CFG_DATABASE_HOST == 'localhost' and 'localhost' or os.popen('hostname -f', 'r').read().strip(), }) sys.exit(1) print "ok" ## second, test insert/select of a Unicode string to detect ## possible Python/MySQL/MySQLdb mis-setup: print "Testing Python/MySQL/MySQLdb UTF-8 chain...", try: beta_in_utf8 = "β" # Greek beta in UTF-8 is 0xCEB2 run_sql("CREATE TEMPORARY TABLE test__invenio__utf8 (x char(1), y varbinary(2)) DEFAULT CHARACTER SET utf8") run_sql("INSERT INTO test__invenio__utf8 (x, y) VALUES (%s, %s)", (beta_in_utf8, beta_in_utf8)) res = run_sql("SELECT x,y,HEX(x),HEX(y),LENGTH(x),LENGTH(y),CHAR_LENGTH(x),CHAR_LENGTH(y) FROM test__invenio__utf8") assert res[0] == ('\xce\xb2', '\xce\xb2', 'CEB2', 'CEB2', 2L, 2L, 1L, 2L) run_sql("DROP TEMPORARY TABLE test__invenio__utf8") except Exception, err: print wrap_text_in_a_box("""\ DATABASE RELATED ERROR %s\n A problem was detected with the UTF-8 treatment in the chain between the Python application, the MySQLdb connector, and the MySQL database. You may perhaps have installed older versions of some prerequisite packages?\n Please check the INSTALL file and please fix this problem before continuing.""" % err) sys.exit(1) print "ok" def cli_cmd_create_tables(conf): """Create and fill Invenio DB tables. Useful for the installation process.""" print ">>> Going to create and fill tables..." 
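test_db_connection verifies that a Greek beta survives the Python → MySQLdb → MySQL chain by comparing hex dumps, byte lengths, and character lengths. The byte-level facts it asserts (β is the two bytes 0xCE 0xB2, i.e. one character) can be reproduced without any database, here in Python 3:

```python
beta = "β"  # Greek small letter beta

encoded = beta.encode("utf-8")

# The same quantities the SQL probe inspects with HEX(), LENGTH() and CHAR_LENGTH():
hex_form = encoded.hex().upper()   # hex dump of the stored bytes
byte_length = len(encoded)         # LENGTH() counts bytes
char_length = len(beta)            # CHAR_LENGTH() counts characters
```

If any link in the real chain double-encodes or mangles the string, the hex dump coming back from MySQL no longer equals `CEB2`, which is exactly what the assertion in test_db_connection catches.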
    from invenio.config import CFG_PREFIX
    test_db_connection()
    for cmd in ["%s/bin/dbexec < %s/lib/sql/invenio/tabcreate.sql" % (CFG_PREFIX, CFG_PREFIX),
                "%s/bin/dbexec < %s/lib/sql/invenio/tabfill.sql" % (CFG_PREFIX, CFG_PREFIX)]:
        if os.system(cmd):
            print "ERROR: failed execution of", cmd
            sys.exit(1)
    cli_cmd_reset_sitename(conf)
    cli_cmd_reset_siteadminemail(conf)
    cli_cmd_reset_fieldnames(conf)
    for cmd in ["%s/bin/webaccessadmin -u admin -c -a" % CFG_PREFIX]:
        if os.system(cmd):
            print "ERROR: failed execution of", cmd
            sys.exit(1)
    print ">>> Tables created and filled successfully."

def cli_cmd_load_webstat_conf(conf):
    print ">>> Going to load WebStat config..."
    from invenio.config import CFG_PREFIX
    cmd = "%s/bin/webstatadmin --load-config" % CFG_PREFIX
    if os.system(cmd):
        print "ERROR: failed execution of", cmd
        sys.exit(1)
    print ">>> WebStat config loaded successfully."

def cli_cmd_drop_tables(conf):
    """Drop Invenio DB tables.  Useful for the uninstallation process."""
    print ">>> Going to drop tables..."
    from invenio.config import CFG_PREFIX
    from invenio.textutils import wrap_text_in_a_box, wait_for_user
    wait_for_user(wrap_text_in_a_box("""WARNING: You are going to destroy your database tables!"""))
    cmd = "%s/bin/dbexec < %s/lib/sql/invenio/tabdrop.sql" % (CFG_PREFIX, CFG_PREFIX)
    if os.system(cmd):
        print "ERROR: failed execution of", cmd
        sys.exit(1)
    print ">>> Tables dropped successfully."

def cli_cmd_create_demo_site(conf):
    """Create demo site.  Useful for testing purposes."""
    print ">>> Going to create demo site..."
from invenio.config import CFG_PREFIX from invenio.dbquery import run_sql run_sql("TRUNCATE schTASK") run_sql("TRUNCATE session") run_sql("DELETE FROM user WHERE email=''") for cmd in ["%s/bin/dbexec < %s/lib/sql/invenio/democfgdata.sql" % \ (CFG_PREFIX, CFG_PREFIX),]: if os.system(cmd): print "ERROR: failed execution of", cmd sys.exit(1) cli_cmd_reset_fieldnames(conf) # needed for I18N demo ranking method names for cmd in ["%s/bin/webaccessadmin -u admin -c -r -D" % CFG_PREFIX, "%s/bin/webcoll -u admin" % CFG_PREFIX, "%s/bin/webcoll 1" % CFG_PREFIX,]: if os.system(cmd): print "ERROR: failed execution of", cmd sys.exit(1) print ">>> Demo site created successfully." def cli_cmd_load_demo_records(conf): """Load demo records. Useful for testing purposes.""" from invenio.config import CFG_PREFIX from invenio.dbquery import run_sql print ">>> Going to load demo records..." run_sql("TRUNCATE schTASK") for cmd in ["%s/bin/bibupload -u admin -i %s/var/tmp/demobibdata.xml" % (CFG_PREFIX, CFG_PREFIX), "%s/bin/bibupload 1" % CFG_PREFIX, "%s/bin/bibindex -u admin" % CFG_PREFIX, "%s/bin/bibindex 2" % CFG_PREFIX, "%s/bin/bibreformat -u admin -o HB" % CFG_PREFIX, "%s/bin/bibreformat 3" % CFG_PREFIX, "%s/bin/webcoll -u admin" % CFG_PREFIX, "%s/bin/webcoll 4" % CFG_PREFIX, "%s/bin/bibrank -u admin" % CFG_PREFIX, "%s/bin/bibrank 5" % CFG_PREFIX,]: if os.system(cmd): print "ERROR: failed execution of", cmd sys.exit(1) print ">>> Demo records loaded successfully." def cli_cmd_remove_demo_records(conf): """Remove demo records. Useful when you are finished testing.""" print ">>> Going to remove demo records..." 
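The demo-site commands above all share one driver loop: run each shell command with `os.system` and abort on the first non-zero exit status. A Python 3 sketch of that loop using subprocess; `run_all` and the sample commands are illustrative, not part of inveniocfg.

```python
import subprocess
import sys

def run_all(cmds):
    """Run each command in turn; return the first failing command,
    or None if all of them exited with status 0 -- the same
    stop-on-first-failure behaviour as the os.system loops above."""
    for cmd in cmds:
        if subprocess.call(cmd) != 0:
            return cmd
    return None

# Two commands that succeed (using the current Python as a portable stand-in):
failed = run_all([[sys.executable, "-c", "pass"],
                  [sys.executable, "-c", "import sys; sys.exit(0)"]])
```

Stopping at the first failure matters here because the later commands (bibindex, webcoll, bibrank) assume the earlier upload steps completed.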
from invenio.config import CFG_PREFIX from invenio.dbquery import run_sql from invenio.textutils import wrap_text_in_a_box, wait_for_user wait_for_user(wrap_text_in_a_box("""WARNING: You are going to destroy your records and documents!""")) if os.path.exists(CFG_PREFIX + os.sep + 'var' + os.sep + 'data'): shutil.rmtree(CFG_PREFIX + os.sep + 'var' + os.sep + 'data') run_sql("TRUNCATE schTASK") for cmd in ["%s/bin/dbexec < %s/lib/sql/invenio/tabbibclean.sql" % (CFG_PREFIX, CFG_PREFIX), "%s/bin/webcoll -u admin" % CFG_PREFIX, "%s/bin/webcoll 1" % CFG_PREFIX,]: if os.system(cmd): print "ERROR: failed execution of", cmd sys.exit(1) print ">>> Demo records removed successfully." def cli_cmd_drop_demo_site(conf): """Drop demo site completely. Useful when you are finished testing.""" print ">>> Going to drop demo site..." from invenio.textutils import wrap_text_in_a_box, wait_for_user wait_for_user(wrap_text_in_a_box("""WARNING: You are going to destroy your site and documents!""")) cli_cmd_drop_tables(conf) cli_cmd_create_tables(conf) cli_cmd_remove_demo_records(conf) print ">>> Demo site dropped successfully." def cli_cmd_run_unit_tests(conf): """Run unit tests, usually on the working demo site.""" from invenio.testutils import build_and_run_unit_test_suite build_and_run_unit_test_suite() def cli_cmd_run_regression_tests(conf): """Run regression tests, usually on the working demo site.""" from invenio.testutils import build_and_run_regression_test_suite build_and_run_regression_test_suite() def cli_cmd_run_web_tests(conf): """Run web tests in a browser. Requires Firefox with Selenium IDE extension.""" from invenio.testutils import build_and_run_web_test_suite build_and_run_web_test_suite() def cli_cmd_create_apache_conf(conf): """ Create Apache conf files for this site, keeping previous files in a backup copy. """ print ">>> Going to create Apache conf files..." 
from invenio.textutils import wrap_text_in_a_box apache_conf_dir = conf.get("Invenio", 'CFG_ETCDIR') + \ os.sep + 'apache' + ## TODO: add here the distribution signatures of those distributions + ## that already provide some ports.conf configuration file. + for distribution_signature in ('redhat-release', 'debian_version'): + if os.path.exists(os.path.sep + 'etc' + os.path.sep + distribution_signature): + comment_out_listen_directive = '#' + break + else: + comment_out_listen_directive = '' if not os.path.exists(apache_conf_dir): os.mkdir(apache_conf_dir) apache_vhost_file = apache_conf_dir + os.sep + \ 'invenio-apache-vhost.conf' apache_vhost_ssl_file = apache_conf_dir + os.sep + \ 'invenio-apache-vhost-ssl.conf' apache_vhost_body = """\ AddDefaultCharset UTF-8 ServerSignature Off ServerTokens Prod NameVirtualHost *:80 -Listen 80 +WSGIRestrictStdout Off +WSGIDaemonProcess invenio processes=5 threads=1 display-name=%%{GROUP} +%(comment_out_listen_directive)sListen 80 <Files *.pyc> deny from all </Files> <Files *~> deny from all </Files> <VirtualHost *:80> ServerName %(servername)s ServerAlias %(serveralias)s ServerAdmin %(serveradmin)s DocumentRoot %(webdir)s <Directory %(webdir)s> Options FollowSymLinks MultiViews AllowOverride None Order allow,deny - allow from all + Allow from all + </Directory> + <Directory %(wsgidir)s> + WSGIProcessGroup invenio + Options FollowSymLinks MultiViews + AllowOverride None + Order allow,deny + Allow from all </Directory> ErrorLog %(logdir)s/apache.err LogLevel warn CustomLog %(logdir)s/apache.log combined DirectoryIndex index.en.html index.html - <LocationMatch "^(/+$|/index|/collection|/record|/author|/search|/browse|/youraccount|/youralerts|/yourbaskets|/yourmessages|/yourloans|/yourgroups|/yourtickets|/submit|/getfile|/comments|/error|/oai2d|/rss|/help|/journal|/openurl|/stats|/unapi|/exporter)"> - SetHandler python-program - PythonHandler invenio.webinterface_layout - PythonDebug On - </LocationMatch> - <Directory %(webdir)s> - 
AddHandler python-program .py - PythonHandler mod_python.publisher - PythonDebug On - </Directory> + Alias /img/ %(webdir)s/img/ + Alias /js/ %(webdir)s/js/ + Alias /export/ %(webdir)s/export/ + Alias /jsMath/ %(webdir)s/jsMath/ + Alias /fckeditor/ %(webdir)s/fckeditor/ + AliasMatch /sitemap-(.*) %(webdir)s/sitemap-$1 + Alias /robots.txt %(webdir)s/robots.txt + Alias /favicon.ico %(webdir)s/favicon.ico + WSGIScriptAlias / %(wsgidir)s/invenio.wsgi + WSGIPassAuthorization On </VirtualHost> """ % {'servername': conf.get('Invenio', 'CFG_SITE_URL').replace("http://", ""), 'serveralias': conf.get('Invenio', 'CFG_SITE_URL').replace("http://", "").split('.')[0], 'serveradmin': conf.get('Invenio', 'CFG_SITE_ADMIN_EMAIL'), 'webdir': conf.get('Invenio', 'CFG_WEBDIR'), 'logdir': conf.get('Invenio', 'CFG_LOGDIR'), + 'libdir' : conf.get('Invenio', 'CFG_PYLIBDIR'), + 'wsgidir' : os.path.join(conf.get('Invenio', 'CFG_PREFIX'), 'var', 'www-wsgi'), + 'comment_out_listen_directive' : comment_out_listen_directive } apache_vhost_ssl_body = """\ ServerSignature Off ServerTokens Prod -Listen 443 +%(comment_out_listen_directive)sListen 443 NameVirtualHost *:443 +WSGIRestrictStdout Off +WSGIDaemonProcess invenio processes=5 threads=1 display-name=%%{GROUP} #SSLCertificateFile /etc/apache2/ssl/apache.pem SSLCertificateFile /etc/apache2/ssl/server.crt SSLCertificateKeyFile /etc/apache2/ssl/server.key <Files *.pyc> deny from all </Files> <Files *~> deny from all </Files> <VirtualHost *:443> ServerName %(servername)s ServerAlias %(serveralias)s ServerAdmin %(serveradmin)s SSLEngine on DocumentRoot %(webdir)s <Directory %(webdir)s> Options FollowSymLinks MultiViews AllowOverride None Order allow,deny - allow from all + Allow from all + </Directory> + <Directory %(wsgidir)s> + Options FollowSymLinks MultiViews + AllowOverride None + Order allow,deny + Allow from all </Directory> ErrorLog %(logdir)s/apache-ssl.err LogLevel warn CustomLog %(logdir)s/apache-ssl.log combined DirectoryIndex 
index.en.html index.html - <LocationMatch "^(/+$|/index|/collection|/record|/author|/search|/browse|/youraccount|/youralerts|/yourbaskets|/yourmessages|/yourgroups|/yourtickets|/submit|/getfile|/comments|/error|/oai2d|/rss|/help|/journal|/openurl|/stats|/unapi|/exporter)"> - SetHandler python-program - PythonHandler invenio.webinterface_layout - PythonDebug On - </LocationMatch> - <Directory %(webdir)s> - AddHandler python-program .py - PythonHandler mod_python.publisher - PythonDebug On - </Directory> + Alias /img/ %(webdir)s/img/ + Alias /js/ %(webdir)s/js/ + Alias /export/ %(webdir)s/export/ + Alias /jsMath/ %(webdir)s/jsMath/ + Alias /fckeditor/ %(webdir)s/fckeditor/ + AliasMatch /sitemap-(.*) %(webdir)s/sitemap-$1 + Alias /robots.txt %(webdir)s/robots.txt + Alias /favicon.ico %(webdir)s/favicon.ico + WSGIScriptAlias / %(wsgidir)s/invenio.wsgi + WSGIPassAuthorization On </VirtualHost> """ % {'servername': conf.get('Invenio', 'CFG_SITE_SECURE_URL').replace("https://", ""), 'serveralias': conf.get('Invenio', 'CFG_SITE_SECURE_URL').replace("https://", "").split('.')[0], 'serveradmin': conf.get('Invenio', 'CFG_SITE_ADMIN_EMAIL'), 'webdir': conf.get('Invenio', 'CFG_WEBDIR'), 'logdir': conf.get('Invenio', 'CFG_LOGDIR'), + 'libdir' : conf.get('Invenio', 'CFG_PYLIBDIR'), + 'wsgidir' : os.path.join(conf.get('Invenio', 'CFG_PREFIX'), 'var', 'www-wsgi'), + 'comment_out_listen_directive' : comment_out_listen_directive } # write HTTP vhost snippet: if os.path.exists(apache_vhost_file): shutil.copy(apache_vhost_file, apache_vhost_file + '.OLD') fdesc = open(apache_vhost_file, 'w') fdesc.write(apache_vhost_body) fdesc.close() print "Created file", apache_vhost_file # write HTTPS vhost snippet: if conf.get('Invenio', 'CFG_SITE_SECURE_URL') != \ conf.get('Invenio', 'CFG_SITE_URL'): if os.path.exists(apache_vhost_ssl_file): shutil.copy(apache_vhost_ssl_file, apache_vhost_ssl_file + '.OLD') fdesc = open(apache_vhost_ssl_file, 'w') fdesc.write(apache_vhost_ssl_body) fdesc.close() 
print "Created file", apache_vhost_ssl_file print wrap_text_in_a_box("""\ Apache virtual host configurations for your site have been created. You can check created files and put the following include statements in your httpd.conf:\n Include %s Include %s """ % (apache_vhost_file, apache_vhost_ssl_file)) print ">>> Apache conf files created." def cli_cmd_get(conf, varname): """ Return value of VARNAME read from CONF files. Useful for third-party programs to access values of conf options such as CFG_PREFIX. Return None if VARNAME is not found. """ # do not pay attention to upper/lower case: varname = varname.lower() # do not pay attention to section names yet: all_options = {} for section in conf.sections(): for option in conf.options(section): all_options[option] = conf.get(section, option) return all_options.get(varname, None) def cli_cmd_list(conf): """ Print a list of all conf options and values from CONF. """ sections = conf.sections() sections.sort() for section in sections: options = conf.options(section) options.sort() for option in options: print option.upper(), '=', conf.get(section, option) def _grep_version_from_executable(path_to_exec, version_regexp): """ Try to detect a program version by digging into its binary PATH_TO_EXEC and looking for VERSION_REGEXP. Return program version as a string. Return empty string if not succeeded. """ from invenio.shellutils import run_shell_command exec_version = "" if os.path.exists(path_to_exec): dummy1, cmd2_out, dummy2 = run_shell_command("strings %s | grep %s", (path_to_exec, version_regexp)) if cmd2_out: for cmd2_out_line in cmd2_out.split("\n"): if len(cmd2_out_line) > len(exec_version): # the longest the better exec_version = cmd2_out_line return exec_version def detect_apache_version(): """ Try to detect Apache version by localizing httpd or apache executables and grepping inside binaries. Return list of all found Apache versions and paths. 
(For a given executable, the returned format is 'apache_version [apache_path]'.) Return empty list if no success. """ from invenio.shellutils import run_shell_command out = [] dummy1, cmd_out, dummy2 = run_shell_command("locate bin/httpd bin/apache") for apache in cmd_out.split("\n"): apache_version = _grep_version_from_executable(apache, '^Apache\/') if apache_version: out.append("%s [%s]" % (apache_version, apache)) return out -def detect_modpython_version(): - """ - Try to detect mod_python version, either from mod_python import or - from grepping inside mod_python.so, like Apache. Return list of - all found mod_python versions and paths. Return empty list if no - success. - """ - out = [] - try: - from mod_python import version - out.append(version) - except ImportError: - # try to detect via looking at mod_python.so: - from invenio.shellutils import run_shell_command - version = "" - dummy1, cmd_out, dummy2 = run_shell_command("locate /mod_python.so") - for modpython in cmd_out.split("\n"): - modpython_version = _grep_version_from_executable(modpython, - '^mod_python\/') - if modpython_version: - out.append("%s [%s]" % (modpython_version, modpython)) - return out - def cli_cmd_detect_system_details(conf): """ Detect and print system details such as Apache/Python/MySQL versions etc. Useful for debugging problems on various OS. """ import MySQLdb print ">>> Going to detect system details..." 
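_grep_version_from_executable picks, among the lines of `strings` output matching the version pattern, the longest one ("the longest the better"), since the longest match usually carries the fullest version string. A simplified substring-based sketch of that heuristic in Python 3; the sample output is illustrative:

```python
def longest_line(text, pattern):
    """Return the longest line of `text` containing `pattern` -- the
    'longest match wins' heuristic _grep_version_from_executable applies
    to strings(1) output; empty string when nothing matches."""
    best = ""
    for line in text.split("\n"):
        if pattern in line and len(line) > len(best):
            best = line
    return best

# Illustrative strings(1)-like output from an Apache binary:
sample = "GET\nApache/2.2\nApache/2.2.9 (Debian)\nmod_ssl"
```

On binaries that embed both a short and a full server token, this picks the fully qualified one, which is the more useful diagnostic.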
print "* Hostname: " + socket.gethostname() print "* Invenio version: " + conf.get("Invenio", "CFG_VERSION") print "* Python version: " + sys.version.replace("\n", " ") print "* Apache version: " + ";\n ".join(detect_apache_version()) - print "* mod_python version: " + ";\n ".join(detect_modpython_version()) print "* MySQLdb version: " + MySQLdb.__version__ try: from invenio.dbquery import run_sql print "* MySQL version:" for key, val in run_sql("SHOW VARIABLES LIKE 'version%'") + \ run_sql("SHOW VARIABLES LIKE 'charact%'") + \ run_sql("SHOW VARIABLES LIKE 'collat%'"): if False: print " - %s: %s" % (key, val) elif key in ['version', 'character_set_client', 'character_set_connection', 'character_set_database', 'character_set_results', 'character_set_server', 'character_set_system', 'collation_connection', 'collation_database', 'collation_server']: print " - %s: %s" % (key, val) except ImportError: print "* ERROR: cannot import dbquery" print ">>> System details detected successfully." def main(): """Main entry point.""" conf = ConfigParser() if '--help' in sys.argv or \ '-h' in sys.argv: print_usage() elif '--version' in sys.argv or \ '-V' in sys.argv: print_version() else: confdir = None if '--conf-dir' in sys.argv: try: confdir = sys.argv[sys.argv.index('--conf-dir') + 1] except IndexError: pass # missing --conf-dir argument value if not os.path.exists(confdir): print "ERROR: bad or missing --conf-dir option value." 
sys.exit(1) else: ## try to detect path to conf dir (relative to this bin dir): confdir = re.sub(r'/bin$', '/etc', sys.path[0]) ## read conf files: for conffile in [confdir + os.sep + 'invenio.conf', confdir + os.sep + 'invenio-autotools.conf', confdir + os.sep + 'invenio-local.conf',]: if os.path.exists(conffile): conf.read(conffile) else: if not conffile.endswith("invenio-local.conf"): # invenio-local.conf is optional, otherwise stop print "ERROR: Badly guessed conf file location", conffile print "(Please use --conf-dir option.)" sys.exit(1) ## decide what to do: done = False for opt_idx in range(0, len(sys.argv)): opt = sys.argv[opt_idx] if opt == '--conf-dir': # already treated before, so skip silently: pass elif opt == '--get': try: varname = sys.argv[opt_idx + 1] except IndexError: print "ERROR: bad or missing --get option value." sys.exit(1) if varname.startswith('-'): print "ERROR: bad or missing --get option value." sys.exit(1) varvalue = cli_cmd_get(conf, varname) if varvalue is not None: print varvalue else: sys.exit(1) done = True elif opt == '--list': cli_cmd_list(conf) done = True elif opt == '--detect-system-details': cli_cmd_detect_system_details(conf) done = True elif opt == '--create-tables': cli_cmd_create_tables(conf) done = True elif opt == '--load-webstat-conf': cli_cmd_load_webstat_conf(conf) done = True elif opt == '--drop-tables': cli_cmd_drop_tables(conf) done = True elif opt == '--create-demo-site': cli_cmd_create_demo_site(conf) done = True elif opt == '--load-demo-records': cli_cmd_load_demo_records(conf) done = True elif opt == '--remove-demo-records': cli_cmd_remove_demo_records(conf) done = True elif opt == '--drop-demo-site': cli_cmd_drop_demo_site(conf) done = True elif opt == '--run-unit-tests': cli_cmd_run_unit_tests(conf) done = True elif opt == '--run-regression-tests': cli_cmd_run_regression_tests(conf) done = True elif opt == '--run-web-tests': cli_cmd_run_web_tests(conf) done = True elif opt == '--update-all': 
cli_cmd_update_config_py(conf) cli_cmd_update_dbquery_py(conf) cli_cmd_update_dbexec(conf) cli_cmd_update_bibconvert_tpl(conf) cli_cmd_update_web_tests(conf) done = True elif opt == '--update-config-py': cli_cmd_update_config_py(conf) done = True elif opt == '--update-dbquery-py': cli_cmd_update_dbquery_py(conf) done = True elif opt == '--update-dbexec': cli_cmd_update_dbexec(conf) done = True elif opt == '--update-bibconvert-tpl': cli_cmd_update_bibconvert_tpl(conf) done = True elif opt == '--update-web-tests': cli_cmd_update_web_tests(conf) done = True elif opt == '--reset-all': cli_cmd_reset_sitename(conf) cli_cmd_reset_siteadminemail(conf) cli_cmd_reset_fieldnames(conf) cli_cmd_reset_recstruct_cache(conf) done = True elif opt == '--reset-sitename': cli_cmd_reset_sitename(conf) done = True elif opt == '--reset-siteadminemail': cli_cmd_reset_siteadminemail(conf) done = True elif opt == '--reset-fieldnames': cli_cmd_reset_fieldnames(conf) done = True elif opt == '--reset-recstruct-cache': cli_cmd_reset_recstruct_cache(conf) done = True elif opt == '--create-apache-conf': cli_cmd_create_apache_conf(conf) done = True elif opt.startswith("-") and opt != '--yes-i-know': print "ERROR: unknown option", opt sys.exit(1) if not done: print """ERROR: Please specify a command. Please see '--help'.""" sys.exit(1) if __name__ == '__main__': main() diff --git a/modules/miscutil/lib/urlutils.py b/modules/miscutil/lib/urlutils.py index 50a91123e..cd2fa850e 100644 --- a/modules/miscutil/lib/urlutils.py +++ b/modules/miscutil/lib/urlutils.py @@ -1,424 +1,409 @@ # -*- coding: utf-8 -*- ## ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. 
## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """ urlutils.py -- helper functions for URL related problems such as argument washing, redirection, etc. """ __revision__ = "$Id$" import re from urllib import urlencode, quote_plus, quote from urlparse import urlparse from cgi import parse_qs, escape -try: - from mod_python import apache, util -except ImportError: - pass +from invenio import webinterface_handler_wsgi_utils as apache from invenio.config import \ CFG_SITE_LANG, \ CFG_SITE_URL, \ CFG_WEBSTYLE_EMAIL_ADDRESSES_OBFUSCATION_MODE def wash_url_argument(var, new_type): """ Wash argument into 'new_type', that can be 'list', 'str', 'int', 'tuple' or 'dict'. If needed, the check 'type(var) is not None' should be done before calling this function. @param var: variable value @param new_type: variable type, 'list', 'str', 'int', 'tuple' or 'dict' @return: as much as possible, value var as type new_type If var is a list, will change first element into new_type. 
If int check unsuccessful, returns 0 """ out = [] if new_type == 'list': # return lst if type(var) is list: out = var else: out = [var] elif new_type == 'str': # return str if type(var) is list: try: out = "%s" % var[0] except: out = "" elif type(var) is str: out = var else: out = "%s" % var elif new_type == 'int': # return int if type(var) is list: try: out = int(var[0]) except: out = 0 elif type(var) is int: out = var elif type(var) is str: try: out = int(var) except: out = 0 else: out = 0 elif new_type == 'tuple': # return tuple if type(var) is tuple: out = var else: out = (var,) elif new_type == 'dict': # return dictionary if type(var) is dict: out = var else: out = {0:var} return out def redirect_to_url(req, url, redirection_type=None): """ Redirect current page to url. @param req: request as received from apache @param url: url to redirect to @param redirection_type: what kind of redirection is required: e.g.: apache.HTTP_MULTIPLE_CHOICES = 300 apache.HTTP_MOVED_PERMANENTLY = 301 apache.HTTP_MOVED_TEMPORARILY = 302 apache.HTTP_SEE_OTHER = 303 apache.HTTP_NOT_MODIFIED = 304 apache.HTTP_USE_PROXY = 305 apache.HTTP_TEMPORARY_REDIRECT = 307 The default is apache.HTTP_TEMPORARY_REDIRECT Please see: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3 """ if redirection_type is None: redirection_type = apache.HTTP_MOVED_TEMPORARILY - req.err_headers_out["Location"] = url + req.headers_out["Location"] = url del req.headers_out["Cache-Control"] - req.err_headers_out["Cache-Control"] = "no-cache, private, no-store, must-revalidate, post-check=0, pre-check=0, max-age=0" - req.err_headers_out["Pragma"] = "no-cache" - - if req.headers_out.has_key("Set-Cookie"): - cookies = req.headers_out['Set-Cookie'] - if type(cookies) is list: - for cookie in cookies: - req.err_headers_out.add("Set-Cookie", cookie) - else: - req.err_headers_out.add("Set-Cookie", cookies) + req.headers_out["Cache-Control"] = "no-cache, private, no-store, must-revalidate, post-check=0, pre-check=0, max-age=0"
+ req.headers_out["Pragma"] = "no-cache" - if req.sent_bodyct: + if req.response_sent_p: raise IOError, "Cannot redirect after headers have already been sent." req.status = redirection_type req.write('<p>Please go to <a href="%s">here</a></p>\n' % url) raise apache.SERVER_RETURN, apache.DONE -def get_client_ip_address(req): - """ Returns IP address as string from an apache request. """ - return str(req.get_remote_host(apache.REMOTE_NOLOOKUP)) - def get_referer(req, replace_ampersands=False): """ Return the referring page of a request. Referer (wikipedia): Referer is a common misspelling of the word "referrer"; so common, in fact, that it made it into the official specification of HTTP. When visiting a webpage, the referer or referring page is the URL of the previous webpage from which a link was followed. @param req: request @param replace_ampersands: if 1, replace & by &amp; in url (correct HTML cannot contain & characters alone). """ try: referer = req.headers_in['Referer'] if replace_ampersands == 1: return referer.replace('&', '&amp;') return referer except KeyError: return '' def drop_default_urlargd(urlargd, default_urlargd): lndefault = {} lndefault.update(default_urlargd) ## Commented out. An Invenio URL now should always specify the desired ## language, in order not to raise the automatic language discovery ## (client browser language can be used now in place of CFG_SITE_LANG) # lndefault['ln'] = (str, CFG_SITE_LANG) canonical = {} canonical.update(urlargd) for k, v in urlargd.items(): try: d = lndefault[k] if d[1] == v: del canonical[k] except KeyError: pass return canonical def make_canonical_urlargd(urlargd, default_urlargd): """ Build up the query part of an URL from the arguments passed in the 'urlargd' dictionary. 'default_urlargd' is a secondary dictionary which contains tuples of the form (type, default value) for the query arguments (this is the same dictionary as the one you can pass to webinterface_handler.wash_urlargd).
When a query element has its default value, it is discarded, so that the simplest (canonical) url query is returned. The result contains the initial '?' if there are actual query items remaining. """ canonical = drop_default_urlargd(urlargd, default_urlargd) if canonical: return '?' + urlencode(canonical, doseq=True).replace('&', '&amp;') return '' def create_html_link(urlbase, urlargd, link_label, linkattrd={}, escape_urlargd=True, escape_linkattrd=True): """Creates a W3C compliant link. @param urlbase: base url (e.g. invenio.config.CFG_SITE_URL/search) @param urlargd: dictionary of parameters. (e.g. p={'recid':3, 'of'='hb'}) @param link_label: text displayed in a browser (has to be already escaped) @param linkattrd: dictionary of attributes (e.g. a={'class': 'img'}) @param escape_urlargd: boolean indicating if the function should escape arguments (e.g. < becomes &lt; or " becomes &quot;) @param escape_linkattrd: boolean indicating if the function should escape attributes (e.g. < becomes &lt; or " becomes &quot;) """ attributes_separator = ' ' output = '<a href="' + \ create_url(urlbase, urlargd, escape_urlargd) + '"' if linkattrd: output += ' ' if escape_linkattrd: attributes = [escape(str(key), quote=True) + '="' + \ escape(str(linkattrd[key]), quote=True) + '"' for key in linkattrd.keys()] else: attributes = [str(key) + '="' + str(linkattrd[key]) + '"' for key in linkattrd.keys()] output += attributes_separator.join(attributes) output += '>' + link_label + '</a>' return output def create_html_mailto(email, subject=None, body=None, cc=None, bcc=None, link_label="%(email)s", linkattrd=None, escape_urlargd=True, escape_linkattrd=True, email_obfuscation_mode=CFG_WEBSTYLE_EMAIL_ADDRESSES_OBFUSCATION_MODE): """Creates a W3C compliant 'mailto' link. Encode/encrypt given email to reduce undesired automated email harvesting when embedded in a web page. NOTE: there is no ultimate solution to protect against email harvesting. All have drawbacks and can more or less be circumvented.
There are other techniques to protect email adresses. We implement the less annoying one for users. @param email: the recipient of the email @param subject: a default subject for the email (must not contain line feeds) @param body: a default body for the email @param cc: the co-recipient(s) of the email @param bcc: the hidden co-recpient(s) of the email @param link_label: the label of this mailto link. String replacement is performed on key %(email)s with the email address if needed. @param linkattrd: dictionary of attributes (e.g. a={'class': 'img'}) @param escape_urlargd: boolean indicating if the function should escape arguments (e.g. < becomes &lt; or " becomes &quot;) @param escape_linkattrd: boolean indicating if the function should escape attributes (e.g. < becomes &lt; or " becomes &quot;) @param email_obfuscation_mode: the protection mode. See below: You can choose among several modes to protect emails. It is advised to keep the default CFG_MISCUTIL_EMAIL_HARVESTING_PROTECTION value, so that it is possible for an admin to change the policy globally. Available modes ([t] means "transparent" for the user): -1: hide all emails, excepted CFG_SITE_ADMIN_EMAIL and CFG_SITE_SUPPORT_EMAIL. [t] 0 : no protection, email returned as is. foo@example.com => foo@example.com 1 : basic email munging: replaces @ by [at] and . by [dot] foo@example.com => foo [at] example [dot] com [t] 2 : transparent name mangling: characters are replaced by equivalent HTML entities. foo@example.com => &#102;&#111;&#111;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#99;&#111;&#109; [t] 3 : javascript insertion. Requires Javascript enabled on client side. 4 : replaces @ and . characters by gif equivalents. foo@example.com => foo<img src="at.gif" alt=" [at] ">example<img src="dot.gif" alt=" [dot] ">com """ # TODO: implement other protection modes to encode/encript email: # ## [t] 5 : form submission. User is redirected to a form that he can ## fills in to send the email (??Use webmessage??). ## Depending on WebAccess, ask to answer a question.
## ## [t] 6 : if user can see (controlled by WebAccess), display. Else ## ask to login to see email. If user cannot see, display ## form submission. if linkattrd is None: linkattrd = {} parameters = {} if subject: parameters["subject"] = subject if body: parameters["body"] = body.replace('\r\n', '\n').replace('\n', '\r\n') if cc: parameters["cc"] = cc if bcc: parameters["bcc"] = bcc # Preprocessing values for some modes if email_obfuscation_mode == 1: # Basic Munging email = email.replace("@", " [at] ").replace(".", " [dot] ") elif email_obfuscation_mode == 2: # Transparent name mangling email = string_to_numeric_char_reference(email) if '%(email)s' in link_label: link_label = link_label % {'email': email} mailto_link = create_html_link('mailto:' + email, parameters, link_label, linkattrd, escape_urlargd, escape_linkattrd) if email_obfuscation_mode == 0: # Return "as is" return mailto_link elif email_obfuscation_mode == 1: # Basic Munging return mailto_link elif email_obfuscation_mode == 2: # Transparent name mangling return mailto_link elif email_obfuscation_mode == 3: # Javascript-based return '''<script language="JavaScript" type="text/javascript">document.write('%s'.split("").reverse().join(""))</script>''' % \ mailto_link[::-1].replace("'", "\\'") elif email_obfuscation_mode == 4: # GIFs-based email = email.replace('.', '<img src="%s/img/dot.gif" alt=" [dot] " style="vertical-align:bottom" />' \ % CFG_SITE_URL) email = email.replace('@', '<img src="%s/img/at.gif" alt=" [at] " style="vertical-align:baseline" />' % \ CFG_SITE_URL) return email # All other cases, including mode -1: return "" def string_to_numeric_char_reference(string): """ Encode a string to HTML-compatible numeric character reference. Eg: encode_html_entities("abc") == '&#97;&#98;&#99;' """ out = "" for char in string: out += "&#" + str(ord(char)) + ";" return out def create_url(urlbase, urlargd, escape_urlargd=True): """Creates a W3C compliant URL.
Output will look like this: 'urlbase?param1=value1&amp;param2=value2' @param urlbase: base url (e.g. invenio.config.CFG_SITE_URL/search) @param urlargd: dictionary of parameters. (e.g. p={'recid':3, 'of'='hb'} @param escape_urlargd: boolean indicating if the function should escape arguments (e.g. < becomes &lt; or " becomes &quot;) """ separator = '&amp;' output = urlbase if urlargd: output += '?' if escape_urlargd: arguments = [escape(quote(str(key)), quote=True) + '=' + \ escape(quote(str(urlargd[key])), quote=True) for key in urlargd.keys()] else: arguments = [str(key) + '=' + str(urlargd[key]) for key in urlargd.keys()] output += separator.join(arguments) return output def same_urls_p(a, b): """ Compare two URLs, ignoring reorganizing of query arguments """ ua = list(urlparse(a)) ub = list(urlparse(b)) ua[4] = parse_qs(ua[4]) ub[4] = parse_qs(ub[4]) return ua == ub def urlargs_replace_text_in_arg(urlargs, regexp_argname, text_old, text_new): """Analyze `urlargs' (URL CGI GET query arguments in string form) and for each occurrence of argument matching `regexp_argname' replace every substring `text_old' by `text_new'. Return the resulting new URL. Used to be used for search engine's create_nearest_terms_box, now it is not used there anymore. It is left here in case it will become possibly useful later.
""" out = "" # parse URL arguments into a dictionary: urlargsdict = parse_qs(urlargs) ## construct new URL arguments: urlargsdictnew = {} for key in urlargsdict.keys(): if re.match(regexp_argname, key): # replace `arg' by new values urlargsdictnew[key] = [] for parg in urlargsdict[key]: urlargsdictnew[key].append(parg.replace(text_old, text_new)) else: # keep old values urlargsdictnew[key] = urlargsdict[key] # build new URL for this word: for key in urlargsdictnew.keys(): for val in urlargsdictnew[key]: out += "&" + key + "=" + quote_plus(val, '') if out.startswith("&"): out = out[5:] return out diff --git a/modules/webalert/lib/webalert_webinterface.py b/modules/webalert/lib/webalert_webinterface.py index 6c29ad973..392af46c3 100644 --- a/modules/webalert/lib/webalert_webinterface.py +++ b/modules/webalert/lib/webalert_webinterface.py @@ -1,561 +1,562 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
"""PERSONAL FEATURES - YOUR ALERTS""" __revision__ = "$Id$" __lastupdated__ = """$Date$""" import sys import time import zlib import urllib -from mod_python import apache +from invenio import webinterface_handler_wsgi_utils as apache + from invenio.config import CFG_SITE_URL, CFG_SITE_SECURE_URL, CFG_SITE_LANG, CFG_SITE_NAME, \ CFG_ACCESS_CONTROL_LEVEL_SITE, CFG_SITE_NAME_INTL from invenio.webpage import page from invenio import webalert from invenio.webuser import getUid, page_not_authorized, isGuestUser from invenio.webinterface_handler import wash_urlargd, WebInterfaceDirectory from invenio.urlutils import redirect_to_url, make_canonical_urlargd from invenio.webstat import register_customevent from invenio.errorlib import register_exception from invenio.webuser import collect_user_info from invenio.messages import gettext_set_language import invenio.template webalert_templates = invenio.template.load('webalert') class WebInterfaceYourAlertsPages(WebInterfaceDirectory): """Defines the set of /youralerts pages.""" _exports = ['', 'display', 'input', 'modify', 'list', 'add', 'update', 'remove'] def index(self, req, form): """Index page.""" redirect_to_url(req, '%s/youralerts/list' % CFG_SITE_URL) def display(self, req, form): """Display search history page. 
A misnomer.""" argd = wash_urlargd(form, {'p': (str, "n") }) uid = getUid(req) # load the right language _ = gettext_set_language(argd['ln']) if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "%s/youralerts/display" % \ (CFG_SITE_URL,), navmenuid="youralerts") elif uid == -1 or isGuestUser(uid): return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/youralerts/display%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) user_info = collect_user_info(req) if not user_info['precached_usealerts']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use alerts.")) if argd['p'] == 'y': _title = _("Popular Searches") else: _title = _("Your Searches") # register event in webstat if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("alerts", ["display", "", user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'")
return page(title=_title, body=webalert.perform_display(argd['p'], uid, ln=argd['ln']), navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % { 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln': argd['ln'], 'account' : _("Your Account"), }, description=_("%s Personalize, Display searches") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), uid=uid, language=argd['ln'], req=req, lastupdated=__lastupdated__, navmenuid='youralerts') def input(self, req, form): argd = wash_urlargd(form, {'idq': (int, None), 'name': (str, ""), 'freq': (str, "week"), 'notif': (str, "y"), 'idb': (int, 0), 'error_msg': (str, ""), }) uid = getUid(req) if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "%s/youralerts/input" % \ (CFG_SITE_URL,), navmenuid="youralerts") elif uid == -1 or isGuestUser(uid): return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/youralerts/input%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) # load the right language _ = gettext_set_language(argd['ln']) user_info = collect_user_info(req) if not user_info['precached_usealerts']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use alerts.")) try: html = webalert.perform_input_alert("add", argd['idq'], argd['name'], argd['freq'], argd['notif'], argd['idb'], uid, ln=argd['ln']) except webalert.AlertError, msg: return page(title=_("Error"), body=webalert_templates.tmpl_errorMsg(ln=argd['ln'], error_msg=msg), navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % { 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln': argd['ln'], 'account' : _("Your Account"), }, description=_("%s Personalize, Set a new alert") % CFG_SITE_NAME_INTL.get(argd['ln'],
CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), uid=uid, language=argd['ln'], req=req, lastupdated=__lastupdated__, navmenuid='youralerts') if argd['error_msg'] != "": html = webalert_templates.tmpl_errorMsg( ln = argd['ln'], error_msg = argd['error_msg'], rest = html, ) # register event in webstat alert_str = "%s (%d)" % (argd['name'], argd['idq']) if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("alerts", ["input", alert_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'") return page(title=_("Set a new alert"), body=html, navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % { 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln': argd['ln'], 'account' : _("Your Account"), }, description=_("%s Personalize, Set a new alert") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), uid=uid, language=argd['ln'], req=req, lastupdated=__lastupdated__, navmenuid='youralerts') def modify(self, req, form): argd = wash_urlargd(form, {'idq': (int, None), 'old_idb': (int, None), 'name': (str, ""), 'freq': (str, "week"), 'notif': (str, "y"), 'idb': (int, 0), 'error_msg': (str, ""), }) uid = getUid(req) if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "%s/youralerts/modify" % \ (CFG_SITE_URL,), navmenuid="youralerts") elif uid == -1 or isGuestUser(uid): return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/youralerts/modify%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) # load the right language _ = gettext_set_language(argd['ln']) user_info = collect_user_info(req) if not user_info['precached_usealerts']: return page_not_authorized(req, "../", \
text = _("You are not authorized to use alerts.")) try: html = webalert.perform_input_alert("update", argd['idq'], argd['name'], argd['freq'], argd['notif'], argd['idb'], uid, argd['old_idb'], ln=argd['ln']) except webalert.AlertError, msg: return page(title=_("Error"), body=webalert_templates.tmpl_errorMsg(ln=argd['ln'], error_msg=msg), navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % { 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln': argd['ln'], 'account' : _("Your Account"), }, description=_("%s Personalize, Set a new alert") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), uid=uid, language=argd['ln'], req=req, lastupdated=__lastupdated__, navmenuid='youralerts') if argd['error_msg'] != "": html = webalert_templates.tmpl_errorMsg( ln = argd['ln'], error_msg = argd['error_msg'], rest = html, ) # register event in webstat alert_str = "%s (%d)" % (argd['name'], argd['idq']) if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("alerts", ["modify", alert_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'")
return page(title=_("Modify alert settings"), body=html, navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % { 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln': argd['ln'], 'account' : _("Your Account"), }, description=_("%s Personalize, Modify alert settings") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), uid=uid, language=argd['ln'], req=req, lastupdated=__lastupdated__, navmenuid='youralerts') def list(self, req, form): argd = wash_urlargd(form, {}) uid = getUid(req) if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "%s/youralerts/list" % \ (CFG_SITE_URL,), navmenuid="youralerts") elif uid == -1 or isGuestUser(uid): return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/youralerts/list%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) # load the right language _ = gettext_set_language(argd['ln']) user_info = collect_user_info(req) if not user_info['precached_usealerts']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use alerts.")) # register event in webstat if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("alerts", ["list", "", user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'")
return page(title=_("Your Alerts"), body=webalert.perform_list_alerts(uid, ln = argd['ln']), navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % { 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln': argd['ln'], 'account' : _("Your Account"), }, description=_("%s Personalize, Display alerts") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), uid=uid, language=argd['ln'], req=req, lastupdated=__lastupdated__, navmenuid='youralerts') def add(self, req, form): argd = wash_urlargd(form, {'idq': (int, None), 'name': (str, None), 'freq': (str, None), 'notif': (str, None), 'idb': (int, None), }) uid = getUid(req) if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "%s/youralerts/add" % \ (CFG_SITE_URL,), navmenuid="youralerts") elif uid == -1 or isGuestUser(uid): return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/youralerts/add%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) # load the right language _ = gettext_set_language(argd['ln']) user_info = collect_user_info(req) if not user_info['precached_usealerts']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use alerts.")) try: html = webalert.perform_add_alert(argd['name'], argd['freq'], argd['notif'], argd['idb'], argd['idq'], uid, ln=argd['ln']) except webalert.AlertError, msg: return page(title=_("Error"), body=webalert_templates.tmpl_errorMsg(ln=argd['ln'], error_msg=msg), navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % { 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln': argd['ln'], 'account' : _("Your Account"), }, description=_("%s Personalize, Set a new alert") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), keywords=_("%s, personalize")
% CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), uid=uid, language=argd['ln'], req=req, lastupdated=__lastupdated__, navmenuid='youralerts') # register event in webstat alert_str = "%s (%d)" % (argd['name'], argd['idq']) if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("alerts", ["add", alert_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'") return page(title=_("Display alerts"), body=html, navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % { 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln': argd['ln'], 'account' : _("Your Account"), }, description=_("%s Personalize, Display alerts") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), uid=uid, language=argd['ln'], req=req, lastupdated=__lastupdated__, navmenuid='youralerts') def update(self, req, form): argd = wash_urlargd(form, {'name': (str, None), 'freq': (str, None), 'notif': (str, None), 'idb': (int, None), 'idq': (int, None), 'old_idb': (int, None), }) uid = getUid(req) if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "%s/youralerts/update" % \ (CFG_SITE_URL,), navmenuid="youralerts") elif uid == -1 or isGuestUser(uid): return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/youralerts/update%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) # load the right language _ = gettext_set_language(argd['ln']) user_info = collect_user_info(req) if not user_info['precached_usealerts']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use alerts.")) try: html = webalert.perform_update_alert(argd['name'], argd['freq'], argd['notif'], argd['idb'], argd['idq'],
argd['old_idb'], uid, ln=argd['ln']) except webalert.AlertError, msg: return page(title=_("Error"), body=webalert_templates.tmpl_errorMsg(ln=argd['ln'], error_msg=msg), navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % { 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln': argd['ln'], 'account' : _("Your Account"), }, description=_("%s Personalize, Set a new alert") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), uid=uid, language=argd['ln'], req=req, lastupdated=__lastupdated__, navmenuid='youralerts') # register event in webstat alert_str = "%s (%d)" % (argd['name'], argd['idq']) if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("alerts", ["update", alert_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'") return page(title=_("Display alerts"), body=html, navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % { 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln': argd['ln'], 'account' : _("Your Account"), }, description=_("%s Personalize, Display alerts") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), uid=uid, language=argd['ln'], req=req, lastupdated=__lastupdated__, navmenuid='youralerts') def remove(self, req, form): argd = wash_urlargd(form, {'name': (str, None), 'idq': (int, None), 'idb': (int, None), }) uid = getUid(req) if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "%s/youralerts/remove" % \ (CFG_SITE_URL,), navmenuid="youralerts") elif uid == -1 or isGuestUser(uid): return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/youralerts/remove%s" % ( 
CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) # load the right language _ = gettext_set_language(argd['ln']) user_info = collect_user_info(req) if not user_info['precached_usealerts']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use alerts.")) try: html = webalert.perform_remove_alert(argd['name'], argd['idq'], argd['idb'], uid, ln=argd['ln']) except webalert.AlertError, msg: return page(title=_("Error"), body=webalert_templates.tmpl_errorMsg(ln=argd['ln'], error_msg=msg), navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % { 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln': argd['ln'], 'account' : _("Your Account"), }, description=_("%s Personalize, Set a new alert") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), uid=uid, language=argd['ln'], req=req, lastupdated=__lastupdated__, navmenuid='youralerts') # register event in webstat alert_str = "%s (%d)" % (argd['name'], argd['idq']) if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("alerts", ["remove", alert_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'")
# display success return page(title=_("Display alerts"), body=html, navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % { 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln': argd['ln'], 'account' : _("Your Account"), }, description=_("%s Personalize, Display alerts") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME), uid=uid, language=argd['ln'], req=req, lastupdated=__lastupdated__, navmenuid='youralerts') diff --git a/modules/webbasket/lib/webbasket_webinterface.py b/modules/webbasket/lib/webbasket_webinterface.py index f8c0cc845..3710b4322 100644 --- a/modules/webbasket/lib/webbasket_webinterface.py +++ b/modules/webbasket/lib/webbasket_webinterface.py @@ -1,1306 +1,1307 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebBasket Web Interface.""" __revision__ = "$Id$" __lastupdated__ = """$Date$""" -from mod_python import apache +from invenio import webinterface_handler_wsgi_utils as apache + import os from invenio.config import CFG_SITE_URL, \ CFG_ACCESS_CONTROL_LEVEL_SITE, \ CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS, \ CFG_SITE_SECURE_URL, CFG_PREFIX from invenio.messages import gettext_set_language from invenio.webpage import page from invenio.webuser import getUid, page_not_authorized, isGuestUser from invenio.webbasket import \ check_user_can_comment, \ check_sufficient_rights, \ perform_request_display, \ create_guest_warning_box, \ create_basket_navtrail, \ perform_request_display_item, \ create_guest_warning_box, \ perform_request_write_comment, \ perform_request_save_comment, \ perform_request_delete_comment, \ perform_request_add_group, \ perform_request_edit, \ perform_request_list_public_baskets, \ perform_request_unsubscribe, \ perform_request_subscribe, \ create_infobox, \ perform_request_display_public, \ delete_record, \ move_record, \ perform_request_add, \ perform_request_create_basket, \ perform_request_delete from invenio.webbasket_config import CFG_WEBBASKET_CATEGORIES, \ CFG_WEBBASKET_ACTIONS, \ CFG_WEBBASKET_SHARE_LEVELS from invenio.webbasket_dblayer import get_basket_name, \ get_max_user_rights_on_basket from invenio.urlutils import get_referer, redirect_to_url, make_canonical_urlargd from invenio.webinterface_handler import wash_urlargd, WebInterfaceDirectory from invenio.webstat import register_customevent from invenio.errorlib import register_exception from invenio.webuser import collect_user_info from invenio.webcomment import check_user_can_attach_file_to_comments from invenio.access_control_engine import acc_authorize_action try: from invenio.fckeditor_invenio_connector import FCKeditorConnectorInvenio fckeditor_available = True except ImportError, e: fckeditor_available = False from invenio.bibdocfile import stream_file class 
WebInterfaceBasketCommentsFiles(WebInterfaceDirectory): """Handle upload and access to files for comments in WebBasket. The upload is currently only available through the FCKeditor. """ def _lookup(self, component, path): """ This handler is invoked for the dynamic URLs (for getting and putting attachments) Eg: /yourbaskets/attachments/get/31/652/5/file/myfile.pdf /yourbaskets/attachments/get/31/552/5/image/myfigure.png bskid/recid/uid/ /yourbaskets/attachments/put/31/550/ bskid/recid """ if component == 'get' and len(path) > 4: bskid = path[0] # Basket id recid = path[1] # Record id uid = path[2] # uid of the submitter file_type = path[3] # file, image, flash or media (as # defined by FCKeditor) if file_type in ['file', 'image', 'flash', 'media']: file_name = '/'.join(path[4:]) # the filename def answer_get(req, form): """Accessing files attached to comments.""" form['file'] = file_name form['type'] = file_type form['uid'] = uid form['recid'] = recid form['bskid'] = bskid return self._get(req, form) return answer_get, [] elif component == 'put' and len(path) > 1: bskid = path[0] # Basket id recid = path[1] # Record id def answer_put(req, form): """Attaching file to a comment.""" form['recid'] = recid form['bskid'] = bskid return self._put(req, form) return answer_put, [] # All other cases: file not found return None, [] def _get(self, req, form): """ Returns a file attached to a comment. A file is attached to a comment of a record of a basket, by a user (who is the author of the comment), and is of a certain type (file, image, etc). Therefore these 5 values are part of the URL. Eg: CFG_SITE_URL/yourbaskets/attachments/get/31/91/5/file/myfile.pdf bskid/recid/uid """ argd = wash_urlargd(form, {'file': (str, None), 'type': (str, None), 'uid': (int, 0), 'bskid': (int, 0), 'recid': (int, 0)}) _ = gettext_set_language(argd['ln']) # Can user view this basket & record & comment, i.e. can user # access its attachments? 
uid = getUid(req) user_info = collect_user_info(req) rights = get_max_user_rights_on_basket(argd['uid'], argd['bskid']) if not user_info['precached_usebaskets']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use baskets.")) if user_info['email'] == 'guest' and not user_info['apache_user']: # Ask to login target = '/youraccount/login' + \ make_canonical_urlargd({'ln' : argd['ln'], 'referer' : \ CFG_SITE_URL + user_info['uri']}, {}) return redirect_to_url(req, target) elif not(check_sufficient_rights(rights, CFG_WEBBASKET_SHARE_LEVELS['READITM'])): return page_not_authorized(req, "../", \ text = _("You are not authorized to view this attachment")) if not argd['file'] is None: # Prepare path to file on disk. Normalize the path so that # ../ and other dangerous components are removed. path = os.path.abspath('/opt/cds-invenio/var/data/baskets/comments/' + \ str(argd['bskid']) + '/' + str(argd['recid']) + '/' + \ str(argd['uid']) + '/' + argd['type'] + '/' + \ argd['file']) # Check that we are really accessing attachements # directory, for the declared basket and record. if path.startswith('/opt/cds-invenio/var/data/baskets/comments/' + \ str(argd['bskid']) + '/' + str(argd['recid'])) and \ os.path.exists(path): return stream_file(req, path) # Send error 404 in all other cases return(apache.HTTP_NOT_FOUND) def _put(self, req, form): """ Process requests received from FCKeditor to upload files, etc. 
URL eg: CFG_SITE_URL/yourbaskets/attachments/put/31/91/ bskid/recid/ """ if not fckeditor_available: return argd = wash_urlargd(form, {'bskid': (int, 0), 'recid': (int, 0)}) uid = getUid(req) # URL where the file can be fetched after upload user_files_path = '%(CFG_SITE_URL)s/yourbaskets/attachments/get/%(bskid)s/%(recid)i/%(uid)s' % \ {'uid': uid, 'recid': argd['recid'], 'bskid': argd['bskid'], 'CFG_SITE_URL': CFG_SITE_URL} # Path to directory where uploaded files are saved user_files_absolute_path = '%(CFG_PREFIX)s/var/data/baskets/comments/%(bskid)s/%(recid)s/%(uid)s' % \ {'uid': uid, 'recid': argd['recid'], 'bskid': argd['bskid'], 'CFG_PREFIX': CFG_PREFIX} # Create a Connector instance to handle the request conn = FCKeditorConnectorInvenio(form, recid=argd['recid'], uid=uid, allowed_commands=['QuickUpload'], allowed_types = ['File', 'Image', 'Flash', 'Media'], user_files_path = user_files_path, user_files_absolute_path = user_files_absolute_path) # Check that user can # 1. is logged in # 2. comment records of this basket (to simplify, we use # WebComment function to check this, even if it is not # entirely adequate) # 3. attach files user_info = collect_user_info(req) (auth_code, auth_msg) = check_user_can_attach_file_to_comments(user_info, argd['recid']) if user_info['email'] == 'guest' and not user_info['apache_user']: # 1. User is guest: must login prior to upload data = conn.sendUploadResults(1, '', '', 'Please login before uploading file.') if not user_info['precached_usebaskets']: data = conn.sendUploadResults(1, '', '', 'Sorry, you are not allowed to use WebBasket') elif not check_user_can_comment(uid, argd['bskid']): # 2. User cannot edit comment of this basket data = conn.sendUploadResults(1, '', '', 'Sorry, you are not allowed to submit files') elif auth_code: # 3. 
User cannot submit data = conn.sendUploadResults(1, '', '', 'Sorry, you are not allowed to submit files.') else: # Process the upload and get the response data = conn.doResponse() # Transform the headers into something ok for mod_python for header in conn.headers: if not header is None: if header[0] == 'Content-Type': req.content_type = header[1] else: req.headers_out[header[0]] = header[1] # Send our response req.send_http_header() req.write(data) class WebInterfaceYourBasketsPages(WebInterfaceDirectory): """Defines the set of /yourbaskets pages.""" _exports = ['', 'display', 'display_item', 'write_comment', 'save_comment', 'delete_comment', 'add', 'delete', 'modify', 'edit', 'create_basket', 'display_public', 'list_public_baskets', 'unsubscribe', 'subscribe', 'attachments'] attachments = WebInterfaceBasketCommentsFiles() def index(self, req, form): """Index page.""" redirect_to_url(req, '%s/yourbaskets/display?%s' % (CFG_SITE_URL, req.args)) def display(self, req, form): """Display basket""" argd = wash_urlargd(form, {'category': (str, CFG_WEBBASKET_CATEGORIES['PRIVATE']), 'topic': (int, 0), 'group': (int, 0), 'bsk_to_sort': (int, 0), 'sort_by_title': (str, ""), 'sort_by_date': (str, ""), 'of': (str, '') }) _ = gettext_set_language(argd['ln']) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "../yourbaskets/display", navmenuid = 'yourbaskets') if isGuestUser(uid): if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/yourbaskets/display%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) user_info = collect_user_info(req) if not user_info['precached_usebaskets']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use baskets.")) (body, errors, warnings) = perform_request_display(uid, argd['category'], argd['topic'], argd['group'], 
argd['ln']) if isGuestUser(uid): body = create_guest_warning_box(argd['ln']) + body navtrail = '<a class="navtrail" href="%s/youraccount/display?ln=%s">'\ '%s</a>' navtrail %= (CFG_SITE_URL, argd['ln'], _("Your Account")) navtrail_end = create_basket_navtrail(uid=uid, category=argd['category'], topic=argd['topic'], group=argd['group'], ln=argd['ln']) # register event in webstat if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("baskets", ["display", "", user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'") return page(title = _("Display baskets"), body = body, navtrail = navtrail + navtrail_end, uid = uid, lastupdated = __lastupdated__, language = argd['ln'], errors = errors, warnings = warnings, req = req, navmenuid = 'yourbaskets', of = argd['of']) def display_item(self, req, form): """ Display basket item """ argd = wash_urlargd(form, {'bskid': (int, 0), 'recid': (int, 0), 'format': (str, "hb"), 'category': (str, CFG_WEBBASKET_CATEGORIES['PRIVATE']), 'topic': (int, 0), 'group': (int, 0), 'of': (str, '') }) _ = gettext_set_language(argd['ln']) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "../yourbaskets/display_item", navmenuid = 'yourbaskets') if isGuestUser(uid): if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/yourbaskets/display_item%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) user_info = collect_user_info(req) if not user_info['precached_usebaskets']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use baskets.")) (body, errors, warnings) = perform_request_display_item( uid=uid, bskid=argd['bskid'], recid=argd['recid'], format=argd['format'], category=argd['category'], 
topic=argd['topic'], group_id=argd['group'], ln=argd['ln']) if isGuestUser(uid): body = create_guest_warning_box(argd['ln']) + body navtrail = '<a class="navtrail" href="%s/youraccount/display?ln=%s">'\ '%s</a>' navtrail %= (CFG_SITE_URL, argd['ln'], _("Your Account")) navtrail_end = create_basket_navtrail(uid=uid, category=argd['category'], topic=argd['topic'], group=argd['group'], bskid=argd['bskid'], ln=argd['ln']) # register event in webstat basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid']) if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("baskets", ["display", basket_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'") return page(title = _("Details and comments"), body = body, navtrail = navtrail + navtrail_end, uid = uid, lastupdated = __lastupdated__, language = argd['ln'], errors = errors, warnings = warnings, req = req, navmenuid = 'yourbaskets', of = argd['of']) def write_comment(self, req, form): """Write a comment (just interface for writing)""" argd = wash_urlargd(form, {'bskid': (int, 0), 'recid': (int, 0), 'cmtid': (int, 0), 'category': (str, CFG_WEBBASKET_CATEGORIES['PRIVATE']), 'topic': (int, 0), 'group': (int, 0), 'of' : (str, '') }) _ = gettext_set_language(argd['ln']) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "../yourbaskets/write_comment", navmenuid = 'yourbaskets') if isGuestUser(uid): if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/yourbaskets/write_comment%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) user_info = collect_user_info(req) if not user_info['precached_usebaskets']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use 
baskets.")) (body, errors, warnings) = perform_request_write_comment( uid=uid, bskid=argd['bskid'], recid=argd['recid'], cmtid=argd['cmtid'], category=argd['category'], topic=argd['topic'], group_id=argd['group'], ln=argd['ln']) navtrail = '<a class="navtrail" href="%s/youraccount/display?ln=%s">'\ '%s</a>' navtrail %= (CFG_SITE_URL, argd['ln'], _("Your Account")) navtrail_end = create_basket_navtrail(uid=uid, category=argd['category'], topic=argd['topic'], group=argd['group'], bskid=argd['bskid'], ln=argd['ln']) # register event in webstat basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid']) if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("baskets", ["write_comment", basket_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'") return page(title = _("Write a comment"), body = body, navtrail = navtrail + navtrail_end, uid = uid, lastupdated = __lastupdated__, language = argd['ln'], errors = errors, warnings = warnings, req = req, navmenuid = 'yourbaskets', of = argd['of']) def save_comment(self, req, form): """Save comment on record in basket""" argd = wash_urlargd(form, {'bskid': (int, 0), 'recid': (int, 0), 'title': (str, ""), 'text': (str, ""), 'category': (str, CFG_WEBBASKET_CATEGORIES['PRIVATE']), 'topic': (int, 0), 'group': (int, 0), 'of' : (str, ''), 'editor_type':(str, ""), }) _ = gettext_set_language(argd['ln']) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "../yourbaskets/save_comment", navmenuid = 'yourbaskets') if isGuestUser(uid): if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/yourbaskets/save_comment%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) user_info = 
collect_user_info(req)
        if not user_info['precached_usebaskets']:
            return page_not_authorized(req, "../", \
                       text = _("You are not authorized to use baskets."))

        (errors_saving, infos) = perform_request_save_comment(
            uid=uid,
            bskid=argd['bskid'],
            recid=argd['recid'],
            title=argd['title'],
            text=argd['text'],
            ln=argd['ln'],
            editor_type=argd['editor_type'])
        (body, errors_displaying, warnings) = perform_request_display_item(
            uid=uid,
            bskid=argd['bskid'],
            recid=argd['recid'],
            format='hb',
            category=argd['category'],
            topic=argd['topic'],
            group_id=argd['group'],
            ln=argd['ln'])
        body = create_infobox(infos) + body
        # Note: list.extend() returns None, so concatenate instead:
        errors = errors_saving + errors_displaying
        navtrail = '<a class="navtrail" href="%s/youraccount/display?ln=%s">'\
                   '%s</a>'
        navtrail %= (CFG_SITE_URL, argd['ln'], _("Your Account"))
        navtrail_end = create_basket_navtrail(uid=uid,
                                              category=argd['category'],
                                              topic=argd['topic'],
                                              group=argd['group'],
                                              bskid=argd['bskid'],
                                              ln=argd['ln'])

        # register event in webstat
        basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid'])
        if user_info['email']:
            user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
        else:
            user_str = ""
        try:
            register_customevent("baskets", ["save_comment", basket_str, user_str])
        except:
            register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'")

        return page(title = _("Details and comments"),
                    body = body,
                    navtrail = navtrail + navtrail_end,
                    uid = uid,
                    lastupdated = __lastupdated__,
                    language = argd['ln'],
                    errors = errors,
                    warnings = warnings,
                    req = req,
                    navmenuid = 'yourbaskets',
                    of = argd['of'])

    def delete_comment(self, req, form):
        """Delete a comment
        @param bskid: id of basket (int)
        @param recid: id of record (int)
        @param cmtid: id of comment (int)
        @param category: category (see webbasket_config) (str)
        @param topic: nb of topic currently displayed (int)
        @param group: id of group baskets currently displayed (int)
        @param ln: language"""
        argd = wash_urlargd(form, {'bskid': (int, 0),
                                   'recid': (int, 0),
                                   'cmtid': (int, 0),
                                   'category': (str, CFG_WEBBASKET_CATEGORIES['PRIVATE']),
                                   'topic': (int, 0),
                                   'group': (int, 0),
                                   'of' : (str, '')
                                   })
        uid = getUid(req)

        if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
            return page_not_authorized(req, "../yourbaskets/delete_comment",
                                       navmenuid = 'yourbaskets')

        if isGuestUser(uid):
            if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
                # Send guests to the login page, as the other handlers do:
                return redirect_to_url(req, "%s/youraccount/login%s" % (
                    CFG_SITE_SECURE_URL,
                    make_canonical_urlargd({
                        'referer' : "%s/yourbaskets/display%s" % (
                            CFG_SITE_URL,
                            make_canonical_urlargd(argd, {})),
                        "ln" : argd['ln']}, {})))

        _ = gettext_set_language(argd['ln'])
        user_info = collect_user_info(req)
        if not user_info['precached_usebaskets']:
            return page_not_authorized(req, "../", \
                       text = _("You are not authorized to use baskets."))

        url = CFG_SITE_URL + '/yourbaskets/display_item?recid=%i&bskid=%i' % \
              (argd['recid'], argd['bskid'])
        url += '&category=%s&topic=%i&group=%i&ln=%s' % \
               (argd['category'], argd['topic'], argd['group'], argd['ln'])
        errors = perform_request_delete_comment(uid, argd['bskid'], argd['recid'], argd['cmtid'])
        if not(len(errors)):
            redirect_to_url(req, url)
        else:
            # register event in webstat
            basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid'])
            user_info = 
collect_user_info(req) if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("baskets", ["delete_comment", basket_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'") return page(uid = uid, title = '', body = '', language = argd['ln'], errors = errors, req = req, navmenuid = 'yourbaskets', of = argd['of']) def add(self, req, form): """Add records to baskets. @param recid: list of records to add @param bskids: list of baskets to add records to. if not provided, will return a page where user can select baskets @param referer: URL of the referring page @param new_basket_name: add record to new basket @param new_topic_name: new basket goes into new topic @param create_in_topic: # of topic to put basket into @param ln: language""" argd = wash_urlargd(form, {'recid': (list, []), 'bskids': (list, []), 'referer': (str, ""), 'new_basket_name': (str, ""), 'new_topic_name': (str, ""), 'create_in_topic': (int, -1), "of" : (str, '') }) _ = gettext_set_language(argd['ln']) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "../yourbaskets/add", navmenuid = 'yourbaskets') if isGuestUser(uid): if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/yourbaskets/add%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) user_info = collect_user_info(req) if not user_info['precached_usebaskets']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use baskets.")) if not argd['referer']: argd['referer'] = get_referer(req) (body, errors, warnings) = perform_request_add( uid=uid, recids=argd['recid'], bskids=argd['bskids'], referer=argd['referer'], new_basket_name=argd['new_basket_name'], new_topic_name=argd['new_topic_name'], 
create_in_topic=argd['create_in_topic'], ln=argd['ln']) if isGuestUser(uid): body = create_guest_warning_box(argd['ln']) + body if not(len(warnings)) : title = _("Your Baskets") else: title = _("Add records to baskets") navtrail = '<a class="navtrail" href="%s/youraccount/display?ln=%s">'\ '%s</a>' navtrail %= (CFG_SITE_URL, argd['ln'], _("Your Account")) # register event in webstat basket_str = ["%s (%s)" % (get_basket_name(bskid), bskid) for bskid in argd['bskids']] if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("baskets", ["add", basket_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'") return page(title = title, body = body, navtrail = navtrail, uid = uid, lastupdated = __lastupdated__, language = argd['ln'], errors = errors, warnings = warnings, req = req, navmenuid = 'yourbaskets', of = argd['of']) def delete(self, req, form): """Delete basket interface""" argd = wash_urlargd(form, {'bskid': (int, -1), 'confirmed': (int, 0), 'category': (str, CFG_WEBBASKET_CATEGORIES['PRIVATE']), 'topic': (int, 0), 'group': (int, 0), 'of' : (str, '') }) _ = gettext_set_language(argd['ln']) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "../yourbaskets/delete", navmenuid = 'yourbaskets') if isGuestUser(uid): if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/yourbaskets/delete%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) user_info = collect_user_info(req) if not user_info['precached_usebaskets']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use baskets.")) (body, errors, warnings)=perform_request_delete( uid=uid, bskid=argd['bskid'], confirmed=argd['confirmed'], 
category=argd['category'], selected_topic=argd['topic'], selected_group_id=argd['group'], ln=argd['ln']) if argd['confirmed']: url = CFG_SITE_URL url += '/yourbaskets/display?category=%s&topic=%i&group=%i&ln=%s' % \ (argd['category'], argd['topic'], argd['group'], argd['ln']) redirect_to_url(req, url) else: navtrail = '<a class="navtrail" href="%s/youraccount/display?ln=%s">'\ '%s</a>' navtrail %= (CFG_SITE_URL, argd['ln'], _("Your Account")) navtrail_end = create_basket_navtrail(uid=uid, category=argd['category'], topic=argd['topic'], group=argd['group'], bskid=argd['bskid'], ln=argd['ln']) if isGuestUser(uid): body = create_guest_warning_box(argd['ln']) + body # register event in webstat basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid']) if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("baskets", ["delete", basket_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? 
Try with 'webstatadmin --load-config'") return page(title = _("Delete a basket"), body = body, navtrail = navtrail + navtrail_end, uid = uid, lastupdated = __lastupdated__, language = argd['ln'], errors = errors, warnings = warnings, req = req, navmenuid = 'yourbaskets', of = argd['of']) def modify(self, req, form): """Modify basket content interface (reorder, suppress record, etc.)""" argd = wash_urlargd(form, {'action': (str, ""), 'bskid': (int, -1), 'recid': (int, 0), 'category': (str, CFG_WEBBASKET_CATEGORIES['PRIVATE']), 'topic': (int, 0), 'group': (int, 0), 'of' : (str, '') }) _ = gettext_set_language(argd['ln']) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "../yourbaskets/modify", navmenuid = 'yourbaskets') if isGuestUser(uid): if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/yourbaskets/modify%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) user_info = collect_user_info(req) if not user_info['precached_usebaskets']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use baskets.")) url = CFG_SITE_URL url += '/yourbaskets/display?category=%s&topic=%i&group=%i&ln=%s' % \ (argd['category'], argd['topic'], argd['group'], argd['ln']) if argd['action'] == CFG_WEBBASKET_ACTIONS['DELETE']: delete_record(uid, argd['bskid'], argd['recid']) redirect_to_url(req, url) elif argd['action'] == CFG_WEBBASKET_ACTIONS['UP']: move_record(uid, argd['bskid'], argd['recid'], argd['action']) redirect_to_url(req, url) elif argd['action'] == CFG_WEBBASKET_ACTIONS['DOWN']: move_record(uid, argd['bskid'], argd['recid'], argd['action']) redirect_to_url(req, url) elif argd['action'] == CFG_WEBBASKET_ACTIONS['COPY']: title = _("Copy record to basket") referer = get_referer(req) (body, errors, warnings) = perform_request_add(uid=uid, 
recids=argd['recid'], referer=referer, ln=argd['ln']) if isGuestUser(uid): body = create_guest_warning_box(argd['ln']) + body else: title = '' body = '' warnings = '' errors = [('ERR_WEBBASKET_UNDEFINED_ACTION',)] navtrail = '<a class="navtrail" href="%s/youraccount/display?ln=%s">'\ '%s</a>' navtrail %= (CFG_SITE_URL, argd['ln'], _("Your Account")) navtrail_end = create_basket_navtrail(uid=uid, category=argd['category'], topic=argd['topic'], group=argd['group'], bskid=argd['bskid'], ln=argd['ln']) # register event in webstat basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid']) if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("baskets", ["modify", basket_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'") return page(title = title, body = body, navtrail = navtrail + navtrail_end, uid = uid, lastupdated = __lastupdated__, language = argd['ln'], errors = errors, warnings = warnings, req = req, navmenuid = 'yourbaskets', of = argd['of']) def edit(self, req, form): """Edit basket interface""" argd = wash_urlargd(form, {'bskid': (int, 0), 'groups': (list, []), 'topic': (int, 0), 'add_group': (str, ""), 'group_cancel': (str, ""), 'submit': (str, ""), 'cancel': (str, ""), 'delete': (str, ""), 'new_name': (str, ""), 'new_topic': (int, -1), 'new_topic_name': (str, ""), 'new_group': (str, ""), 'external': (str, ""), 'of' : (str, '') }) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "../yourbaskets/edit", navmenuid = 'yourbaskets') if isGuestUser(uid): if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/yourbaskets/edit%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) _ = gettext_set_language(argd['ln']) 
user_info = collect_user_info(req) if not user_info['precached_usebaskets']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use baskets.")) if argd['cancel']: url = CFG_SITE_URL + '/yourbaskets/display?category=%s&topic=%i&ln=%s' url %= (CFG_WEBBASKET_CATEGORIES['PRIVATE'], argd['topic'], argd['ln']) redirect_to_url(req, url) elif argd['delete']: url = CFG_SITE_URL url += '/yourbaskets/delete?bskid=%i&category=%s&topic=%i&ln=%s' % \ (argd['bskid'], CFG_WEBBASKET_CATEGORIES['PRIVATE'], argd['topic'], argd['ln']) redirect_to_url(req, url) elif argd['add_group'] and not(argd['new_group']): body = perform_request_add_group(uid=uid, bskid=argd['bskid'], topic=argd['topic'], ln=argd['ln']) errors = [] warnings = [] elif (argd['add_group'] and argd['new_group']) or argd['group_cancel']: if argd['add_group']: perform_request_add_group(uid=uid, bskid=argd['bskid'], topic=argd['topic'], group_id=argd['new_group'], ln=argd['ln']) (body, errors, warnings) = perform_request_edit(uid=uid, bskid=argd['bskid'], topic=argd['topic'], ln=argd['ln']) elif argd['submit']: (body, errors, warnings) = perform_request_edit( uid=uid, bskid=argd['bskid'], topic=argd['topic'], new_name=argd['new_name'], new_topic=argd['new_topic'], new_topic_name=argd['new_topic_name'], groups=argd['groups'], external=argd['external'], ln=argd['ln']) if argd['new_topic'] != -1: argd['topic'] = argd['new_topic'] url = CFG_SITE_URL + '/yourbaskets/display?category=%s&topic=%i&ln=%s' % \ (CFG_WEBBASKET_CATEGORIES['PRIVATE'], argd['topic'], argd['ln']) redirect_to_url(req, url) else: (body, errors, warnings) = perform_request_edit(uid=uid, bskid=argd['bskid'], topic=argd['topic'], ln=argd['ln']) navtrail = '<a class="navtrail" href="%s/youraccount/display?ln=%s">'\ '%s</a>' navtrail %= (CFG_SITE_URL, argd['ln'], _("Your Account")) navtrail_end = create_basket_navtrail( uid=uid, category=CFG_WEBBASKET_CATEGORIES['PRIVATE'], topic=argd['topic'], group=0, bskid=argd['bskid'], 
ln=argd['ln']) if isGuestUser(uid): body = create_guest_warning_box(argd['ln']) + body # register event in webstat basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid']) if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("baskets", ["edit", basket_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'") return page(title = _("Edit basket"), body = body, navtrail = navtrail + navtrail_end, uid = uid, lastupdated = __lastupdated__, language = argd['ln'], errors = errors, warnings = warnings, req = req, navmenuid = 'yourbaskets', of = argd['of']) def create_basket(self, req, form): """Create basket interface""" argd = wash_urlargd(form, {'new_basket_name': (str, ""), 'new_topic_name': (str, ""), 'create_in_topic': (int, -1), 'topic_number': (int, -1), 'of' : (str, ''), }) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "../yourbaskets/create_basket", navmenuid = 'yourbaskets') if isGuestUser(uid): if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/yourbaskets/create_basket%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) user_info = collect_user_info(req) _ = gettext_set_language(argd['ln']) if not user_info['precached_usebaskets']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use baskets.")) if argd['new_basket_name'] and \ (argd['new_topic_name'] or argd['create_in_topic'] != -1): topic = perform_request_create_basket( uid=uid, new_basket_name=argd['new_basket_name'], new_topic_name=argd['new_topic_name'], create_in_topic=argd['create_in_topic'], ln=argd['ln']) # register event in webstat basket_str = "%s ()" % argd['new_basket_name'] if user_info['email']: user_str = "%s 
(%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("baskets", ["create_basket", basket_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'") url = CFG_SITE_URL + '/yourbaskets/display?category=%s&topic=%i&ln=%s' url %= (CFG_WEBBASKET_CATEGORIES['PRIVATE'], int(topic), argd['ln']) redirect_to_url(req, url) else: (body, errors, warnings) = perform_request_create_basket( uid=uid, new_basket_name=argd['new_basket_name'], new_topic_name=argd['new_topic_name'], create_in_topic=argd['create_in_topic'], topic_number=argd['topic_number'], ln=argd['ln']) navtrail = '<a class="navtrail" href="%s/youraccount/'\ 'display?ln=%s">%s</a>' navtrail %= (CFG_SITE_URL, argd['ln'], _("Your Account")) if isGuestUser(uid): body = create_guest_warning_box(argd['ln']) + body return page(title = _("Create basket"), body = body, navtrail = navtrail, uid = uid, lastupdated = __lastupdated__, language = argd['ln'], errors = errors, warnings = warnings, req = req, navmenuid = 'yourbaskets', of = argd['of']) def display_public(self, req, form): """Display public basket. 
If of is x** then output will be XML""" argd = wash_urlargd(form, {'bskid': (int, 0), 'of': (str, "hb"), }) _ = gettext_set_language(argd['ln']) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE == 2: return page_not_authorized(req, "../yourbaskets/display_public", navmenuid = 'yourbaskets') user_info = collect_user_info(req) if not user_info['precached_usebaskets']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use baskets.")) if argd['bskid'] == 0: # No given basket => display list of public baskets (body, errors, warnings) = perform_request_list_public_baskets( 0, 1, 1, argd['ln']) return page(title = _("List of public baskets"), body = body, navtrail = '', uid = uid, lastupdated = __lastupdated__, language = argd['ln'], errors = errors, warnings = warnings, req = req, of = argd['of']) if len(argd['of']) and argd['of'][0]=='x': # XML output req.content_type = "text/xml" req.send_http_header() return perform_request_display_public(bskid=argd['bskid'], of=argd['of'], ln=argd['ln']) (body, errors, warnings) = perform_request_display_public( bskid=argd['bskid'], ln=argd['ln']) referer = get_referer(req) if 'list_public_basket' not in referer: referer = CFG_SITE_URL + '/yourbaskets/list_public_baskets?ln=' + \ argd['ln'] navtrail = '<a class="navtrail" href="%s">%s</a>' % \ (referer, _("List of public baskets")) # register event in webstat basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid']) if user_info['email']: user_str = "%s (%d)" % (user_info['email'], user_info['uid']) else: user_str = "" try: register_customevent("baskets", ["display_public", basket_str, user_str]) except: register_exception(suffix="Do the webstat tables exists? 
Try with 'webstatadmin --load-config'") return page(title = _("Public basket"), body = body, navtrail = navtrail, uid = uid, lastupdated = __lastupdated__, language = argd['ln'], errors = errors, warnings = warnings, req = req, navmenuid = 'yourbaskets', of = argd['of']) def list_public_baskets(self, req, form): """List of public baskets interface""" argd = wash_urlargd(form, {'inf_limit': (int, 0), 'order': (int, 1), 'asc': (int, 1), 'of': (str, '') }) if argd['inf_limit'] < 0: argd['inf_limit'] = 0 _ = gettext_set_language(argd['ln']) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE == 2: return page_not_authorized(req, "../yourbaskets/list_public_baskets", navmenuid = 'yourbaskets') user_info = collect_user_info(req) ## These baskets are too public to require users to be logged in to visit them... #if not user_info['precached_usebaskets']: #return page_not_authorized(req, "../", \ #text = _("You are not authorized to use baskets.")) (body, errors, warnings) = perform_request_list_public_baskets( argd['inf_limit'], argd['order'], argd['asc'], argd['ln']) return page(title = _("List of public baskets"), body = body, navtrail = '', uid = uid, lastupdated = __lastupdated__, language = argd['ln'], errors = errors, warnings = warnings, req = req, navmenuid = 'yourbaskets', of = argd['of']) def unsubscribe(self, req, form): """Unsubscribe from basket""" argd = wash_urlargd(form, {'bskid': (int, 0), 'of': (str, '') }) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE == 2: return page_not_authorized(req, "../yourbaskets/unsubscribe", navmenuid = 'yourbaskets') if isGuestUser(uid): if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/yourbaskets/unsubscribe%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) _ = gettext_set_language(argd['ln']) user_info = collect_user_info(req) if not
user_info['precached_usebaskets']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use baskets.")) perform_request_unsubscribe(uid, argd['bskid']) url = CFG_SITE_URL + '/yourbaskets/display?category=%s&ln=%s' url %= (CFG_WEBBASKET_CATEGORIES['EXTERNAL'], argd['ln']) redirect_to_url(req, url) def subscribe(self, req, form): """subscribe to basket""" argd = wash_urlargd(form, {'bskid': (int, 0), 'of': (str, '') }) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE == 2: return page_not_authorized(req, "../yourbaskets/subscribe", navmenuid = 'yourbaskets') if isGuestUser(uid): if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : "%s/yourbaskets/subscribe%s" % ( CFG_SITE_URL, make_canonical_urlargd(argd, {})), "ln" : argd['ln']}, {}))) _ = gettext_set_language(argd['ln']) user_info = collect_user_info(req) if not user_info['precached_usebaskets']: return page_not_authorized(req, "../", \ text = _("You are not authorized to use baskets.")) errors = perform_request_subscribe(uid, argd['bskid']) if len(errors): return page(errors=errors, uid=uid, language=argd['ln'], body = '', title = '', req=req, navmenuid = 'yourbaskets') url = CFG_SITE_URL + '/yourbaskets/display?category=%s&ln=%s' url %= (CFG_WEBBASKET_CATEGORIES['EXTERNAL'], argd['ln']) redirect_to_url(req, url) diff --git a/modules/webcomment/lib/webcomment.py b/modules/webcomment/lib/webcomment.py index ac1460a84..f8a44dbdc 100644 --- a/modules/webcomment/lib/webcomment.py +++ b/modules/webcomment/lib/webcomment.py @@ -1,1183 +1,1183 @@ # -*- coding: utf-8 -*- ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. 
## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """ Comments and reviews for records """ __revision__ = "$Id$" # non CDS Invenio imports: import time import math from datetime import datetime, timedelta # CDS Invenio imports: from invenio.dbquery import run_sql from invenio.config import CFG_SITE_LANG, \ CFG_WEBALERT_ALERT_ENGINE_EMAIL,\ CFG_SITE_ADMIN_EMAIL,\ CFG_SITE_URL,\ CFG_SITE_NAME,\ CFG_WEBCOMMENT_ALLOW_REVIEWS,\ CFG_WEBCOMMENT_ALLOW_SHORT_REVIEWS,\ CFG_WEBCOMMENT_ALLOW_COMMENTS,\ CFG_WEBCOMMENT_ADMIN_NOTIFICATION_LEVEL,\ CFG_WEBCOMMENT_NB_REPORTS_BEFORE_SEND_EMAIL_TO_ADMIN,\ CFG_WEBCOMMENT_TIMELIMIT_PROCESSING_COMMENTS_IN_SECONDS,\ CFG_WEBCOMMENT_TIMELIMIT_PROCESSING_REVIEWS_IN_SECONDS from invenio.webmessage_mailutils import \ email_quote_txt, \ email_quoted_txt2html from invenio.webuser import get_user_info from invenio.dateutils import convert_datetext_to_dategui, \ datetext_default, \ convert_datestruct_to_datetext from invenio.mailutils import send_email from invenio.messages import wash_language, gettext_set_language from invenio.urlutils import wash_url_argument from invenio.webcomment_config import CFG_WEBCOMMENT_ACTION_CODE from invenio.access_control_engine import acc_authorize_action from invenio.access_control_admin import acc_is_role from invenio.access_control_config import 
CFG_WEBACCESS_WARNING_MSGS from invenio.search_engine import \ guess_primary_collection_of_a_record, \ check_user_can_view_record try: import invenio.template webcomment_templates = invenio.template.load('webcomment') except: pass def perform_request_display_comments_or_remarks(recID, ln=CFG_SITE_LANG, display_order='od', display_since='all', nb_per_page=100, page=1, voted=-1, reported=-1, reviews=0, uid=-1, can_send_comments=False, can_attach_files=False): """ Returns all the comments (reviews) of a specific internal record or external basket record. @param recID: record id where (internal record IDs > 0) or (external basket record IDs < -100) @param display_order: hh = highest helpful score, review only lh = lowest helpful score, review only hs = highest star score, review only ls = lowest star score, review only od = oldest date nd = newest date @param display_since: all= no filtering by date nd = n days ago nw = n weeks ago nm = n months ago ny = n years ago where n is a single digit integer between 0 and 9 @param nb_per_page: number of results per page @param page: results page @param voted: boolean, active if user voted for a review, see perform_request_vote function @param reported: boolean, active if user reported a certain comment/review, see perform_request_report function @param reviews: boolean, enabled if reviews, disabled for comments @param uid: the id of the user who is reading comments @param can_send_comments: if user can send comment or not @param can_attach_files: if user can attach file to comment or not @return: html body.
""" errors = [] warnings = [] nb_reviews = 0 nb_comments = 0 # wash arguments recID = wash_url_argument(recID, 'int') ln = wash_language(ln) display_order = wash_url_argument(display_order, 'str') display_since = wash_url_argument(display_since, 'str') nb_per_page = wash_url_argument(nb_per_page, 'int') page = wash_url_argument(page, 'int') voted = wash_url_argument(voted, 'int') reported = wash_url_argument(reported, 'int') reviews = wash_url_argument(reviews, 'int') # vital argument check (valid, error_body) = check_recID_is_in_range(recID, warnings, ln) if not(valid): return (error_body, errors, warnings) # Query the database and filter results res = query_retrieve_comments_or_remarks(recID, display_order, display_since, reviews) res2 = query_retrieve_comments_or_remarks(recID, display_order, display_since, not reviews) nb_res = len(res) if reviews: nb_reviews = nb_res nb_comments = len(res2) else: nb_reviews = len(res2) nb_comments = nb_res # checking non vital arguemnts - will be set to default if wrong #if page <= 0 or page.lower() != 'all': if page < 0: page = 1 warnings.append(('WRN_WEBCOMMENT_INVALID_PAGE_NB',)) if nb_per_page < 0: nb_per_page = 100 warnings.append(('WRN_WEBCOMMENT_INVALID_NB_RESULTS_PER_PAGE',)) if CFG_WEBCOMMENT_ALLOW_REVIEWS and reviews: if display_order not in ['od', 'nd', 'hh', 'lh', 'hs', 'ls']: display_order = 'hh' warnings.append(('WRN_WEBCOMMENT_INVALID_REVIEW_DISPLAY_ORDER',)) else: if display_order not in ['od', 'nd']: display_order = 'od' warnings.append(('WRN_WEBCOMMENT_INVALID_DISPLAY_ORDER',)) # filter results according to page and number of reults per page if nb_per_page > 0: if nb_res > 0: last_page = int(math.ceil(nb_res / float(nb_per_page))) else: last_page = 1 if page > last_page: page = 1 warnings.append(("WRN_WEBCOMMENT_INVALID_PAGE_NB",)) if nb_res > nb_per_page: # if more than one page of results if page < last_page: res = res[(page-1)*(nb_per_page) : (page*nb_per_page)] else: res = res[(page-1)*(nb_per_page) : ] 
else: # one page of results pass else: last_page = 1 # Send to template avg_score = 0.0 if not CFG_WEBCOMMENT_ALLOW_COMMENTS and not CFG_WEBCOMMENT_ALLOW_REVIEWS: # comments not allowed by admin errors.append(('ERR_WEBCOMMENT_COMMENTS_NOT_ALLOWED',)) if reported > 0: warnings.append(('WRN_WEBCOMMENT_FEEDBACK_RECORDED',)) elif reported == 0: warnings.append(('WRN_WEBCOMMENT_ALREADY_REPORTED',)) elif reported == -2: warnings.append(('WRN_WEBCOMMENT_INVALID_REPORT',)) if CFG_WEBCOMMENT_ALLOW_REVIEWS and reviews: avg_score = calculate_avg_score(res) if voted > 0: warnings.append(('WRN_WEBCOMMENT_FEEDBACK_RECORDED',)) elif voted == 0: warnings.append(('WRN_WEBCOMMENT_ALREADY_VOTED',)) body = webcomment_templates.tmpl_get_comments(recID, ln, nb_per_page, page, last_page, display_order, display_since, CFG_WEBCOMMENT_ALLOW_REVIEWS, res, nb_comments, avg_score, warnings, border=0, reviews=reviews, total_nb_reviews=nb_reviews, uid=uid, can_send_comments=can_send_comments, can_attach_files=can_attach_files) return (body, errors, warnings) def perform_request_vote(cmt_id, client_ip_address, value, uid=-1): """ Vote positively or negatively for a comment/review @param cmt_id: review id @param value: +1 for voting positively -1 for voting negatively @return: integer 1 if successful, integer 0 if not """ cmt_id = wash_url_argument(cmt_id, 'int') client_ip_address = wash_url_argument(client_ip_address, 'str') value = wash_url_argument(value, 'int') uid = wash_url_argument(uid, 'int') if cmt_id > 0 and value in [-1, 1] and check_user_can_vote(cmt_id, client_ip_address, uid): action_date = convert_datestruct_to_datetext(time.localtime()) action_code = CFG_WEBCOMMENT_ACTION_CODE['VOTE'] query = """INSERT INTO cmtACTIONHISTORY (id_cmtRECORDCOMMENT, id_bibrec, id_user, client_host, action_time, action_code) VALUES (%s, NULL ,%s, inet_aton(%s), %s, %s)""" params = (cmt_id, uid, client_ip_address, action_date, action_code) run_sql(query, params) return query_record_useful_review(cmt_id, 
value) else: return 0 def check_user_can_comment(recID, client_ip_address, uid=-1): """ Check if a user hasn't already commented within the last seconds time limit: CFG_WEBCOMMENT_TIMELIMIT_PROCESSING_COMMENTS_IN_SECONDS @param recID: record id - @param client_ip_address: IP => use: str(req.get_remote_host(apache.REMOTE_NOLOOKUP)) + @param client_ip_address: IP => use: str(req.remote_ip) @param uid: user id, as given by invenio.webuser.getUid(req) """ recID = wash_url_argument(recID, 'int') client_ip_address = wash_url_argument(client_ip_address, 'str') uid = wash_url_argument(uid, 'int') max_action_time = time.time() - CFG_WEBCOMMENT_TIMELIMIT_PROCESSING_COMMENTS_IN_SECONDS max_action_time = convert_datestruct_to_datetext(time.localtime(max_action_time)) action_code = CFG_WEBCOMMENT_ACTION_CODE['ADD_COMMENT'] query = """SELECT id_bibrec FROM cmtACTIONHISTORY WHERE id_bibrec=%s AND action_code=%s AND action_time>%s """ params = (recID, action_code, max_action_time) if uid < 0: query += " AND client_host=inet_aton(%s)" params += (client_ip_address,) else: query += " AND id_user=%s" params += (uid,) res = run_sql(query, params) return len(res) == 0 def check_user_can_review(recID, client_ip_address, uid=-1): """ Check if a user hasn't already reviewed within the last seconds time limit: CFG_WEBCOMMENT_TIMELIMIT_PROCESSING_REVIEWS_IN_SECONDS @param cmt_id: comment id - @param client_ip_address: IP => use: str(req.get_remote_host(apache.REMOTE_NOLOOKUP)) + @param client_ip_address: IP => use: str(req.remote_ip) @param uid: user id, as given by invenio.webuser.getUid(req) """ action_code = CFG_WEBCOMMENT_ACTION_CODE['ADD_REVIEW'] query = """SELECT id_bibrec FROM cmtACTIONHISTORY WHERE id_bibrec=%s AND action_code=%s """ params = (recID, action_code) if uid < 0: query += " AND client_host=inet_aton(%s)" params += (client_ip_address,) else: query += " AND id_user=%s" params += (uid,) res = run_sql(query, params) return len(res) == 0 def check_user_can_vote(cmt_id, 
client_ip_address, uid=-1): """ Checks if a user hasn't already voted @param cmt_id: comment id - @param client_ip_address: IP => use: str(req.get_remote_host(apache.REMOTE_NOLOOKUP)) + @param client_ip_address: IP => use: str(req.remote_ip) @param uid: user id, as given by invenio.webuser.getUid(req) """ cmt_id = wash_url_argument(cmt_id, 'int') client_ip_address = wash_url_argument(client_ip_address, 'str') uid = wash_url_argument(uid, 'int') query = """SELECT id_cmtRECORDCOMMENT FROM cmtACTIONHISTORY WHERE id_cmtRECORDCOMMENT=%s""" params = (cmt_id,) if uid < 0: query += " AND client_host=inet_aton(%s)" params += (client_ip_address,) else: query += " AND id_user=%s" params += (uid, ) res = run_sql(query, params) return (len(res) == 0) def perform_request_report(cmt_id, client_ip_address, uid=-1): """ Report a comment/review for inappropriate content. Will send an email to the administrator if number of reports is a multiple of CFG_WEBCOMMENT_NB_REPORTS_BEFORE_SEND_EMAIL_TO_ADMIN @param cmt_id: comment id @return: integer 1 if successful, integer 0 if not. 
-2 if comment does not exist """ cmt_id = wash_url_argument(cmt_id, 'int') if cmt_id <= 0: return 0 (query_res, nb_abuse_reports) = query_record_report_this(cmt_id) if query_res == 0: return 0 elif query_res == -2: return -2 if not(check_user_can_report(cmt_id, client_ip_address, uid)): return 0 action_date = convert_datestruct_to_datetext(time.localtime()) action_code = CFG_WEBCOMMENT_ACTION_CODE['REPORT_ABUSE'] query = """INSERT INTO cmtACTIONHISTORY (id_cmtRECORDCOMMENT, id_bibrec, id_user, client_host, action_time, action_code) VALUES (%s, NULL, %s, inet_aton(%s), %s, %s)""" params = (cmt_id, uid, client_ip_address, action_date, action_code) run_sql(query, params) if nb_abuse_reports % CFG_WEBCOMMENT_NB_REPORTS_BEFORE_SEND_EMAIL_TO_ADMIN == 0: (cmt_id2, id_bibrec, id_user, cmt_body, cmt_date, cmt_star, cmt_vote, cmt_nb_votes_total, cmt_title, cmt_reported) = query_get_comment(cmt_id) (user_nb_abuse_reports, user_votes, user_nb_votes_total) = query_get_user_reports_and_votes(int(id_user)) (nickname, user_email, last_login) = query_get_user_contact_info(id_user) from_addr = '%s Alert Engine <%s>' % (CFG_SITE_NAME, CFG_WEBALERT_ALERT_ENGINE_EMAIL) to_addr = CFG_SITE_ADMIN_EMAIL subject = "A comment has been reported as inappropriate by a user" body = ''' The following comment has been reported a total of %(cmt_reported)s times. Author: nickname = %(nickname)s email = %(user_email)s user_id = %(uid)s This user has: total number of reports = %(user_nb_abuse_reports)s %(votes)s Comment: comment_id = %(cmt_id)s record_id = %(id_bibrec)s date written = %(cmt_date)s nb reports = %(cmt_reported)s %(review_stuff)s body = ---start body--- %(cmt_body)s ---end body--- Please go to the WebComment Admin interface %(comment_admin_link)s to delete this message if necessary. 
A warning will be sent to the user in question.''' % \ { 'cfg-report_max' : CFG_WEBCOMMENT_NB_REPORTS_BEFORE_SEND_EMAIL_TO_ADMIN, 'nickname' : nickname, 'user_email' : user_email, 'uid' : id_user, 'user_nb_abuse_reports' : user_nb_abuse_reports, 'user_votes' : user_votes, 'votes' : CFG_WEBCOMMENT_ALLOW_REVIEWS and \ "total number of positive votes\t= %s\n\t\t\t\ttotal number of negative votes\t= %s" % \ (user_votes, (user_nb_votes_total - user_votes)) or "\n", 'cmt_id' : cmt_id, 'id_bibrec' : id_bibrec, 'cmt_date' : cmt_date, 'cmt_reported' : cmt_reported, 'review_stuff' : CFG_WEBCOMMENT_ALLOW_REVIEWS and \ "star score\t\t= %s\n\t\t\treview title\t\t= %s" % (cmt_star, cmt_title) or "", 'cmt_body' : cmt_body, 'comment_admin_link' : CFG_SITE_URL + "/admin/webcomment/webcommentadmin.py", 'user_admin_link' : "user_admin_link" #! FIXME } #FIXME to be added to email when websession module is over: #If you wish to ban the user, you can do so via the User Admin Panel %(user_admin_link)s. send_email(from_addr, to_addr, subject, body) return 1 def check_user_can_report(cmt_id, client_ip_address, uid=-1): """ Checks if a user hasn't already reported a comment @param cmt_id: comment id - @param client_ip_address: IP => use: str(req.get_remote_host(apache.REMOTE_NOLOOKUP)) + @param client_ip_address: IP => use: str(req.remote_ip) @param uid: user id, as given by invenio.webuser.getUid(req) """ cmt_id = wash_url_argument(cmt_id, 'int') client_ip_address = wash_url_argument(client_ip_address, 'str') uid = wash_url_argument(uid, 'int') query = """SELECT id_cmtRECORDCOMMENT FROM cmtACTIONHISTORY WHERE id_cmtRECORDCOMMENT=%s""" params = (cmt_id,) if uid < 0: query += " AND client_host=inet_aton(%s)" params += (client_ip_address,) else: query += " AND id_user=%s" params += (uid,) res = run_sql(query, params) return (len(res) == 0) def query_get_user_contact_info(uid): """ Get the user contact information @return: tuple (nickname, email, last_login), if none found return () Note: for the
moment, if no nickname, will return email address up to the '@' """ query1 = """SELECT nickname, email, DATE_FORMAT(last_login, '%%Y-%%m-%%d %%H:%%i:%%s') FROM user WHERE id=%s""" params1 = (uid,) res1 = run_sql(query1, params1) if res1: return res1[0] else: return () def query_get_user_reports_and_votes(uid): """ Retrieve total number of reports and votes of a particular user @param uid: user id @return: tuple (total_nb_reports, total_nb_votes_yes, total_nb_votes_total) if none found return () """ query1 = """SELECT nb_votes_yes, nb_votes_total, nb_abuse_reports FROM cmtRECORDCOMMENT WHERE id_user=%s""" params1 = (uid,) res1 = run_sql(query1, params1) if len(res1) == 0: return () nb_votes_yes = nb_votes_total = nb_abuse_reports = 0 for cmt_tuple in res1: nb_votes_yes += int(cmt_tuple[0]) nb_votes_total += int(cmt_tuple[1]) nb_abuse_reports += int(cmt_tuple[2]) return (nb_abuse_reports, nb_votes_yes, nb_votes_total) def query_get_comment(comID): """ Get all fields of a comment @param comID: comment id @return: tuple (comID, id_bibrec, id_user, body, date_creation, star_score, nb_votes_yes, nb_votes_total, title, nb_abuse_reports) if none found return () """ query1 = """SELECT id, id_bibrec, id_user, body, DATE_FORMAT(date_creation, '%%Y-%%m-%%d %%H:%%i:%%s'), star_score, nb_votes_yes, nb_votes_total, title, nb_abuse_reports FROM cmtRECORDCOMMENT WHERE id=%s""" params1 = (comID,) res1 = run_sql(query1, params1) if len(res1)>0: return res1[0] else: return () def query_record_report_this(comID): """ Increment the number of reports for a comment @param comID: comment id @return: tuple (success, new_total_nb_reports_for_this_comment) where success is integer 1 if success, integer 0 if not, -2 if comment does not exist """ #retrieve nb_abuse_reports query1 = "SELECT nb_abuse_reports FROM cmtRECORDCOMMENT WHERE id=%s" params1 = (comID,) res1 = run_sql(query1, params1) if len(res1) == 0: return (-2, 0) #increment and update nb_abuse_reports = int(res1[0][0]) + 1 query2 = 
"UPDATE cmtRECORDCOMMENT SET nb_abuse_reports=%s WHERE id=%s" params2 = (nb_abuse_reports, comID) res2 = run_sql(query2, params2) return (int(res2), nb_abuse_reports) def query_record_useful_review(comID, value): """ private funciton Adjust the number of useful votes and number of total votes for a comment. @param comID: comment id @param value: +1 or -1 @return: integer 1 if successful, integer 0 if not """ # retrieve nb_useful votes query1 = "SELECT nb_votes_total, nb_votes_yes FROM cmtRECORDCOMMENT WHERE id=%s" params1 = (comID,) res1 = run_sql(query1, params1) if len(res1)==0: return 0 # modify and insert new nb_useful votes nb_votes_yes = int(res1[0][1]) if value >= 1: nb_votes_yes = int(res1[0][1]) + 1 nb_votes_total = int(res1[0][0]) + 1 query2 = "UPDATE cmtRECORDCOMMENT SET nb_votes_total=%s, nb_votes_yes=%s WHERE id=%s" params2 = (nb_votes_total, nb_votes_yes, comID) res2 = run_sql(query2, params2) return int(res2) def query_retrieve_comments_or_remarks (recID, display_order='od', display_since='0000-00-00 00:00:00', ranking=0): """ Private function Retrieve tuple of comments or remarks from the database @param recID: record id @param display_order: hh = highest helpful score lh = lowest helpful score hs = highest star score ls = lowest star score od = oldest date nd = newest date @param display_since: datetime, e.g. 
0000-00-00 00:00:00 @param ranking: boolean, enabled if reviews, disabled for comments @param full_reviews_p: boolean, filter out empty reviews (with score only) if False @return: tuple of comment where comment is tuple (nickname, uid, date_creation, body, id) if ranking disabled or tuple (nickname, uid, date_creation, body, nb_votes_yes, nb_votes_total, star_score, title, id) Note: for the moment, if no nickname, will return email address up to '@' """ display_since = calculate_start_date(display_since) order_dict = { 'hh' : "cmt.nb_votes_yes/(cmt.nb_votes_total+1) DESC, cmt.date_creation DESC ", 'lh' : "cmt.nb_votes_yes/(cmt.nb_votes_total+1) ASC, cmt.date_creation ASC ", 'ls' : "cmt.star_score ASC, cmt.date_creation DESC ", 'hs' : "cmt.star_score DESC, cmt.date_creation DESC ", 'od' : "cmt.date_creation ASC ", 'nd' : "cmt.date_creation DESC " } # Ranking only done for comments and when allowed if ranking and recID > 0: try: display_order = order_dict[display_order] except: display_order = order_dict['od'] else: # in case of recID > 0 => external record => no ranking! 
ranking = 0 try: if display_order[-1] == 'd': display_order = order_dict[display_order] else: display_order = order_dict['od'] except: display_order = order_dict['od'] query = """SELECT user.nickname, cmt.id_user, DATE_FORMAT(cmt.date_creation, '%%%%Y-%%%%m-%%%%d %%%%H:%%%%i:%%%%s'), cmt.body, %(ranking)s cmt.id FROM %(table)s cmt LEFT JOIN user ON user.id=cmt.id_user WHERE %(id_bibrec)s=%%s %(ranking_only)s %(display_since)s ORDER BY %%s """ % {'ranking' : ranking and ' cmt.nb_votes_yes, cmt.nb_votes_total, cmt.star_score, cmt.title, ' or '', 'ranking_only' : ranking and ' AND cmt.star_score>0 ' or ' AND cmt.star_score=0 ', 'id_bibrec' : recID > 0 and 'cmt.id_bibrec' or 'cmt.id_bibrec_or_bskEXTREC', 'table' : recID > 0 and 'cmtRECORDCOMMENT' or 'bskRECORDCOMMENT', 'display_since' : display_since == '0000-00-00 00:00:00' and ' ' or 'AND cmt.date_creation>=\'%s\' ' % display_since} params = (recID, display_order) res = run_sql(query, params) if res: return res return () def query_add_comment_or_remark(reviews=0, recID=0, uid=-1, msg="", note="", score=0, priority=0, client_ip_address='', editor_type='textarea'): """ Private function Insert a comment/review or remark into the database @param recID: record id @param uid: user id @param msg: comment body @param note: comment title @param score: review star score @param priority: remark priority #!FIXME @param editor_type: the kind of editor used to submit the comment: 'textarea', 'fckeditor' @return: integer >0 representing id if successful, integer 0 if not """ current_date = calculate_start_date('0d') #change utf-8 message into general unicode msg = msg.decode('utf-8') note = note.decode('utf-8') #change general unicode back to utf-8 msg = msg.encode('utf-8') note = note.encode('utf-8') if editor_type == 'fckeditor': # Here we remove the line feeds introduced by FCKeditor (they # have no meaning for the user) and replace the HTML line # breaks by linefeeds, so that we are close to an input that # would be done without
the FCKeditor. That's much better if a # reply to a comment is made with a browser that does not # support FCKeditor. msg = msg.replace('\n', '').replace('\r', '').replace('<br />', '\n') query = """INSERT INTO cmtRECORDCOMMENT (id_bibrec, id_user, body, date_creation, star_score, nb_votes_total, title) VALUES (%s, %s, %s, %s, %s, %s, %s)""" params = (recID, uid, msg, current_date, score, 0, note) res = run_sql(query, params) if res: action_code = CFG_WEBCOMMENT_ACTION_CODE[reviews and 'ADD_REVIEW' or 'ADD_COMMENT'] action_time = convert_datestruct_to_datetext(time.localtime()) query2 = """INSERT INTO cmtACTIONHISTORY (id_cmtRECORDCOMMENT, id_bibrec, id_user, client_host, action_time, action_code) VALUES ('', %s, %s, inet_aton(%s), %s, %s)""" params2 = (recID, uid, client_ip_address, action_time, action_code) run_sql(query2, params2) return int(res) def calculate_start_date_old(display_since): """ Private function Returns the datetime of display_since argument in MYSQL datetime format calculated according to the local time. @param display_since: = all= no filtering nd = n days ago nw = n weeks ago nm = n months ago ny = n years ago where n is a single digit number @return: string of wanted datetime. 
If 'all' given as argument, will return datetext_default datetext_default is defined in miscutils/lib/dateutils and equals 0000-00-00 00:00:00 => MySQL format If a bad argument is given, will return datetext_default """ # time type and seconds coefficients time_types = {'d':0, 'w':0, 'm':0, 'y':0} ## verify argument # argument wrong size if (display_since in (None, 'all')) or (len(display_since) > 2): return datetext_default try: nb = int(display_since[0]) except: return datetext_default if str(display_since[1]) in time_types: time_type = str(display_since[1]) else: return datetext_default ## calculate date # initialize the coef if time_type == 'w': time_types[time_type] = 7 else: time_types[time_type] = 1 start_time = time.localtime() start_time = (start_time[0] - nb*time_types['y'], start_time[1] - nb*time_types['m'], start_time[2] - nb*time_types['d'] - nb*time_types['w'], start_time[3], start_time[4], start_time[5], start_time[6], start_time[7], start_time[8]) return convert_datestruct_to_datetext(start_time) def calculate_start_date(display_since): time_types = {'d':0, 'w':0, 'm':0, 'y':0} today = datetime.today() try: nb = int(display_since[:-1]) except: return datetext_default if (display_since in (None, 'all')): return datetext_default if str(display_since[-1]) in time_types: time_type = str(display_since[-1]) else: return datetext_default # year if time_type == 'y': if (int(display_since[:-1]) > today.year - 1) or (int(display_since[:-1]) < 1): # 1 < nb years < 2008 return datetext_default else: final_nb_year = today.year - nb yesterday = today.replace(year=final_nb_year) # month elif time_type == 'm': # convert nb of months into years nb_year = nb / 12 # nb_year = number of years to subtract nb = nb % 12 if nb > today.month-1: # ex: july(07) - 9 months = -1 year -3 months nb_year += 1 nb_month = 12 - (nb - today.month) else: nb_month = today.month - nb final_nb_year = today.year - nb_year # final_nb_year = the resulting year yesterday =
today.replace(year=final_nb_year, month=nb_month) # week elif time_type == 'w': delta = timedelta(weeks=nb) yesterday = today - delta # day elif time_type == 'd': delta = timedelta(days=nb) yesterday = today - delta return yesterday.strftime("%Y-%m-%d %H:%M:%S") def count_comments(recID): """ Returns the number of comments made on a record. """ recID = int(recID) query = """SELECT count(id) FROM cmtRECORDCOMMENT WHERE id_bibrec=%s AND star_score=0""" return run_sql(query, (recID,))[0][0] def count_reviews(recID): """ Returns the number of reviews made on a record. """ recID = int(recID) query = """SELECT count(id) FROM cmtRECORDCOMMENT WHERE id_bibrec=%s AND star_score>0""" return run_sql(query, (recID,))[0][0] def get_first_comments_or_remarks(recID=-1, ln=CFG_SITE_LANG, nb_comments='all', nb_reviews='all', voted=-1, reported=-1): """ Gets the first nb comments/reviews or remarks. In the case of comments, will get both comments and reviews Comments and remarks sorted by most recent date, reviews sorted by highest helpful score @param recID: record id @param ln: language @param nb: number of comment/reviews or remarks to get @param voted: 1 if user has voted for a remark @param reported: 1 if user has reported a comment or review @return: if comment, tuple (comments, reviews) both being html of first nb comments/reviews if remark, tuple (remarks, None) """ warnings = [] errors = [] voted = wash_url_argument(voted, 'int') reported = wash_url_argument(reported, 'int') ## check recID argument if type(recID) is not int: return () if recID >= 1: #comment or review.
NB: suppressed reference to basket (handled in webbasket) if CFG_WEBCOMMENT_ALLOW_REVIEWS: res_reviews = query_retrieve_comments_or_remarks(recID=recID, display_order="hh", ranking=1) nb_res_reviews = len(res_reviews) ## check nb argument if type(nb_reviews) is int and nb_reviews < len(res_reviews): first_res_reviews = res_reviews[:nb_reviews] else: first_res_reviews = res_reviews if CFG_WEBCOMMENT_ALLOW_COMMENTS: res_comments = query_retrieve_comments_or_remarks(recID=recID, display_order="od", ranking=0) nb_res_comments = len(res_comments) ## check nb argument if type(nb_comments) is int and nb_comments < len(res_comments): first_res_comments = res_comments[:nb_comments] else: first_res_comments = res_comments else: #error errors.append(('ERR_WEBCOMMENT_RECID_INVALID', recID)) #!FIXME dont return error anywhere since search page # comment if recID >= 1: comments = reviews = "" if reported > 0: warnings.append(('WRN_WEBCOMMENT_FEEDBACK_RECORDED_GREEN_TEXT',)) elif reported == 0: warnings.append(('WRN_WEBCOMMENT_FEEDBACK_NOT_RECORDED_RED_TEXT',)) if CFG_WEBCOMMENT_ALLOW_COMMENTS: # normal comments comments = webcomment_templates.tmpl_get_first_comments_without_ranking(recID, ln, first_res_comments, nb_res_comments, warnings) if CFG_WEBCOMMENT_ALLOW_REVIEWS: # ranked comments #calculate average score avg_score = calculate_avg_score(res_reviews) if voted > 0: warnings.append(('WRN_WEBCOMMENT_FEEDBACK_RECORDED_GREEN_TEXT',)) elif voted == 0: warnings.append(('WRN_WEBCOMMENT_FEEDBACK_NOT_RECORDED_RED_TEXT',)) reviews = webcomment_templates.tmpl_get_first_comments_with_ranking(recID, ln, first_res_reviews, nb_res_reviews, avg_score, warnings) return (comments, reviews) # remark else: return(webcomment_templates.tmpl_get_first_remarks(first_res_comments, ln, nb_res_comments), None) def calculate_avg_score(res): """ private function Calculate the avg score of reviews present in res @param res: tuple of tuple returned from query_retrieve_comments_or_remarks @return: a 
float of the average score rounded to the closest 0.5 """ c_star_score = 6 avg_score = 0.0 nb_reviews = 0 for comment in res: if comment[c_star_score] > 0: avg_score += comment[c_star_score] nb_reviews += 1 if nb_reviews == 0: return 0.0 avg_score = avg_score / nb_reviews avg_score_unit = avg_score - math.floor(avg_score) if avg_score_unit < 0.25: avg_score = math.floor(avg_score) elif avg_score_unit > 0.75: avg_score = math.floor(avg_score) + 1 else: avg_score = math.floor(avg_score) + 0.5 if avg_score > 5: avg_score = 5.0 return avg_score def perform_request_add_comment_or_remark(recID=0, uid=-1, action='DISPLAY', ln=CFG_SITE_LANG, msg=None, score=None, note=None, priority=None, reviews=0, comID=-1, client_ip_address=None, editor_type='textarea', can_attach_files=False): """ Add a comment/review or remark @param recID: record id @param uid: user id @param action: 'DISPLAY' to display add form 'SUBMIT' to submit comment once form is filled 'REPLY' to reply to an existing comment @param ln: language @param msg: the body of the comment/review or remark @param score: star score of the review @param note: title of the review @param priority: priority of remark (int) @param reviews: boolean, if enabled will add a review, if disabled will add a comment @param comID: if replying, this is the id of the comment being replied to @param editor_type: the kind of editor/input used for the comment: 'textarea', 'fckeditor' @param can_attach_files: if user can attach files to comments or not @return: html add form if action is display or reply html successful added form if action is submit """ warnings = [] errors = [] actions = ['DISPLAY', 'REPLY', 'SUBMIT'] _ = gettext_set_language(ln) ## check arguments check_recID_is_in_range(recID, warnings, ln) if uid <= 0: errors.append(('ERR_WEBCOMMENT_UID_INVALID', uid)) return ('', errors, warnings) user_contact_info = query_get_user_contact_info(uid) nickname = '' if user_contact_info: if user_contact_info[0]: nickname =
user_contact_info[0] # show the form if action == 'DISPLAY': if reviews and CFG_WEBCOMMENT_ALLOW_REVIEWS: return (webcomment_templates.tmpl_add_comment_form_with_ranking(recID, uid, nickname, ln, msg, score, note, warnings, can_attach_files=can_attach_files), errors, warnings) elif not reviews and CFG_WEBCOMMENT_ALLOW_COMMENTS: return (webcomment_templates.tmpl_add_comment_form(recID, uid, nickname, ln, msg, warnings, can_attach_files=can_attach_files), errors, warnings) else: errors.append(('ERR_WEBCOMMENT_COMMENTS_NOT_ALLOWED',)) elif action == 'REPLY': if reviews and CFG_WEBCOMMENT_ALLOW_REVIEWS: errors.append(('ERR_WEBCOMMENT_REPLY_REVIEW',)) return (webcomment_templates.tmpl_add_comment_form_with_ranking(recID, uid, nickname, ln, msg, score, note, warnings, can_attach_files=can_attach_files), errors, warnings) elif not reviews and CFG_WEBCOMMENT_ALLOW_COMMENTS: textual_msg = msg if comID > 0: comment = query_get_comment(comID) if comment: user_info = get_user_info(comment[2]) if user_info: date_creation = convert_datetext_to_dategui(str(comment[4])) # Build two msg: one mostly textual, the other one with HTML markup, for the FCKeditor. 
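The REPLY branch above pre-fills the editor by quoting the parent comment, first as plain text (`email_quote_txt`) and then as HTML (`email_quoted_txt2html`). A minimal sketch of the plain-text quoting step; this is an illustrative re-implementation, not Invenio's actual helper, and the `>>` marker is an assumption:

```python
def email_quote_txt(text, indent_txt='>>'):
    # Prefix every line of the parent comment with a quote marker,
    # as done when building the textual reply template.
    return '\n'.join('%s%s' % (indent_txt, line) for line in text.split('\n'))

quoted = email_quote_txt("first line\nsecond line")
# quoted == ">>first line\n>>second line"
```

The HTML variant would additionally escape the text and turn the quote markers into nested `<blockquote>`-style markup for the FCKeditor input.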
msg = _("%(x_name)s wrote on %(x_date)s:")% {'x_name': user_info[2], 'x_date': date_creation} textual_msg = msg # 1 For FCKeditor input msg += '<br /><br />' msg += comment[3] msg = email_quote_txt(text=msg) msg = email_quoted_txt2html(text=msg) msg = '<br/>' + msg + '<br/>' # 2 For textarea input textual_msg += "\n\n" textual_msg += comment[3] textual_msg = email_quote_txt(text=textual_msg) return (webcomment_templates.tmpl_add_comment_form(recID, uid, nickname, ln, msg, warnings, textual_msg, can_attach_files=can_attach_files), errors, warnings) else: errors.append(('ERR_WEBCOMMENT_COMMENTS_NOT_ALLOWED',)) # check before submitting form elif action == 'SUBMIT': if reviews and CFG_WEBCOMMENT_ALLOW_REVIEWS: if note.strip() in ["", "None"] and not CFG_WEBCOMMENT_ALLOW_SHORT_REVIEWS: warnings.append(('WRN_WEBCOMMENT_ADD_NO_TITLE',)) if score == 0 or score > 5: warnings.append(("WRN_WEBCOMMENT_ADD_NO_SCORE",)) if msg.strip() in ["", "None"] and not CFG_WEBCOMMENT_ALLOW_SHORT_REVIEWS: warnings.append(('WRN_WEBCOMMENT_ADD_NO_BODY',)) # if no warnings, submit if len(warnings) == 0: if reviews: if check_user_can_review(recID, client_ip_address, uid): success = query_add_comment_or_remark(reviews, recID=recID, uid=uid, msg=msg, note=note, score=score, priority=0, client_ip_address=client_ip_address, editor_type=editor_type) else: warnings.append('WRN_WEBCOMMENT_CANNOT_REVIEW_TWICE') success = 1 else: if check_user_can_comment(recID, client_ip_address, uid): success = query_add_comment_or_remark(reviews, recID=recID, uid=uid, msg=msg, note=note, score=score, priority=0, client_ip_address=client_ip_address, editor_type=editor_type) else: warnings.append('WRN_WEBCOMMENT_TIMELIMIT') success = 1 if success > 0: if CFG_WEBCOMMENT_ADMIN_NOTIFICATION_LEVEL > 0: notify_admin_of_new_comment(comID=success) return (webcomment_templates.tmpl_add_comment_successful(recID, ln, reviews, warnings), errors, warnings) else: errors.append(('ERR_WEBCOMMENT_DB_INSERT_ERROR')) # if are warnings 
or if inserting comment failed, show user where warnings are
            if reviews and CFG_WEBCOMMENT_ALLOW_REVIEWS:
                return (webcomment_templates.tmpl_add_comment_form_with_ranking(recID, uid, nickname, ln, msg, score, note, warnings, can_attach_files=can_attach_files),
                        errors,
                        warnings)
            else:
                return (webcomment_templates.tmpl_add_comment_form(recID, uid, nickname, ln, msg, warnings, can_attach_files=can_attach_files),
                        errors,
                        warnings)
        # unknown action: send to display form
        else:
            warnings.append(('WRN_WEBCOMMENT_ADD_UNKNOWN_ACTION',))
            if reviews and CFG_WEBCOMMENT_ALLOW_REVIEWS:
                return (webcomment_templates.tmpl_add_comment_form_with_ranking(recID, uid, nickname, ln, msg, score, note, warnings, can_attach_files=can_attach_files),
                        errors,
                        warnings)
            else:
                return (webcomment_templates.tmpl_add_comment_form(recID, uid, nickname, ln, msg, warnings, can_attach_files=can_attach_files),
                        errors,
                        warnings)
        return ('', errors, warnings)

def notify_admin_of_new_comment(comID):
    """
    Sends an email to the admin with details regarding comment with ID = comID
    """
    comment = query_get_comment(comID)
    if len(comment) > 0:
        (comID2, id_bibrec, id_user, body, date_creation,
         star_score, nb_votes_yes, nb_votes_total, title, nb_abuse_reports) = comment
    else:
        return
    user_info = query_get_user_contact_info(id_user)
    if len(user_info) > 0:
        (nickname, email, last_login) = user_info
        if not len(nickname) > 0:
            nickname = email.split('@')[0]
    else:
        nickname = email = last_login = "ERROR: Could not retrieve"
    from invenio.search_engine import print_record
    record = print_record(recID=id_bibrec, format='hs')
    review_stuff = '''
    Star score = %s
    Title      = %s''' % (star_score, title)
    out = '''
The following %(comment_or_review)s has just been posted (%(date)s).
AUTHOR: Nickname = %(nickname)s Email = %(email)s User ID = %(uid)s RECORD CONCERNED: Record ID = %(recID)s Record = <!-- start record details --> %(record_details)s <!-- end record details --> %(comment_or_review_caps)s: %(comment_or_review)s ID = %(comID)s %(review_stuff)s Body = <!-- start body --> %(body)s <!-- end body --> ADMIN OPTIONS: To delete comment go to %(siteurl)s/admin/webcomment/webcommentadmin.py/delete?comid=%(comID)s ''' % \ { 'comment_or_review' : star_score > 0 and 'review' or 'comment', 'comment_or_review_caps': star_score > 0 and 'REVIEW' or 'COMMENT', 'date' : date_creation, 'nickname' : nickname, 'email' : email, 'uid' : id_user, 'recID' : id_bibrec, 'record_details' : record, 'comID' : comID2, 'review_stuff' : star_score > 0 and review_stuff or "", 'body' : body.replace('<br />','\n'), 'siteurl' : CFG_SITE_URL } from_addr = '%s WebComment <%s>' % (CFG_SITE_NAME, CFG_WEBALERT_ALERT_ENGINE_EMAIL) to_addr = CFG_SITE_ADMIN_EMAIL subject = "A new comment/review has just been posted" send_email(from_addr, to_addr, subject, out) def check_recID_is_in_range(recID, warnings=[], ln=CFG_SITE_LANG): """ Check that recID is >= 0 Append error messages to errors listi @param recID: record id @param warnings: the warnings list of the calling function @return: tuple (boolean, html) where boolean (1=true, 0=false) and html is the body of the page to display if there was a problem """ # Make errors into a list if needed if type(warnings) is not list: errors = [warnings] try: recID = int(recID) except: pass if type(recID) is int: if recID > 0: from invenio.search_engine import record_exists success = record_exists(recID) if success == 1: return (1,"") else: warnings.append(('ERR_WEBCOMMENT_RECID_INEXISTANT', recID)) return (0, webcomment_templates.tmpl_record_not_found(status='inexistant', recID=recID, ln=ln)) elif recID == 0: warnings.append(('ERR_WEBCOMMENT_RECID_MISSING',)) return (0, webcomment_templates.tmpl_record_not_found(status='missing', 
recID=recID, ln=ln)) else: warnings.append(('ERR_WEBCOMMENT_RECID_INVALID', recID)) return (0, webcomment_templates.tmpl_record_not_found(status='invalid', recID=recID, ln=ln)) else: warnings.append(('ERR_WEBCOMMENT_RECID_NAN', recID)) return (0, webcomment_templates.tmpl_record_not_found(status='nan', recID=recID, ln=ln)) def check_int_arg_is_in_range(value, name, errors, gte_value, lte_value=None): """ Check that variable with name 'name' >= gte_value and optionally <= lte_value Append error messages to errors list @param value: variable value @param name: variable name @param errors: list of error tuples (error_id, value) @param gte_value: greater than or equal to value @param lte_value: less than or equal to value @return: boolean (1=true, 0=false) """ # Make errors into a list if needed if type(errors) is not list: errors = [errors] if type(value) is not int or type(gte_value) is not int: errors.append(('ERR_WEBCOMMENT_PROGRAMNING_ERROR',)) return 0 if type(value) is not int: errors.append(('ERR_WEBCOMMENT_ARGUMENT_NAN', value)) return 0 if value < gte_value: errors.append(('ERR_WEBCOMMENT_ARGUMENT_INVALID', value)) return 0 if lte_value: if type(lte_value) is not int: errors.append(('ERR_WEBCOMMENT_PROGRAMNING_ERROR',)) return 0 if value > lte_value: errors.append(('ERR_WEBCOMMENT_ARGUMENT_INVALID', value)) return 0 return 1 def get_mini_reviews(recid, ln=CFG_SITE_LANG): """ Returns the web controls to add reviews to a record from the detailed record pages mini-panel. @param recid: the id of the displayed record @param ln: the user's language """ if CFG_WEBCOMMENT_ALLOW_SHORT_REVIEWS: action = 'SUBMIT' else: action = 'DISPLAY' reviews = query_retrieve_comments_or_remarks(recid, ranking=1) return webcomment_templates.tmpl_mini_review(recid, ln, action=action, avg_score=calculate_avg_score(reviews), nb_comments_total=len(reviews)) def check_user_can_view_comments(user_info, recid): """Check if the user is authorized to view comments for given recid. 
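`check_int_arg_is_in_range` above is a plain bounds check behind the Invenio error-tuple plumbing. A condensed, self-contained sketch of the same validation logic:

```python
def int_in_range(value, gte_value, lte_value=None):
    # Reject non-integers, values below the lower bound, and,
    # when an upper bound is given, values above it.
    # Returns 1 on success, 0 on failure, like the original.
    if not isinstance(value, int):
        return 0
    if value < gte_value:
        return 0
    if lte_value is not None and value > lte_value:
        return 0
    return 1
```

One design note: the original guards the upper bound with `if lte_value:`, so an upper bound of `0` would be silently ignored; the sketch uses the explicit `is not None` comparison instead.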
Returns the same type as acc_authorize_action """ # Check user can view the record itself first (auth_code, auth_msg) = check_user_can_view_record(user_info, recid) if auth_code: return (auth_code, auth_msg) # Check if user can view the comments ## But first can we find an authorization for this case action, ## for this collection? record_primary_collection = guess_primary_collection_of_a_record(recid) return acc_authorize_action(user_info, 'viewcomment', authorized_if_no_roles=True, collection=record_primary_collection) def check_user_can_send_comments(user_info, recid): """Check if the user is authorized to comment the given recid. This function does not check that user can view the record or view the comments Returns the same type as acc_authorize_action """ ## First can we find an authorization for this case, action + collection record_primary_collection = guess_primary_collection_of_a_record(recid) return acc_authorize_action(user_info, 'sendcomment', authorized_if_no_roles=True, collection=record_primary_collection) def check_user_can_attach_file_to_comments(user_info, recid): """Check if the user is authorized to attach a file to comments for given recid. This function does not check that user can view the comments or send comments. Returns the same type as acc_authorize_action """ ## First can we find an authorization for this case action, for ## this collection? record_primary_collection = guess_primary_collection_of_a_record(recid) return acc_authorize_action(user_info, 'attachcommentfile', authorized_if_no_roles=True, collection=record_primary_collection) diff --git a/modules/webcomment/lib/webcomment_webinterface.py b/modules/webcomment/lib/webcomment_webinterface.py index aeb6c113c..97550c551 100644 --- a/modules/webcomment/lib/webcomment_webinterface.py +++ b/modules/webcomment/lib/webcomment_webinterface.py @@ -1,609 +1,605 @@ # -*- coding: utf-8 -*- ## Comments and reviews for records. ## This file is part of CDS Invenio. 
## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """ Comments and reviews for records: web interface """ __lastupdated__ = """$Date$""" __revision__ = """$Id$""" import urllib from invenio.webcomment import check_recID_is_in_range, \ perform_request_display_comments_or_remarks, \ perform_request_add_comment_or_remark, \ perform_request_vote, \ perform_request_report, \ check_user_can_attach_file_to_comments, \ check_user_can_view_comments, \ check_user_can_send_comments from invenio.config import \ CFG_PREFIX, \ CFG_SITE_LANG, \ CFG_SITE_URL, \ CFG_SITE_SECURE_URL, \ CFG_WEBCOMMENT_ALLOW_COMMENTS,\ CFG_WEBCOMMENT_ALLOW_REVIEWS from invenio.webuser import getUid, page_not_authorized, isGuestUser, collect_user_info from invenio.webpage import page, pageheaderonly, pagefooteronly from invenio.search_engine import create_navtrail_links, \ guess_primary_collection_of_a_record, \ get_colID -from invenio.urlutils import get_client_ip_address, \ - redirect_to_url, \ +from invenio.urlutils import redirect_to_url, \ make_canonical_urlargd from invenio.messages import gettext_set_language from invenio.webinterface_handler import wash_urlargd, WebInterfaceDirectory from invenio.websearchadminlib import get_detailed_page_tabs from invenio.access_control_config 
import VIEWRESTRCOLL from invenio.access_control_mailcookie import mail_cookie_create_authorize_action import invenio.template webstyle_templates = invenio.template.load('webstyle') websearch_templates = invenio.template.load('websearch') try: from invenio.fckeditor_invenio_connector import FCKeditorConnectorInvenio fckeditor_available = True except ImportError, e: fckeditor_available = False import os -try: - from mod_python import apache -except ImportError: - pass +from invenio import webinterface_handler_wsgi_utils as apache from invenio.bibdocfile import stream_file class WebInterfaceCommentsPages(WebInterfaceDirectory): """Defines the set of /comments pages.""" _exports = ['', 'display', 'add', 'vote', 'report', 'index', 'attachments'] def __init__(self, recid=-1, reviews=0): self.recid = recid self.discussion = reviews # 0:comments, 1:reviews self.attachments = WebInterfaceCommentsFiles(recid, reviews) def index(self, req, form): """ Redirects to display function """ return self.display(req, form) def display(self, req, form): """ Display comments (reviews if enabled) associated with record having id recid where recid>0. This function can also be used to display remarks associated with basket having id recid where recid<-99. 
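`WebInterfaceCommentsPages` publishes its pages through the `_exports` list, with the empty component mapping to `index` (which simply delegates to `display`). Stripped of the Invenio machinery, the dispatch idea is roughly this (the `dispatch` helper is a hypothetical stand-in for what `WebInterfaceDirectory` does):

```python
class CommentsPages:
    # '' answers /comments itself; the rest answer /comments/<name>.
    _exports = ['', 'display', 'add', 'vote', 'report']

    def __init__(self, recid):
        self.recid = recid

    def index(self, req, form):
        return self.display(req, form)

    def display(self, req, form):
        return 'comments for record %d' % self.recid

def dispatch(pages, component, req=None, form=None):
    # Map the URL component to an exported method, '' meaning index.
    if component in pages._exports:
        return getattr(pages, component or 'index')(req, form)
    raise LookupError(component)

out = dispatch(CommentsPages(5953), '')
```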
@param ln: language @param recid: record id, integer @param do: display order hh = highest helpful score, review only lh = lowest helpful score, review only hs = highest star score, review only ls = lowest star score, review only od = oldest date nd = newest date @param ds: display since all= no filtering by date nd = n days ago nw = n weeks ago nm = n months ago ny = n years ago where n is a single digit integer between 0 and 9 @param nb: number of results per page @param p: results page @param voted: boolean, active if user voted for a review, see vote function @param reported: int, active if user reported a certain comment/review, see report function @param reviews: boolean, enabled for reviews, disabled for comments @return: the full html page. """ argd = wash_urlargd(form, {'do': (str, "od"), 'ds': (str, "all"), 'nb': (int, 100), 'p': (int, 1), 'voted': (int, -1), 'reported': (int, -1), }) _ = gettext_set_language(argd['ln']) uid = getUid(req) user_info = collect_user_info(req) (auth_code, auth_msg) = check_user_can_view_comments(user_info, self.recid) if auth_code and user_info['email'] == 'guest' and not user_info['apache_user']: cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)}) target = '/youraccount/login' + \ make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \ CFG_SITE_URL + user_info['uri']}, {}) return redirect_to_url(req, target) elif auth_code: return page_not_authorized(req, "../", \ text = auth_msg) can_send_comments = False (auth_code, auth_msg) = check_user_can_send_comments(user_info, self.recid) if not auth_code: can_send_comments = True can_attach_files = False (auth_code, auth_msg) = check_user_can_attach_file_to_comments(user_info, self.recid) if not auth_code: can_attach_files = True check_warnings = [] (ok, problem) = check_recID_is_in_range(self.recid, check_warnings, argd['ln']) if ok: (body, errors, warnings) = 
perform_request_display_comments_or_remarks(recID=self.recid, display_order=argd['do'], display_since=argd['ds'], nb_per_page=argd['nb'], page=argd['p'], ln=argd['ln'], voted=argd['voted'], reported=argd['reported'], reviews=self.discussion, uid=uid, can_send_comments=can_send_comments, can_attach_files=can_attach_files) unordered_tabs = get_detailed_page_tabs(get_colID(guess_primary_collection_of_a_record(self.recid)), self.recid, ln=argd['ln']) ordered_tabs_id = [(tab_id, values['order']) for (tab_id, values) in unordered_tabs.iteritems()] ordered_tabs_id.sort(lambda x, y: cmp(x[1], y[1])) link_ln = '' if argd['ln'] != CFG_SITE_LANG: link_ln = '?ln=%s' % argd['ln'] tabs = [(unordered_tabs[tab_id]['label'], \ '%s/record/%s/%s%s' % (CFG_SITE_URL, self.recid, tab_id, link_ln), \ tab_id in ['comments', 'reviews'], unordered_tabs[tab_id]['enabled']) \ for (tab_id, order) in ordered_tabs_id if unordered_tabs[tab_id]['visible'] == True] top = webstyle_templates.detailed_record_container_top(self.recid, tabs, argd['ln']) bottom = webstyle_templates.detailed_record_container_bottom(self.recid, tabs, argd['ln']) title, description, keywords = websearch_templates.tmpl_record_page_header_content(req, self.recid, argd['ln']) navtrail = create_navtrail_links(cc=guess_primary_collection_of_a_record(self.recid), ln=argd['ln']) if navtrail: navtrail += ' > ' navtrail += '<a class="navtrail" href="%s/record/%s?ln=%s">'% (CFG_SITE_URL, self.recid, argd['ln']) navtrail += title navtrail += '</a>' navtrail += ' > <a class="navtrail">%s</a>' % (self.discussion==1 and _("Reviews") or _("Comments")) return pageheaderonly(title=title, navtrail=navtrail, uid=uid, verbose=1, req=req, language=argd['ln'], navmenuid='search', navtrail_append_title_p=0) + \ websearch_templates.tmpl_search_pagestart(argd['ln']) + \ top + body + bottom + \ websearch_templates.tmpl_search_pageend(argd['ln']) + \ pagefooteronly(lastupdated=__lastupdated__, language=argd['ln'], req=req) else: return 
page(title=_("Record Not Found"), body=problem, uid=uid, verbose=1, req=req, language=argd['ln'], warnings=check_warnings, errors=[], navmenuid='search') # Return the same page wether we ask for /record/123 or /record/123/ __call__ = index def add(self, req, form): """ Add a comment (review) to record with id recid where recid>0 Also works for adding a remark to basket with id recid where recid<-99 @param ln: languange @param recid: record id @param action: 'DISPLAY' to display add form 'SUBMIT' to submit comment once form is filled 'REPLY' to reply to an already existing comment @param msg: the body of the comment/review or remark @param score: star score of the review @param note: title of the review @param comid: comment id, needed for replying @param editor_type: the type of editor used for submitting the comment: 'textarea', 'fckeditor'. @return: the full html page. """ argd = wash_urlargd(form, {'action': (str, "DISPLAY"), 'msg': (str, ""), 'note': (str, ''), 'score': (int, 0), 'comid': (int, -1), 'editor_type':(str, ""), }) _ = gettext_set_language(argd['ln']) actions = ['DISPLAY', 'REPLY', 'SUBMIT'] uid = getUid(req) user_info = collect_user_info(req) (auth_code_1, auth_msg_1) = check_user_can_view_comments(user_info, self.recid) (auth_code_2, auth_msg_2) = check_user_can_send_comments(user_info, self.recid) if (auth_code_1 or auth_code_2) and user_info['email'] == 'guest' and not user_info['apache_user']: cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)}) target = '/youraccount/login' + \ make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \ CFG_SITE_URL + user_info['uri']}, {}) return redirect_to_url(req, target) elif (auth_code_1 or auth_code_2): return page_not_authorized(req, "../", \ text = auth_msg_1 + auth_msg_2) can_attach_files = False (auth_code, auth_msg) = check_user_can_attach_file_to_comments(user_info, self.recid) if not auth_code: 
can_attach_files = True - client_ip_address = get_client_ip_address(req) + client_ip_address = req.remote_ip check_warnings = [] (ok, problem) = check_recID_is_in_range(self.recid, check_warnings, argd['ln']) if ok: title, description, keywords = websearch_templates.tmpl_record_page_header_content(req, self.recid, argd['ln']) navtrail = create_navtrail_links(cc=guess_primary_collection_of_a_record(self.recid)) if navtrail: navtrail += ' > ' navtrail += '<a class="navtrail" href="%s/record/%s?ln=%s">'% (CFG_SITE_URL, self.recid, argd['ln']) navtrail += title navtrail += '</a>' navtrail += '> <a class="navtrail" href="%s/record/%s/%s/?ln=%s">%s</a>' % (CFG_SITE_URL, self.recid, self.discussion==1 and 'reviews' or 'comments', argd['ln'], self.discussion==1 and _('Reviews') or _('Comments')) if argd['action'] not in actions: argd['action'] = 'DISPLAY' # is page allowed to be viewed if uid == -1 or (not CFG_WEBCOMMENT_ALLOW_COMMENTS and not CFG_WEBCOMMENT_ALLOW_REVIEWS): return page_not_authorized(req, "../comments/add", navmenuid='search') # if guest, must log in first if isGuestUser(uid): referer = "%s/record/%s/%s/add?ln=%s&comid=%s&action=%s&score=%s" % (CFG_SITE_URL, self.recid, self.discussion == 1 and 'reviews' or 'comments', argd['ln'], argd['comid'], argd['action'], argd['score']) msg = _("Before you add your comment, you need to %(x_url_open)slogin%(x_url_close)s first.") % { 'x_url_open': '<a href="%s/youraccount/login?referer=%s">' % \ (CFG_SITE_SECURE_URL, urllib.quote(referer)), 'x_url_close': '</a>'} return page(title=_("Login"), body=msg, navtrail=navtrail, uid=uid, language=CFG_SITE_LANG, verbose=1, req=req, navmenuid='search') # user logged in else: (body, errors, warnings) = perform_request_add_comment_or_remark(recID=self.recid, ln=argd['ln'], uid=uid, action=argd['action'], msg=argd['msg'], note=argd['note'], score=argd['score'], reviews=self.discussion, comID=argd['comid'], client_ip_address=client_ip_address, editor_type=argd['editor_type'], 
can_attach_files=can_attach_files) if self.discussion: title = _("Add Review") else: title = _("Add Comment") return page(title=title, body=body, navtrail=navtrail, uid=uid, language=CFG_SITE_LANG, verbose=1, errors=errors, warnings=warnings, req=req, navmenuid='search') # id not in range else: return page(title=_("Record Not Found"), body=problem, uid=uid, verbose=1, req=req, warnings=check_warnings, errors=[], navmenuid='search') def vote(self, req, form): """ Vote positively or negatively for a comment/review. @param comid: comment/review id @param com_value: +1 to vote positively -1 to vote negatively @param recid: the id of the record the comment/review is associated with @param ln: language @param do: display order hh = highest helpful score, review only lh = lowest helpful score, review only hs = highest star score, review only ls = lowest star score, review only od = oldest date nd = newest date @param ds: display since all= no filtering by date nd = n days ago nw = n weeks ago nm = n months ago ny = n years ago where n is a single digit integer between 0 and 9 @param nb: number of results per page @param p: results page @param referer: http address of the calling function to redirect to (refresh) @param reviews: boolean, enabled for reviews, disabled for comments """ argd = wash_urlargd(form, {'comid': (int, -1), 'com_value': (int, 0), 'recid': (int, -1), 'do': (str, "od"), 'ds': (str, "all"), 'nb': (int, 100), 'p': (int, 1), 'referer': (str, None) }) - client_ip_address = get_client_ip_address(req) + client_ip_address = req.remote_ip uid = getUid(req) user_info = collect_user_info(req) (auth_code, auth_msg) = check_user_can_view_comments(user_info, self.recid) if auth_code and user_info['email'] == 'guest' and not user_info['apache_user']: cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)}) target = '/youraccount/login' + \ make_canonical_urlargd({'action': cookie, 'ln' : 
argd['ln'], 'referer' : \ CFG_SITE_URL + user_info['uri']}, {}) return redirect_to_url(req, target) elif auth_code: return page_not_authorized(req, "../", \ text = auth_msg) success = perform_request_vote(argd['comid'], client_ip_address, argd['com_value'], uid) if argd['referer']: argd['referer'] += "?ln=%s&do=%s&ds=%s&nb=%s&p=%s&voted=%s&" % ( argd['ln'], argd['do'], argd['ds'], argd['nb'], argd['p'], success) redirect_to_url(req, argd['referer']) else: #Note: sent to comments display referer = "%s/record/%s/%s?&ln=%s&voted=1" referer %= (CFG_SITE_URL, self.recid, self.discussion == 1 and 'reviews' or 'comments', argd['ln']) redirect_to_url(req, referer) def report(self, req, form): """ Report a comment/review for inappropriate content @param comid: comment/review id @param recid: the id of the record the comment/review is associated with @param ln: language @param do: display order hh = highest helpful score, review only lh = lowest helpful score, review only hs = highest star score, review only ls = lowest star score, review only od = oldest date nd = newest date @param ds: display since all= no filtering by date nd = n days ago nw = n weeks ago nm = n months ago ny = n years ago where n is a single digit integer between 0 and 9 @param nb: number of results per page @param p: results page @param referer: http address of the calling function to redirect to (refresh) @param reviews: boolean, enabled for reviews, disabled for comments """ argd = wash_urlargd(form, {'comid': (int, -1), 'recid': (int, -1), 'do': (str, "od"), 'ds': (str, "all"), 'nb': (int, 100), 'p': (int, 1), 'referer': (str, None) }) - client_ip_address = get_client_ip_address(req) + client_ip_address = req.remote_ip uid = getUid(req) user_info = collect_user_info(req) (auth_code, auth_msg) = check_user_can_view_comments(user_info, self.recid) if auth_code and user_info['email'] == 'guest' and not user_info['apache_user']: cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : 
guess_primary_collection_of_a_record(self.recid)}) target = '/youraccount/login' + \ make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \ CFG_SITE_URL + user_info['uri']}, {}) return redirect_to_url(req, target) elif auth_code: return page_not_authorized(req, "../", \ text = auth_msg) success = perform_request_report(argd['comid'], client_ip_address, uid) if argd['referer']: argd['referer'] += "?ln=%s&do=%s&ds=%s&nb=%s&p=%s&reported=%s&" % (argd['ln'], argd['do'], argd['ds'], argd['nb'], argd['p'], str(success)) redirect_to_url(req, argd['referer']) else: #Note: sent to comments display referer = "%s/record/%s/%s/display?ln=%s&voted=1" referer %= (CFG_SITE_URL, self.recid, self.discussion==1 and 'reviews' or 'comments', argd['ln']) redirect_to_url(req, referer) class WebInterfaceCommentsFiles(WebInterfaceDirectory): """Handle upload and access to files for comments. The upload is currently only available through the FCKeditor. """ _exports = ['put'] # 'get' is handled by _lookup(..) def __init__(self, recid=-1, reviews=0): self.recid = recid self.discussion = reviews # 0:comments, 1:reviews def _lookup(self, component, path): """ This handler is invoked for the dynamic URLs (for getting and putting attachments) Eg: /record/5953/comments/attachments/get/652/file/myfile.pdf /record/5953/comments/attachments/get/550/image/myfigure.png """ if component == 'get' and len(path) > 2: uid = path[0] # uid of the submitter file_type = path[1] # file, image, flash or media (as # defined by FCKeditor) if file_type in ['file', 'image', 'flash', 'media']: file_name = '/'.join(path[2:]) # the filename def answer_get(req, form): """Accessing files attached to comments.""" form['file'] = file_name form['type'] = file_type form['uid'] = uid return self._get(req, form) return answer_get, [] # All other cases: file not found return None, [] def _get(self, req, form): """ Returns a file attached to a comment. 
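`_lookup` above recognises dynamic attachment URLs of the form `get/<uid>/<type>/<filename>`, where the filename may itself contain slashes. The parsing rule can be sketched as a pure function (illustrative only; the real handler wraps the result in a closure):

```python
def parse_attachment_url(component, path):
    # component is the first URL part after /attachments/, path the
    # remaining parts, e.g. ['652', 'file', 'sub', 'myfile.pdf'].
    if component == 'get' and len(path) > 2:
        uid, file_type = path[0], path[1]
        # file_type values are the kinds defined by the FCKeditor.
        if file_type in ('file', 'image', 'flash', 'media'):
            return (uid, file_type, '/'.join(path[2:]))
    return None

parsed = parse_attachment_url('get', ['652', 'file', 'sub', 'myfile.pdf'])
```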
A file is attached to a comment, by a user (who is the author of the comment), and is of a certain type (file, image, etc). Therefore these 3 values are part of the URL. Eg: CFG_SITE_URL/record/5953/comments/attachments/get/652/file/myfile.pdf """ argd = wash_urlargd(form, {'file': (str, None), 'type': (str, None), 'uid': (int, 0)}) # Can user view this record, i.e. can user access its # attachments? uid = getUid(req) user_info = collect_user_info(req) (auth_code, auth_msg) = check_user_can_view_comments(user_info, self.recid) if auth_code and user_info['email'] == 'guest' and not user_info['apache_user']: cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)}) target = '/youraccount/login' + \ make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \ CFG_SITE_URL + user_info['uri']}, {}) return redirect_to_url(req, target) elif auth_code: return page_not_authorized(req, "../", \ text = auth_msg) if not argd['file'] is None: # Prepare path to file on disk. Normalize the path so that # ../ and other dangerous components are removed. path = os.path.abspath('/opt/cds-invenio/var/data/comments/' + \ str(self.recid) + '/' + str(argd['uid']) + \ '/' + argd['type'] + '/' + argd['file']) # Check that we are really accessing attachements # directory, for the declared record. if path.startswith('/opt/cds-invenio/var/data/comments/' + \ str(self.recid)) and \ os.path.exists(path): return stream_file(req, path) # Send error 404 in all other cases return(apache.HTTP_NOT_FOUND) def put(self, req, form): """ Process requests received from FCKeditor to upload files, etc. 
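`_get` builds the on-disk path with `os.path.abspath` and then verifies the result still lies under the record's attachment directory, which is what defeats `../` traversal in the requested filename. A self-contained sketch of that guard (directory layout is illustrative):

```python
import os

def safe_attachment_path(base_dir, recid, uid, file_type, file_name):
    # Normalise away '../' components, then verify the result is
    # still inside the attachment directory of the declared record.
    path = os.path.abspath(os.path.join(
        base_dir, str(recid), str(uid), file_type, file_name))
    prefix = os.path.abspath(os.path.join(base_dir, str(recid)))
    if not path.startswith(prefix + os.sep):
        return None  # escaped the record's directory: reject
    return path

ok = safe_attachment_path('/data/comments', 5953, 652, 'file', 'myfile.pdf')
bad = safe_attachment_path('/data/comments', 5953, 652, 'file',
                           '../../../../etc/passwd')
```

The sketch appends `os.sep` before comparing, which also rules out sibling directories sharing a numeric prefix (record 595 vs 5953); the original compares the bare prefix.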
""" if not fckeditor_available: return uid = getUid(req) # URL where the file can be fetched after upload user_files_path = '%(CFG_SITE_URL)s/record/%(recid)i/comments/attachments/get/%(uid)s' % \ {'uid': uid, 'recid': self.recid, 'CFG_SITE_URL': CFG_SITE_URL} # Path to directory where uploaded files are saved user_files_absolute_path = '%(CFG_PREFIX)s/var/data/comments/%(recid)s/%(uid)s' % \ {'uid': uid, 'recid': self.recid, 'CFG_PREFIX': CFG_PREFIX} # Create a Connector instance to handle the request conn = FCKeditorConnectorInvenio(form, recid=self.recid, uid=uid, allowed_commands=['QuickUpload'], allowed_types = ['File', 'Image', 'Flash', 'Media'], user_files_path = user_files_path, user_files_absolute_path = user_files_absolute_path) # Check that user can upload attachments for comments. user_info = collect_user_info(req) (auth_code, auth_msg) = check_user_can_attach_file_to_comments(user_info, self.recid) if user_info['email'] == 'guest' and not user_info['apache_user']: # User is guest: must login prior to upload data = conn.sendUploadResults(1, '', '', 'Please login before uploading file.') elif auth_code: # User cannot submit data = conn.sendUploadResults(1, '', '', 'Sorry, you are not allowed to submit files.') else: # Process the upload and get the response data = conn.doResponse() # Transform the headers into something ok for mod_python for header in conn.headers: if not header is None: if header[0] == 'Content-Type': req.content_type = header[1] else: req.headers_out[header[0]] = header[1] # Send our response req.send_http_header() req.write(data) diff --git a/modules/websearch/lib/search_engine.py b/modules/websearch/lib/search_engine.py index ae724c50e..4d3b25ea8 100644 --- a/modules/websearch/lib/search_engine.py +++ b/modules/websearch/lib/search_engine.py @@ -1,4537 +1,4534 @@ # -*- coding: utf-8 -*- ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. 
## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. # pylint: disable-msg=C0301 """CDS Invenio Search Engine in mod_python.""" __lastupdated__ = """$Date$""" __revision__ = "$Id$" ## import general modules: import cgi import copy import string import os import re import time import urllib import urlparse import zlib ## import CDS Invenio stuff: from invenio.config import \ CFG_CERN_SITE, \ CFG_OAI_ID_FIELD, \ CFG_WEBCOMMENT_ALLOW_REVIEWS, \ CFG_WEBSEARCH_CALL_BIBFORMAT, \ CFG_WEBSEARCH_CREATE_SIMILARLY_NAMED_AUTHORS_LINK_BOX, \ CFG_WEBSEARCH_FIELDS_CONVERT, \ CFG_WEBSEARCH_NB_RECORDS_TO_SORT, \ CFG_WEBSEARCH_SEARCH_CACHE_SIZE, \ CFG_WEBSEARCH_USE_JSMATH_FOR_FORMATS, \ CFG_WEBSEARCH_USE_ALEPH_SYSNOS, \ CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE, \ CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG, \ CFG_BIBRANK_SHOW_DOWNLOAD_GRAPHS, \ CFG_SITE_LANG, \ CFG_SITE_NAME, \ CFG_LOGDIR, \ CFG_SITE_URL from invenio.search_engine_config import InvenioWebSearchUnknownCollectionError from invenio.bibrecord import create_record from invenio.bibrank_record_sorter import get_bibrank_methods, rank_records, is_method_valid from invenio.bibrank_downloads_similarity import register_page_view_event, calculate_reading_similarity_list from invenio.bibindex_engine_stemmer import stem from invenio.bibformat import format_record, format_records, 
get_output_format_content_type, create_excel from invenio.bibformat_config import CFG_BIBFORMAT_USE_OLD_BIBFORMAT from invenio.bibrank_downloads_grapher import create_download_history_graph_and_box from invenio.data_cacher import DataCacher from invenio.websearch_external_collections import print_external_results_overview, perform_external_collection_search from invenio.access_control_admin import acc_get_action_id from invenio.access_control_config import VIEWRESTRCOLL, \ CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS from invenio.websearchadminlib import get_detailed_page_tabs from invenio.intbitset import intbitset as HitSet from invenio.dbquery import DatabaseError from invenio.access_control_engine import acc_authorize_action from invenio.errorlib import register_exception from invenio.textutils import encode_for_xml import invenio.template webstyle_templates = invenio.template.load('webstyle') webcomment_templates = invenio.template.load('webcomment') from invenio.bibrank_citation_searcher import calculate_cited_by_list, \ calculate_co_cited_with_list, get_records_with_num_cites, get_self_cited_by from invenio.bibrank_citation_grapher import create_citation_history_graph_and_box from invenio.dbquery import run_sql, run_sql_cached, get_table_update_time, Error from invenio.webuser import getUid, collect_user_info from invenio.webpage import pageheaderonly, pagefooteronly, create_error_box from invenio.messages import gettext_set_language from invenio.search_engine_query_parser import SearchQueryParenthesisedParser, \ InvenioWebSearchQueryParserException, SpiresToInvenioSyntaxConverter -try: - from mod_python import apache -except ImportError, e: - pass # ignore user personalisation, needed e.g. 
for command-line +from invenio import webinterface_handler_wsgi_utils as apache try: import invenio.template websearch_templates = invenio.template.load('websearch') except: pass ## global vars: cfg_nb_browse_seen_records = 100 # limit of the number of records to check when browsing certain collection cfg_nicely_ordered_collection_list = 0 # do we propose collection list nicely ordered or alphabetical? ## precompile some often-used regexp for speed reasons: re_word = re.compile('[\s]') re_quotes = re.compile('[\'\"]') re_doublequote = re.compile('\"') re_equal = re.compile('\=') re_logical_and = re.compile('\sand\s', re.I) re_logical_or = re.compile('\sor\s', re.I) re_logical_not = re.compile('\snot\s', re.I) re_operators = re.compile(r'\s([\+\-\|])\s') re_pattern_wildcards_at_beginning = re.compile(r'(\s)[\*\%]+') re_pattern_single_quotes = re.compile("'(.*?)'") re_pattern_double_quotes = re.compile("\"(.*?)\"") re_pattern_regexp_quotes = re.compile("\/(.*?)\/") re_pattern_short_words = re.compile(r'([\s\"]\w{1,3})[\*\%]+') re_pattern_space = re.compile("__SPACE__") re_pattern_today = re.compile("\$TODAY\$") re_pattern_parens = re.compile(r'\([^\)]+\s+[^\)]+\)') re_unicode_lowercase_a = re.compile(unicode(r"(?u)[áàäâãå]", "utf-8")) re_unicode_lowercase_ae = re.compile(unicode(r"(?u)[æ]", "utf-8")) re_unicode_lowercase_e = re.compile(unicode(r"(?u)[éèëê]", "utf-8")) re_unicode_lowercase_i = re.compile(unicode(r"(?u)[íìïî]", "utf-8")) re_unicode_lowercase_o = re.compile(unicode(r"(?u)[óòöôõø]", "utf-8")) re_unicode_lowercase_u = re.compile(unicode(r"(?u)[úùüû]", "utf-8")) re_unicode_lowercase_y = re.compile(unicode(r"(?u)[ýÿ]", "utf-8")) re_unicode_lowercase_c = re.compile(unicode(r"(?u)[çć]", "utf-8")) re_unicode_lowercase_n = re.compile(unicode(r"(?u)[ñ]", "utf-8")) re_unicode_uppercase_a = re.compile(unicode(r"(?u)[ÁÀÄÂÃÅ]", "utf-8")) re_unicode_uppercase_ae = re.compile(unicode(r"(?u)[Æ]", "utf-8")) re_unicode_uppercase_e = re.compile(unicode(r"(?u)[ÉÈËÊ]", 
"utf-8")) re_unicode_uppercase_i = re.compile(unicode(r"(?u)[ÍÌÏÎ]", "utf-8")) re_unicode_uppercase_o = re.compile(unicode(r"(?u)[ÓÒÖÔÕØ]", "utf-8")) re_unicode_uppercase_u = re.compile(unicode(r"(?u)[ÚÙÜÛ]", "utf-8")) re_unicode_uppercase_y = re.compile(unicode(r"(?u)[Ý]", "utf-8")) re_unicode_uppercase_c = re.compile(unicode(r"(?u)[ÇĆ]", "utf-8")) re_unicode_uppercase_n = re.compile(unicode(r"(?u)[Ñ]", "utf-8")) re_latex_lowercase_a = re.compile("\\\\[\"H'`~^vu=k]\{?a\}?") re_latex_lowercase_ae = re.compile("\\\\ae\\{\\}?") re_latex_lowercase_e = re.compile("\\\\[\"H'`~^vu=k]\\{?e\\}?") re_latex_lowercase_i = re.compile("\\\\[\"H'`~^vu=k]\\{?i\\}?") re_latex_lowercase_o = re.compile("\\\\[\"H'`~^vu=k]\\{?o\\}?") re_latex_lowercase_u = re.compile("\\\\[\"H'`~^vu=k]\\{?u\\}?") re_latex_lowercase_y = re.compile("\\\\[\"']\\{?y\\}?") re_latex_lowercase_c = re.compile("\\\\['uc]\\{?c\\}?") re_latex_lowercase_n = re.compile("\\\\[c'~^vu]\\{?n\\}?") re_latex_uppercase_a = re.compile("\\\\[\"H'`~^vu=k]\\{?A\\}?") re_latex_uppercase_ae = re.compile("\\\\AE\\{?\\}?") re_latex_uppercase_e = re.compile("\\\\[\"H'`~^vu=k]\\{?E\\}?") re_latex_uppercase_i = re.compile("\\\\[\"H'`~^vu=k]\\{?I\\}?") re_latex_uppercase_o = re.compile("\\\\[\"H'`~^vu=k]\\{?O\\}?") re_latex_uppercase_u = re.compile("\\\\[\"H'`~^vu=k]\\{?U\\}?") re_latex_uppercase_y = re.compile("\\\\[\"']\\{?Y\\}?") re_latex_uppercase_c = re.compile("\\\\['uc]\\{?C\\}?") re_latex_uppercase_n = re.compile("\\\\[c'~^vu]\\{?N\\}?") class RestrictedCollectionDataCacher(DataCacher): def __init__(self): def cache_filler(): ret = [] try: viewcollid = acc_get_action_id(VIEWRESTRCOLL) res = run_sql("""SELECT DISTINCT ar.value FROM accROLE_accACTION_accARGUMENT raa JOIN accARGUMENT ar ON raa.id_accARGUMENT = ar.id WHERE ar.keyword = 'collection' AND raa.id_accACTION = %s""", (viewcollid,)) except Exception: # database problems, return empty cache return [] for coll in res: ret.append(coll[0]) return ret def 
timestamp_verifier(): return max(get_table_update_time('accROLE_accACTION_accARGUMENT'), get_table_update_time('accARGUMENT')) DataCacher.__init__(self, cache_filler, timestamp_verifier) def collection_restricted_p(collection): restricted_collection_cache.recreate_cache_if_needed() return collection in restricted_collection_cache.cache try: restricted_collection_cache.is_ok_p except Exception: restricted_collection_cache = RestrictedCollectionDataCacher() def get_permitted_restricted_collections(user_info): """Return a list of collections that are restricted but for which the user is authorized.""" restricted_collection_cache.recreate_cache_if_needed() ret = [] for collection in restricted_collection_cache.cache: if acc_authorize_action(user_info, 'viewrestrcoll', collection=collection)[0] == 0: ret.append(collection) return ret def is_user_owner_of_record(user_info, recid): """ Check if the user is the owner of the record, i.e. is the submitter and/or belongs to an owner-like group authorized to 'see' the record. @param user_info: the user_info dictionary that describes the user. @type user_info: user_info dictionary @param recid: the record identifier. @type recid: positive integer @return: True if the user is 'owner' of the record; False otherwise @rtype: bool """ authorized_emails_or_group = [] for tag in CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS: authorized_emails_or_group.extend(get_fieldvalues(recid, tag)) for email_or_group in authorized_emails_or_group: if email_or_group in user_info['group']: return True email = email_or_group.strip().lower() if user_info['email'].strip().lower() == email: return True return False def check_user_can_view_record(user_info, recid): """ Check if the user is authorized to view the given recid. The function grants access in two cases: either the user has author rights on this record, or view rights to the primary collection this record belongs to. @param user_info: the user_info dictionary that describes the user.
@type user_info: user_info dictionary @param recid: the record identifier. @type recid: positive integer @return: (0, ''), when authorization is granted, (>0, 'message') when authorization is not granted @rtype: (int, string) """ record_primary_collection = guess_primary_collection_of_a_record(recid) if collection_restricted_p(record_primary_collection): (auth_code, auth_msg) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=record_primary_collection) if auth_code == 0 or is_user_owner_of_record(user_info, recid): return (0, '') else: return (auth_code, auth_msg) else: return (0, '') class IndexStemmingDataCacher(DataCacher): """ Provides cache for stemming information for word/phrase indexes. This class is not to be used directly; use function get_index_stemming_language() instead. """ def __init__(self): def cache_filler(): try: res = run_sql("""SELECT id, stemming_language FROM idxINDEX""") except DatabaseError: # database problems, return empty cache return {} return dict(res) def timestamp_verifier(): return get_table_update_time('idxINDEX') DataCacher.__init__(self, cache_filler, timestamp_verifier) try: index_stemming_cache.is_ok_p except Exception: index_stemming_cache = IndexStemmingDataCacher() def get_index_stemming_language(index_id): """Return stemming language for given index.""" index_stemming_cache.recreate_cache_if_needed() return index_stemming_cache.cache[index_id] class CollectionRecListDataCacher(DataCacher): """ Provides cache for collection reclist hitsets. This class is not to be used directly; use function get_collection_reclist() instead.
""" def __init__(self): def cache_filler(): ret = {} try: res = run_sql("SELECT name,reclist FROM collection") except Exception: # database problems, return empty cache return {} for name, reclist in res: ret[name] = None # this will be filled later during runtime by calling get_collection_reclist(coll) return ret def timestamp_verifier(): return get_table_update_time('collection') DataCacher.__init__(self, cache_filler, timestamp_verifier) try: if not collection_reclist_cache.is_ok_p: raise Exception except Exception: collection_reclist_cache = CollectionRecListDataCacher() def get_collection_reclist(coll): """Return hitset of recIDs that belong to the collection 'coll'.""" collection_reclist_cache.recreate_cache_if_needed() if not collection_reclist_cache.cache[coll]: # not yet it the cache, so calculate it and fill the cache: set = HitSet() query = "SELECT nbrecs,reclist FROM collection WHERE name=%s" res = run_sql(query, (coll, ), 1) if res: try: set = HitSet(res[0][1]) except: pass collection_reclist_cache.cache[coll] = set # finally, return reclist: return collection_reclist_cache.cache[coll] class SearchResultsCache(DataCacher): """ Provides temporary lazy cache for Search Results. Useful when users click on `next page'. """ def __init__(self): def cache_filler(): return {} def timestamp_verifier(): return '1970-01-01 00:00:00' # lazy cache is always okay; # its filling is governed by # CFG_WEBSEARCH_SEARCH_CACHE_SIZE DataCacher.__init__(self, cache_filler, timestamp_verifier) try: if not search_results_cache.is_ok_p: raise Exception except Exception: search_results_cache = SearchResultsCache() class CollectionI18nNameDataCacher(DataCacher): """ Provides cache for I18N collection names. This class is not to be used directly; use function get_coll_i18nname() instead. 
""" def __init__(self): def cache_filler(): ret = {} try: res = run_sql("SELECT c.name,cn.ln,cn.value FROM collectionname AS cn, collection AS c WHERE cn.id_collection=c.id AND cn.type='ln'") # ln=long name except Exception: # database problems return {} for c, ln, i18nname in res: if i18nname: if not ret.has_key(c): ret[c] = {} ret[c][ln] = i18nname return ret def timestamp_verifier(): return get_table_update_time('collectionname') DataCacher.__init__(self, cache_filler, timestamp_verifier) try: if not collection_i18nname_cache.is_ok_p: raise Exception except Exception: collection_i18nname_cache = CollectionI18nNameDataCacher() def get_coll_i18nname(c, ln=CFG_SITE_LANG, verify_cache_timestamp=True): """ Return nicely formatted collection name (of the name type `ln' (=long name)) for collection C in language LN. This function uses collection_i18nname_cache, but it verifies whether the cache is up-to-date first by default. This verification step is performed by checking the DB table update time. So, if you call this function 1000 times, it can get very slow because it will do 1000 table update time verifications, even though collection names change not that often. Hence the parameter VERIFY_CACHE_TIMESTAMP which, when set to False, will assume the cache is already up-to-date. This is useful namely in the generation of collection lists for the search results page. """ if verify_cache_timestamp: collection_i18nname_cache.recreate_cache_if_needed() out = c try: out = collection_i18nname_cache.cache[c][ln] except KeyError: pass # translation in LN does not exist return out class FieldI18nNameDataCacher(DataCacher): """ Provides cache for I18N field names. This class is not to be used directly; use function get_field_i18nname() instead. 
""" def __init__(self): def cache_filler(): ret = {} try: res = run_sql("SELECT f.name,fn.ln,fn.value FROM fieldname AS fn, field AS f WHERE fn.id_field=f.id AND fn.type='ln'") # ln=long name except Exception: # database problems, return empty cache return {} for f, ln, i18nname in res: if i18nname: if not ret.has_key(f): ret[f] = {} ret[f][ln] = i18nname return ret def timestamp_verifier(): return get_table_update_time('fieldname') DataCacher.__init__(self, cache_filler, timestamp_verifier) try: if not field_i18nname_cache.is_ok_p: raise Exception except Exception: field_i18nname_cache = FieldI18nNameDataCacher() def get_field_i18nname(f, ln=CFG_SITE_LANG, verify_cache_timestamp=True): """ Return nicely formatted field name (of type 'ln', 'long name') for field F in language LN. If VERIFY_CACHE_TIMESTAMP is set to True, then verify DB timestamp and field I18N name cache timestamp and refresh cache from the DB if needed. Otherwise don't bother checking DB timestamp and return the cached value. (This is useful when get_field_i18nname is called inside a loop.) """ if verify_cache_timestamp: field_i18nname_cache.recreate_cache_if_needed() out = f try: out = field_i18nname_cache.cache[f][ln] except KeyError: pass # translation in LN does not exist return out def get_alphabetically_ordered_collection_list(level=0, ln=CFG_SITE_LANG): """Returns nicely ordered (score respected) list of collections, more exactly list of tuples (collection name, printable collection name). Suitable for create_search_box().""" out = [] res = run_sql_cached("SELECT id,name FROM collection ORDER BY name ASC", affected_tables=['collection',]) for c_id, c_name in res: # make a nice printable name (e.g. truncate c_printable for # long collection names in given language): c_printable_fullname = get_coll_i18nname(c_name, ln, False) c_printable = wash_index_term(c_printable_fullname, 30, False) if c_printable != c_printable_fullname: c_printable = c_printable + "..." 
if level: c_printable = " " + level * '-' + " " + c_printable out.append([c_name, c_printable]) return out def get_nicely_ordered_collection_list(collid=1, level=0, ln=CFG_SITE_LANG): """Returns nicely ordered (score respected) list of collections, more exactly list of tuples (collection name, printable collection name). Suitable for create_search_box().""" colls_nicely_ordered = [] res = run_sql("""SELECT c.name,cc.id_son FROM collection_collection AS cc, collection AS c WHERE c.id=cc.id_son AND cc.id_dad=%s ORDER BY score DESC""", (collid, )) for c, cid in res: # make a nice printable name (e.g. truncate c_printable for # long collection names in given language): c_printable_fullname = get_coll_i18nname(c, ln, False) c_printable = wash_index_term(c_printable_fullname, 30, False) if c_printable != c_printable_fullname: c_printable = c_printable + "..." if level: c_printable = " " + level * '-' + " " + c_printable colls_nicely_ordered.append([c, c_printable]) colls_nicely_ordered = colls_nicely_ordered + get_nicely_ordered_collection_list(cid, level+1, ln=ln) return colls_nicely_ordered def get_index_id_from_field(field): """ Return index id with name corresponding to FIELD, or the first index id where the logical field code named FIELD is indexed. Return zero in case there is no index defined for this field. Example: field='author', output=4. """ out = 0 if field == '': field = 'global' # empty string field means 'global' index (field 'anyfield') # first look in the index table: res = run_sql("""SELECT id FROM idxINDEX WHERE name=%s""", (field,)) if res: out = res[0][0] return out # not found in the index table, now look in the field table: res = run_sql("""SELECT w.id FROM idxINDEX AS w, idxINDEX_field AS wf, field AS f WHERE f.code=%s AND wf.id_field=f.id AND w.id=wf.id_idxINDEX LIMIT 1""", (field,)) if res: out = res[0][0] return out def get_words_from_pattern(pattern): "Returns list of whitespace-separated words from pattern." 
words = {} for word in string.split(pattern): if not words.has_key(word): words[word] = 1; return words.keys() def create_basic_search_units(req, p, f, m=None, of='hb'): """Splits search pattern and search field into a list of independently searchable units. - A search unit consists of '(operator, pattern, field, type, hitset)' tuples where 'operator' is set union (|), set intersection (+) or set exclusion (-); 'pattern' is either a word (e.g. muon*) or a phrase (e.g. 'nuclear physics'); 'field' is either a code like 'title' or MARC tag like '100__a'; 'type' is the search type ('w' for word file search, 'a' for access file search). - Optionally, the function accepts the match type argument 'm'. If it is set (e.g. from advanced search interface), then it performs this kind of matching. If it is not set, then a guess is made. 'm' can have values: 'a'='all of the words', 'o'='any of the words', 'p'='phrase/substring', 'r'='regular expression', 'e'='exact value'. - Warnings are printed on req (when not None) in case of HTML output formats.""" opfts = [] # will hold (o,p,f,t,h) units # FIXME: quick hack for the journal index if f == 'journal': opfts.append(['+', p, f, 'w']) return opfts ## check arguments: is desired matching type set? if m: ## A - matching type is known; good! 
if m == 'e': # A1 - exact value: opfts.append(['+', p, f, 'a']) # '+' since we have only one unit elif m == 'p': # A2 - phrase/substring: opfts.append(['+', "%" + p + "%", f, 'a']) # '+' since we have only one unit elif m == 'r': # A3 - regular expression: opfts.append(['+', p, f, 'r']) # '+' since we have only one unit elif m == 'a' or m == 'w': # A4 - all of the words: p = strip_accents(p) # strip accents for 'w' mode, FIXME: delete when not needed for word in get_words_from_pattern(p): opfts.append(['+', word, f, 'w']) # '+' in all units elif m == 'o': # A5 - any of the words: p = strip_accents(p) # strip accents for 'w' mode, FIXME: delete when not needed for word in get_words_from_pattern(p): if len(opfts)==0: opfts.append(['+', word, f, 'w']) # '+' in the first unit else: opfts.append(['|', word, f, 'w']) # '|' in further units else: if of.startswith("h"): print_warning(req, "Matching type '%s' is not implemented yet." % cgi.escape(m), "Warning") opfts.append(['+', "%" + p + "%", f, 'w']) else: ## B - matching type is not known: let us try to determine it by some heuristics if f and p[0] == '"' and p[-1] == '"': ## B0 - does 'p' start and end by double quote, and is 'f' defined? => doing ACC search opfts.append(['+', p[1:-1], f, 'a']) elif f and p[0] == "'" and p[-1] == "'": ## B0bis - does 'p' start and end by single quote, and is 'f' defined? => doing ACC search opfts.append(['+', '%' + p[1:-1] + '%', f, 'a']) elif f and p[0] == "/" and p[-1] == "/": ## B0ter - does 'p' start and end by a slash, and is 'f' defined? => doing regexp search opfts.append(['+', p[1:-1], f, 'r']) elif f and string.find(p, ',') >= 0: ## B1 - does 'p' contain comma, and is 'f' defined? => doing ACC search opfts.append(['+', p, f, 'a']) elif f and str(f[0:2]).isdigit(): ## B2 - does 'f' exist and starts by two digits? 
=> doing ACC search opfts.append(['+', p, f, 'a']) else: ## B3 - doing WRD search, but maybe ACC too # search units are separated by spaces unless the space is within single or double quotes # so, let us replace temporarily any space within quotes by '__SPACE__' p = re_pattern_single_quotes.sub(lambda x: "'"+string.replace(x.group(1), ' ', '__SPACE__')+"'", p) p = re_pattern_double_quotes.sub(lambda x: "\""+string.replace(x.group(1), ' ', '__SPACE__')+"\"", p) p = re_pattern_regexp_quotes.sub(lambda x: "/"+string.replace(x.group(1), ' ', '__SPACE__')+"/", p) # wash argument: p = re_equal.sub(":", p) p = re_logical_and.sub(" ", p) p = re_logical_or.sub(" |", p) p = re_logical_not.sub(" -", p) p = re_operators.sub(r' \1', p) for pi in string.split(p): # iterate through separated units (or items, as "pi" stands for "p item") pi = re_pattern_space.sub(" ", pi) # replace back '__SPACE__' by ' ' # firstly, determine set operator if pi[0] == '+' or pi[0] == '-' or pi[0] == '|': oi = pi[0] pi = pi[1:] else: # okay, there is no operator, so let us decide what to do by default oi = '+' # by default we are doing set intersection... 
# secondly, determine search pattern and field: if string.find(pi, ":") > 0: fi, pi = string.split(pi, ":", 1) # test whether fi is a real index code or a MARC-tag defined code: if fi in get_fieldcodes() or '00' <= fi[:2] <= '99': pass else: # it is not, so join it back: fi, pi = f, fi + ":" + pi else: fi, pi = f, pi # look also for old ALEPH field names: if fi and CFG_WEBSEARCH_FIELDS_CONVERT.has_key(string.lower(fi)): fi = CFG_WEBSEARCH_FIELDS_CONVERT[string.lower(fi)] # wash 'pi' argument: if re_quotes.match(pi): # B3a - quotes are found => do ACC search (phrase search) if pi[0] == '"' and pi[-1] == '"': pi = string.replace(pi, '"', '') # remove quote signs opfts.append([oi, pi, fi, 'a']) elif pi[0] == "'" and pi[-1] == "'": pi = string.replace(pi, "'", "") # remove quote signs opfts.append([oi, "%" + pi + "%", fi, 'a']) else: # unbalanced quotes, so fall back to WRD query: opfts.append([oi, pi, fi, 'w']) elif fi and str(fi[0]).isdigit() and str(fi[1]).isdigit(): # B3b - fi exists and starts by two digits => do ACC search opfts.append([oi, pi, fi, 'a']) elif fi and not get_index_id_from_field(fi) and get_field_name(fi): # B3c - logical field fi exists but there is no WRD index for fi => try ACC search opfts.append([oi, pi, fi, 'a']) elif pi.startswith('/') and pi.endswith('/'): # B3d - pi has slashes around => do regexp search opfts.append([oi, pi[1:-1], fi, 'r']) else: # B3e - general case => do WRD search pi = strip_accents(pi) # strip accents for 'w' mode, FIXME: delete when not needed for pii in get_words_from_pattern(pi): opfts.append([oi, pii, fi, 'w']) ## sanity check: for i in range(0, len(opfts)): try: pi = opfts[i][1] if pi == '*': if of.startswith("h"): print_warning(req, "Ignoring standalone wildcard word.", "Warning") del opfts[i] if pi == '' or pi == ' ': fi = opfts[i][2] if fi: if of.startswith("h"): print_warning(req, "Ignoring empty <em>%s</em> search term."
% fi, "Warning") del opfts[i] except: pass ## return search units: return opfts def page_start(req, of, cc, aas, ln, uid, title_message=None, description='', keywords='', recID=-1, tab='', p=''): "Start page according to given output format." _ = gettext_set_language(ln) if not req: return # we were called from CLI if not title_message: title_message = _("Search Results") content_type = get_output_format_content_type(of) if of.startswith('x'): if of == 'xr': # we are doing RSS output req.content_type = "application/rss+xml" req.send_http_header() req.write("""<?xml version="1.0" encoding="UTF-8"?>\n""") else: # we are doing XML output: req.content_type = "text/xml" req.send_http_header() req.write("""<?xml version="1.0" encoding="UTF-8"?>\n""") elif of.startswith('t') or str(of[0:3]).isdigit(): # we are doing plain text output: req.content_type = "text/plain" req.send_http_header() elif of == "id": pass # nothing to do, we shall only return list of recIDs elif content_type == 'text/html': # we are doing HTML output: req.content_type = "text/html" req.send_http_header() if not description: description = "%s %s." 
% (cc, _("Search Results")) if not keywords: keywords = "%s, WebSearch, %s" % (get_coll_i18nname(CFG_SITE_NAME, ln, False), get_coll_i18nname(cc, ln, False)) ## generate RSS URL: argd = {} if req.args: argd = cgi.parse_qs(req.args) rssurl = websearch_templates.build_rss_url(argd) ## add jsmath if displaying single records (FIXME: find ## eventual better place to this code) if of.lower() in CFG_WEBSEARCH_USE_JSMATH_FOR_FORMATS: metaheaderadd = """ <script type='text/javascript'> jsMath = { Controls: {cookie: {printwarn: 0}} }; </script> <script src='/jsMath/easy/invenio-jsmath.js' type='text/javascript'></script> """ else: metaheaderadd = '' ## generate navtrail: navtrail = create_navtrail_links(cc, aas, ln) if navtrail != '': navtrail += ' > ' if (tab != '' or ((of != '' or of.lower() != 'hd') and of != 'hb')) and \ recID != -1: # If we are not in information tab in HD format, customize # the nav. trail to have a link back to main record. (Due # to the way perform_request_search() works, hb # (lowercase) is equal to hd) navtrail += ' <a class="navtrail" href="%s/record/%s">%s</a>' % \ (CFG_SITE_URL, recID, title_message) if (of != '' or of.lower() != 'hd') and of != 'hb': # Export format_name = of query = "SELECT name FROM format WHERE code=%s" res = run_sql(query, (of,)) if res: format_name = res[0][0] navtrail += ' > ' + format_name else: # Discussion, citations, etc. 
tabs tab_label = get_detailed_page_tabs(cc, ln=ln)[tab]['label'] navtrail += ' > ' + _(tab_label) else: navtrail += title_message if p: # we are serving search/browse results pages, so insert pattern: navtrail += ": " + cgi.escape(p) title_message = cgi.escape(p) + " - " + title_message ## finally, print page header: req.write(pageheaderonly(req=req, title=title_message, navtrail=navtrail, description=description, keywords=keywords, metaheaderadd=metaheaderadd, uid=uid, language=ln, navmenuid='search', navtrail_append_title_p=0, rssurl=rssurl)) req.write(websearch_templates.tmpl_search_pagestart(ln=ln)) #else: # req.send_http_header() def page_end(req, of="hb", ln=CFG_SITE_LANG): "End page according to given output format: e.g. close XML tags, add HTML footer, etc." if of == "id": return [] # empty recID list if not req: return # we were called from CLI if of.startswith('h'): req.write(websearch_templates.tmpl_search_pageend(ln = ln)) # pagebody end req.write(pagefooteronly(lastupdated=__lastupdated__, language=ln, req=req)) return "\n" def create_page_title_search_pattern_info(p, p1, p2, p3): """Create the search pattern bit for the page <title> web page HTML header. Basically combine p and (p1,p2,p3) together so that the page header may be filled whether we are in the Simple Search or Advanced Search interface contexts.""" out = "" if p: out = p else: out = p1 if p2: out += ' ' + p2 if p3: out += ' ' + p3 return out def create_inputdate_box(name="d1", selected_year=0, selected_month=0, selected_day=0, ln=CFG_SITE_LANG): "Produces 'From Date', 'Until Date' kind of selection box. Suitable for search options." 
_ = gettext_set_language(ln) box = "" # day box += """<select name="%sd">""" % name box += """<option value="">%s""" % _("any day") for day in range(1, 32): box += """<option value="%02d"%s>%02d""" % (day, is_selected(day, selected_day), day) box += """</select>""" # month box += """<select name="%sm">""" % name box += """<option value="">%s""" % _("any month") for mm, month in [(1, _("January")), (2, _("February")), (3, _("March")), (4, _("April")), \ (5, _("May")), (6, _("June")), (7, _("July")), (8, _("August")), \ (9, _("September")), (10, _("October")), (11, _("November")), (12, _("December"))]: box += """<option value="%02d"%s>%s""" % (mm, is_selected(mm, selected_month), month) box += """</select>""" # year box += """<select name="%sy">""" % name box += """<option value="">%s""" % _("any year") this_year = int(time.strftime("%Y", time.localtime())) for year in range(this_year-20, this_year+1): box += """<option value="%d"%s>%d""" % (year, is_selected(year, selected_year), year) box += """</select>""" return box def create_search_box(cc, colls, p, f, rg, sf, so, sp, rm, of, ot, aas, ln, p1, f1, m1, op1, p2, f2, m2, op2, p3, f3, m3, sc, pl, d1y, d1m, d1d, d2y, d2m, d2d, dt, jrec, ec, action=""): """Create search box for 'search again in the results page' functionality.""" # load the right message language _ = gettext_set_language(ln) # some computations cc_intl = get_coll_i18nname(cc, ln, False) cc_colID = get_colID(cc) colls_nicely_ordered = [] if cfg_nicely_ordered_collection_list: colls_nicely_ordered = get_nicely_ordered_collection_list(ln=ln) else: colls_nicely_ordered = get_alphabetically_ordered_collection_list(ln=ln) colls_nice = [] for (cx, cx_printable) in colls_nicely_ordered: if not cx.startswith("Unnamed collection"): colls_nice.append({ 'value' : cx, 'text' : cx_printable }) coll_selects = [] if colls and colls[0] != CFG_SITE_NAME: # some collections are defined, so print these first, and only then print 'add another collection' heading: for c in 
colls: if c: temp = [] temp.append({ 'value' : CFG_SITE_NAME, 'text' : '*** %s ***' % _("any public collection") }) for val in colls_nice: # print collection: if not val['value'].startswith("Unnamed collection"): temp.append({ 'value' : val['value'], 'text' : val['text'], 'selected' : (c == re.sub("^[\s\-]*","", val['value'])) }) coll_selects.append(temp) coll_selects.append([{ 'value' : '', 'text' : '*** %s ***' % _("add another collection") }] + colls_nice) else: # we searched in CFG_SITE_NAME, so print 'any public collection' heading coll_selects.append([{ 'value' : CFG_SITE_NAME, 'text' : '*** %s ***' % _("any public collection") }] + colls_nice) ## ranking methods ranks = [{ 'value' : '', 'text' : "- %s %s -" % (_("OR").lower(), _("rank by")), }] for (code, name) in get_bibrank_methods(cc_colID, ln): # propose found rank methods: ranks.append({ 'value' : code, 'text' : name, }) formats = [] query = """SELECT code,name FROM format WHERE visibility='1' ORDER BY name ASC""" res = run_sql(query) if res: # propose found formats: for code, name in res: formats.append({ 'value' : code, 'text' : name }) else: formats.append({'value' : 'hb', 'text' : _("HTML brief") }) # show collections in the search box?
(not if there is only one # collection defined, and not if we are in light search) show_colls = True if len(collection_reclist_cache.cache.keys()) == 1 or \ aas == -1: show_colls = False return websearch_templates.tmpl_search_box( ln = ln, aas = aas, cc_intl = cc_intl, cc = cc, ot = ot, sp = sp, action = action, fieldslist = get_searchwithin_fields(ln=ln, colID=cc_colID), f1 = f1, f2 = f2, f3 = f3, m1 = m1, m2 = m2, m3 = m3, p1 = p1, p2 = p2, p3 = p3, op1 = op1, op2 = op2, rm = rm, p = p, f = f, coll_selects = coll_selects, d1y = d1y, d2y = d2y, d1m = d1m, d2m = d2m, d1d = d1d, d2d = d2d, dt = dt, sort_fields = get_sortby_fields(ln=ln, colID=cc_colID), sf = sf, so = so, ranks = ranks, sc = sc, rg = rg, formats = formats, of = of, pl = pl, jrec = jrec, ec = ec, show_colls = show_colls, ) def create_navtrail_links(cc=CFG_SITE_NAME, aas=0, ln=CFG_SITE_LANG, self_p=1, tab=''): """Creates navigation trail links, i.e. links to collection ancestors (except Home collection). If aas==1, then links to Advanced Search interfaces; otherwise Simple Search. 
""" dads = [] for dad in get_coll_ancestors(cc): if dad != CFG_SITE_NAME: # exclude Home collection dads.append ((dad, get_coll_i18nname(dad, ln, False))) if self_p and cc != CFG_SITE_NAME: dads.append((cc, get_coll_i18nname(cc, ln, False))) return websearch_templates.tmpl_navtrail_links( aas=aas, ln=ln, dads=dads) def get_searchwithin_fields(ln='en', colID=None): """Retrieves the fields name used in the 'search within' selection box for the collection ID colID.""" res = None if colID: res = run_sql_cached("""SELECT f.code,f.name FROM field AS f, collection_field_fieldvalue AS cff WHERE cff.type='sew' AND cff.id_collection=%s AND cff.id_field=f.id ORDER BY cff.score DESC, f.name ASC""", (colID,), affected_tables=['field', 'collection_field_fieldvalue']) if not res: res = run_sql_cached("SELECT code,name FROM field ORDER BY name ASC", affected_tables=['field',]) fields = [{ 'value' : '', 'text' : get_field_i18nname("any field", ln, False) }] for field_code, field_name in res: if field_code and field_code != "anyfield": fields.append({ 'value' : field_code, 'text' : get_field_i18nname(field_name, ln, False) }) return fields def get_sortby_fields(ln='en', colID=None): """Retrieves the fields name used in the 'sort by' selection box for the collection ID colID.""" _ = gettext_set_language(ln) res = None if colID: res = run_sql_cached("""SELECT DISTINCT(f.code),f.name FROM field AS f, collection_field_fieldvalue AS cff WHERE cff.type='soo' AND cff.id_collection=%s AND cff.id_field=f.id ORDER BY cff.score DESC, f.name ASC""", (colID,), affected_tables=['field', 'collection_field_fieldvalue']) if not res: # no sort fields defined for this colID, try to take Home collection: res = run_sql_cached("""SELECT DISTINCT(f.code),f.name FROM field AS f, collection_field_fieldvalue AS cff WHERE cff.type='soo' AND cff.id_collection=%s AND cff.id_field=f.id ORDER BY cff.score DESC, f.name ASC""", (1,), affected_tables=['field', 'collection_field_fieldvalue']) if not res: # no sort 
fields defined for the Home collection, take all sort fields defined wherever they are: res = run_sql_cached("""SELECT DISTINCT(f.code),f.name FROM field AS f, collection_field_fieldvalue AS cff WHERE cff.type='soo' AND cff.id_field=f.id ORDER BY cff.score DESC, f.name ASC""", affected_tables=['field', 'collection_field_fieldvalue']) fields = [{ 'value' : '', 'text' : _("latest first") }] for field_code, field_name in res: if field_code and field_code != "anyfield": fields.append({ 'value' : field_code, 'text' : get_field_i18nname(field_name, ln, False) }) return fields def create_andornot_box(name='op', value='', ln='en'): "Returns HTML code for the AND/OR/NOT selection box." _ = gettext_set_language(ln) out = """ <select name="%s"> <option value="a"%s>%s <option value="o"%s>%s <option value="n"%s>%s </select> """ % (name, is_selected('a', value), _("AND"), is_selected('o', value), _("OR"), is_selected('n', value), _("AND NOT")) return out def create_matchtype_box(name='m', value='', ln='en'): "Returns HTML code for the 'match type' selection box." _ = gettext_set_language(ln) out = """ <select name="%s"> <option value="a"%s>%s <option value="o"%s>%s <option value="e"%s>%s <option value="p"%s>%s <option value="r"%s>%s </select> """ % (name, is_selected('a', value), _("All of the words:"), is_selected('o', value), _("Any of the words:"), is_selected('e', value), _("Exact phrase:"), is_selected('p', value), _("Partial phrase:"), is_selected('r', value), _("Regular expression:")) return out def is_selected(var, fld): "Checks if the two are equal, and if yes, returns ' selected'. Useful for select boxes." if type(var) is int and type(fld) is int: if var == fld: return " selected" elif str(var) == str(fld): return " selected" elif fld and len(fld)==3 and fld[0] == "w" and var == fld[1:]: return " selected" return "" def wash_colls(cc, c, split_colls=0): """Wash collection list by checking whether user has deselected anything under 'Narrow search'. 
Checks also if cc is a list or not. Return list of cc, colls_to_display, colls_to_search since the list of collections to display is different from that to search in. This is because users might have chosen the 'split by collection' functionality. The behaviour of "collections to display" depends solely on whether the user has deselected a particular collection: e.g. if the user started from the 'Articles and Preprints' page and deselected 'Preprints', then the collection to display is 'Articles'. If the user did not deselect anything, then the collection to display is 'Articles & Preprints'. The behaviour of "collections to search in" depends on the 'split_colls' parameter: * if it is equal to 1, then we can wash the colls list down and search solely in the collection the user started from; * if it is equal to 0, then we are splitting to the first level of collections, i.e. collections as they appear on the page we started to search from; The function raises the exception InvenioWebSearchUnknownCollectionError if cc or one of the c collections is not known.
""" colls_out = [] colls_out_for_display = [] # check what type is 'cc': if type(cc) is list: for ci in cc: if collection_reclist_cache.cache.has_key(ci): # yes this collection is real, so use it: cc = ci break else: # check once if cc is real: if not collection_reclist_cache.cache.has_key(cc): if cc: raise InvenioWebSearchUnknownCollectionError(cc) else: cc = CFG_SITE_NAME # cc is not set, so replace it with Home collection # check type of 'c' argument: if type(c) is list: colls = c else: colls = [c] # remove all 'unreal' collections: colls_real = [] for coll in colls: if collection_reclist_cache.cache.has_key(coll): colls_real.append(coll) else: if coll: raise InvenioWebSearchUnknownCollectionError(coll) colls = colls_real # check if some real collections remain: if len(colls)==0: colls = [cc] # then let us check the list of non-restricted "real" sons of 'cc' and compare it to 'coll': res = run_sql("""SELECT c.name FROM collection AS c, collection_collection AS cc, collection AS ccc WHERE c.id=cc.id_son AND cc.id_dad=ccc.id AND ccc.name=%s AND cc.type='r'""", (cc,)) l_cc_nonrestricted_sons = [] l_c = colls for row in res: if not collection_restricted_p(row[0]): l_cc_nonrestricted_sons.append(row[0]) l_c.sort() l_cc_nonrestricted_sons.sort() if l_cc_nonrestricted_sons == l_c: colls_out_for_display = [cc] # yep, washing permitted, it is sufficient to display 'cc' else: colls_out_for_display = colls # nope, we need to display all 'colls' successively # remove duplicates: colls_out_for_display_nondups=filter(lambda x, colls_out_for_display=colls_out_for_display: colls_out_for_display[x-1] not in colls_out_for_display[x:], range(1, len(colls_out_for_display)+1)) colls_out_for_display = map(lambda x, colls_out_for_display=colls_out_for_display:colls_out_for_display[x-1], colls_out_for_display_nondups) # second, let us decide on collection splitting: if split_colls == 0: # type A - no sons are wanted colls_out = colls_out_for_display # elif split_colls == 1: else: # 
type B - sons (first-level descendants) are wanted for coll in colls_out_for_display: coll_sons = get_coll_sons(coll) if coll_sons == []: colls_out.append(coll) else: colls_out = colls_out + coll_sons # remove duplicates: colls_out_nondups=filter(lambda x, colls_out=colls_out: colls_out[x-1] not in colls_out[x:], range(1, len(colls_out)+1)) colls_out = map(lambda x, colls_out=colls_out:colls_out[x-1], colls_out_nondups) return (cc, colls_out_for_display, colls_out) def strip_accents(x): """Strip accents in the input phrase X (assumed in UTF-8) by replacing accented characters with their unaccented cousins (e.g. é by e). Return such a stripped X.""" x = re_latex_lowercase_a.sub("a", x) x = re_latex_lowercase_ae.sub("ae", x) x = re_latex_lowercase_e.sub("e", x) x = re_latex_lowercase_i.sub("i", x) x = re_latex_lowercase_o.sub("o", x) x = re_latex_lowercase_u.sub("u", x) x = re_latex_lowercase_y.sub("y", x) x = re_latex_lowercase_c.sub("c", x) x = re_latex_lowercase_n.sub("n", x) x = re_latex_uppercase_a.sub("A", x) x = re_latex_uppercase_ae.sub("AE", x) x = re_latex_uppercase_e.sub("E", x) x = re_latex_uppercase_i.sub("I", x) x = re_latex_uppercase_o.sub("O", x) x = re_latex_uppercase_u.sub("U", x) x = re_latex_uppercase_y.sub("Y", x) x = re_latex_uppercase_c.sub("C", x) x = re_latex_uppercase_n.sub("N", x) # convert input into Unicode string: try: y = unicode(x, "utf-8") except: return x # something went wrong, probably the input wasn't UTF-8 # asciify Latin-1 lowercase characters: y = re_unicode_lowercase_a.sub("a", y) y = re_unicode_lowercase_ae.sub("ae", y) y = re_unicode_lowercase_e.sub("e", y) y = re_unicode_lowercase_i.sub("i", y) y = re_unicode_lowercase_o.sub("o", y) y = re_unicode_lowercase_u.sub("u", y) y = re_unicode_lowercase_y.sub("y", y) y = re_unicode_lowercase_c.sub("c", y) y = re_unicode_lowercase_n.sub("n", y) # asciify Latin-1 uppercase characters: y = re_unicode_uppercase_a.sub("A", y) y = re_unicode_uppercase_ae.sub("AE", y) y = 
re_unicode_uppercase_e.sub("E", y) y = re_unicode_uppercase_i.sub("I", y) y = re_unicode_uppercase_o.sub("O", y) y = re_unicode_uppercase_u.sub("U", y) y = re_unicode_uppercase_y.sub("Y", y) y = re_unicode_uppercase_c.sub("C", y) y = re_unicode_uppercase_n.sub("N", y) # return UTF-8 representation of the Unicode string: return y.encode("utf-8") def wash_index_term(term, max_char_length=50, lower_term=True): """ Return washed form of the index term TERM that would be suitable for storing into idxWORD* tables. I.e., lower the TERM if LOWER_TERM is True, and truncate it safely to MAX_CHAR_LENGTH UTF-8 characters (meaning, in principle, 4*MAX_CHAR_LENGTH bytes). The function works by an internal conversion of TERM, when needed, from its input Python UTF-8 binary string format into Python Unicode format, and then truncating it safely to the given number of UTF-8 characters, without possible mis-truncation in the middle of a multi-byte UTF-8 character that could otherwise happen if we would have been working with UTF-8 binary representation directly. Note that MAX_CHAR_LENGTH corresponds to the length of the term column in idxINDEX* tables. """ if lower_term: washed_term = unicode(term, 'utf-8').lower() else: washed_term = unicode(term, 'utf-8') if len(washed_term) <= max_char_length: # no need to truncate the term, because it will fit # nicely even if it uses four-byte UTF-8 characters return washed_term.encode('utf-8') else: # truncate the term in a safe position: return washed_term[:max_char_length].encode('utf-8') def lower_index_term(term): """ Return safely lowered index term TERM. This is done by converting to UTF-8 first, because standard Python lower() function is not UTF-8 safe. To be called by both the search engine and the indexer when appropriate (e.g. before stemming). In case of problems with UTF-8 compliance, this function raises UnicodeDecodeError, so the client code may want to catch it. 
""" return unicode(term, 'utf-8').lower().encode('utf-8') def wash_output_format(format): """Wash output format FORMAT. Currently only prevents input like 'of=9' for backwards-compatible format that prints certain fields only. (for this task, 'of=tm' is preferred)""" if str(format[0:3]).isdigit() and len(format) != 6: # asked to print MARC tags, but not enough digits, # so let's switch back to HTML brief default return 'hb' else: return format def wash_pattern(p): """Wash pattern passed by URL. Check for sanity of the wildcard by removing wildcards if they are appended to extremely short words (1-3 letters). TODO: instead of this approximative treatment, it will be much better to introduce a temporal limit, e.g. to kill a query if it does not finish in 10 seconds.""" # strip accents: # p = strip_accents(p) # FIXME: when available, strip accents all the time # add leading/trailing whitespace for the two following wildcard-sanity checking regexps: p = " " + p + " " # get rid of wildcards at the beginning of words: p = re_pattern_wildcards_at_beginning.sub("\\1", p) # replace spaces within quotes by __SPACE__ temporarily: p = re_pattern_single_quotes.sub(lambda x: "'"+string.replace(x.group(1), ' ', '__SPACE__')+"'", p) p = re_pattern_double_quotes.sub(lambda x: "\""+string.replace(x.group(1), ' ', '__SPACE__')+"\"", p) p = re_pattern_regexp_quotes.sub(lambda x: "/"+string.replace(x.group(1), ' ', '__SPACE__')+"/", p) # get rid of extremely short words (1-3 letters with wildcards): p = re_pattern_short_words.sub("\\1", p) # replace back __SPACE__ by spaces: p = re_pattern_space.sub(" ", p) # replace special terms: p = re_pattern_today.sub(time.strftime("%Y-%m-%d", time.localtime()), p) # remove unnecessary whitespace: p = string.strip(p) return p def wash_field(f): """Wash field passed by URL.""" # get rid of unnecessary whitespace: f = string.strip(f) # wash old-style CDS Invenio/ALEPH 'f' field argument, e.g. 
replaces 'wau' and 'au' by 'author' if CFG_WEBSEARCH_FIELDS_CONVERT.has_key(string.lower(f)): f = CFG_WEBSEARCH_FIELDS_CONVERT[string.lower(f)] return f def wash_dates(d1="", d1y=0, d1m=0, d1d=0, d2="", d2y=0, d2m=0, d2d=0): """ Take user-submitted date arguments D1 (full datetime string) or (D1Y, D1M, D1D) year, month, day tuple and D2 or (D2Y, D2M, D2D) and return (Y1-M1-D1 H1:M1:S1, Y2-M2-D2 H2:M2:S2) datetime strings in the YYYY-MM-DD HH:MM:SS format suitable for time restricted searching. Note that when both D1 and (D1Y, D1M, D1D) parameters are present, the precedence goes to D1. Ditto for D2*. Note that when (D1Y, D1M, D1D) are taken into account, some values may be missing and are completed e.g. to 01 or 12 according to whether it is the starting or the ending date. """ datetext1, datetext2 = "", "" # sanity checking: if d1 == "" and d1y == 0 and d1m == 0 and d1d == 0 and d2 == "" and d2y == 0 and d2m == 0 and d2d == 0: return ("", "") # nothing selected, so return empty values # wash first (starting) date: if d1: # full datetime string takes precedence: datetext1 = d1 else: # okay, first date passed as (year,month,day): if d1y: datetext1 += "%04d" % d1y else: datetext1 += "0000" if d1m: datetext1 += "-%02d" % d1m else: datetext1 += "-01" if d1d: datetext1 += "-%02d" % d1d else: datetext1 += "-01" datetext1 += " 00:00:00" # wash second (ending) date: if d2: # full datetime string takes precedence: datetext2 = d2 else: # okay, second date passed as (year,month,day): if d2y: datetext2 += "%04d" % d2y else: datetext2 += "9999" if d2m: datetext2 += "-%02d" % d2m else: datetext2 += "-12" if d2d: datetext2 += "-%02d" % d2d else: datetext2 += "-31" # NOTE: perhaps we should add max(datenumber) in # given month, but for our querying it's not # needed, 31 will always do datetext2 += " 00:00:00" # okay, return constructed YYYY-MM-DD HH:MM:SS datetexts: return (datetext1, datetext2) def get_colID(c): "Return collection ID for collection name C. Return None if no match found."
colID = None res = run_sql("SELECT id FROM collection WHERE name=%s", (c,), 1) if res: colID = res[0][0] return colID def get_coll_ancestors(coll): "Returns a list of ancestors for collection 'coll'." coll_ancestors = [] coll_ancestor = coll while 1: res = run_sql("""SELECT c.name FROM collection AS c LEFT JOIN collection_collection AS cc ON c.id=cc.id_dad LEFT JOIN collection AS ccc ON ccc.id=cc.id_son WHERE ccc.name=%s ORDER BY cc.id_dad ASC LIMIT 1""", (coll_ancestor,)) if res: coll_name = res[0][0] coll_ancestors.append(coll_name) coll_ancestor = coll_name else: break # ancestors found, return reversed list: coll_ancestors.reverse() return coll_ancestors def get_coll_sons(coll, type='r', public_only=1): """Return a list of sons (first-level descendants) of type 'type' for collection 'coll'. If public_only, then return only non-restricted son collections. """ coll_sons = [] query = "SELECT c.name FROM collection AS c "\ "LEFT JOIN collection_collection AS cc ON c.id=cc.id_son "\ "LEFT JOIN collection AS ccc ON ccc.id=cc.id_dad "\ "WHERE cc.type=%s AND ccc.name=%s" query += " ORDER BY cc.score DESC" res = run_sql(query, (type, coll)) for name in res: if not public_only or not collection_restricted_p(name[0]): coll_sons.append(name[0]) return coll_sons def get_coll_real_descendants(coll): """Return a list of all descendants of collection 'coll' that are defined by a 'dbquery'. IOW, we need to decompose compound collections like "A & B" into "A" and "B" provided that "A & B" has no associated database query defined. 
""" coll_sons = [] res = run_sql("""SELECT c.name,c.dbquery FROM collection AS c LEFT JOIN collection_collection AS cc ON c.id=cc.id_son LEFT JOIN collection AS ccc ON ccc.id=cc.id_dad WHERE ccc.name=%s ORDER BY cc.score DESC""", (coll,)) for name, dbquery in res: if dbquery: # this is 'real' collection, so return it: coll_sons.append(name) else: # this is 'composed' collection, so recurse: coll_sons.extend(get_coll_real_descendants(name)) return coll_sons def browse_pattern(req, colls, p, f, rg, ln=CFG_SITE_LANG): """Browse either biliographic phrases or words indexes, and display it.""" # load the right message language _ = gettext_set_language(ln) ## is p enclosed in quotes? (coming from exact search) if p.startswith('"') and p.endswith('"'): p = p[1:-1] p_orig = p ## okay, "real browse" follows: ## FIXME: the maths in the get_nearest_terms_in_bibxxx is just a test if not f and string.find(p, ":") > 0: # does 'p' contain ':'? f, p = string.split(p, ":", 1) ## do we search in words indexes? 
if not f: return browse_in_bibwords(req, p, f) index_id = get_index_id_from_field(f) if index_id != 0: coll = HitSet() for coll_name in colls: coll |= get_collection_reclist(coll_name) browsed_phrases_in_colls = get_nearest_terms_in_idxphrase_with_collection(p, index_id, rg/2, rg/2, coll) else: browsed_phrases = get_nearest_terms_in_bibxxx(p, f, (rg+1)/2+1, (rg-1)/2+1) while not browsed_phrases: # try again and again with shorter and shorter pattern: try: p = p[:-1] browsed_phrases = get_nearest_terms_in_bibxxx(p, f, (rg+1)/2+1, (rg-1)/2+1) except: # probably there are no hits at all: req.write(_("No values found.")) return ## try to check hits in these particular collection selection: browsed_phrases_in_colls = [] if 0: for phrase in browsed_phrases: phrase_hitset = HitSet() phrase_hitsets = search_pattern("", phrase, f, 'e') for coll in colls: phrase_hitset.union_update(phrase_hitsets[coll]) if len(phrase_hitset) > 0: # okay, this phrase has some hits in colls, so add it: browsed_phrases_in_colls.append([phrase, len(phrase_hitset)]) ## were there hits in collections? if browsed_phrases_in_colls == []: if browsed_phrases != []: #print_warning(req, """<p>No match close to <em>%s</em> found in given collections. 
#Please try different term.<p>Displaying matches in any collection...""" % p_orig) ## try to get nbhits for these phrases in any collection: for phrase in browsed_phrases: browsed_phrases_in_colls.append([phrase, get_nbhits_in_bibxxx(phrase, f)]) ## display results now: out = websearch_templates.tmpl_browse_pattern( f=f, fn=get_field_i18nname(get_field_name(f) or f, ln, False), ln=ln, browsed_phrases_in_colls=browsed_phrases_in_colls, colls=colls, rg=rg, ) req.write(out) return def browse_in_bibwords(req, p, f, ln=CFG_SITE_LANG): """Browse inside words indexes.""" if not p: return _ = gettext_set_language(ln) urlargd = {} urlargd.update(req.argd) urlargd['action'] = 'search' nearest_box = create_nearest_terms_box(urlargd, p, f, 'w', ln=ln, intro_text_p=0) req.write(websearch_templates.tmpl_search_in_bibwords( p = p, f = f, ln = ln, nearest_box = nearest_box )) return def search_pattern(req=None, p=None, f=None, m=None, ap=0, of="id", verbose=0, ln=CFG_SITE_LANG): """Search for complex pattern 'p' within field 'f' according to matching type 'm'. Return hitset of recIDs. The function uses a multi-stage searching algorithm in case of no exact match found. See the Search Internals document for detailed description. The 'ap' argument governs whether alternative patterns are to be used in case there is no direct hit for (p,f,m). For example, whether to replace non-alphanumeric characters by spaces if it would give some hits. See the Search Internals document for detailed description. (ap=0 forbids the alternative pattern usage, ap=1 permits it.) The 'of' argument governs whether to print or not some information to the user in case of no match found. (Usually it prints the information in case of HTML formats, otherwise it's silent). The 'verbose' argument controls the level of debugging information to be printed (0=least, 9=most). All the parameters are assumed to have been previously washed. This function is suitable as a mid-level API.
""" _ = gettext_set_language(ln) hitset_empty = HitSet() # sanity check: if not p: hitset_full = HitSet(trailing_bits=1) hitset_full.discard(0) # no pattern, so return all universe return hitset_full # search stage 1: break up arguments into basic search units: if verbose and of.startswith("h"): t1 = os.times()[4] basic_search_units = create_basic_search_units(req, p, f, m, of) if verbose and of.startswith("h"): t2 = os.times()[4] print_warning(req, "Search stage 1: basic search units are: %s" % cgi.escape(repr(basic_search_units))) print_warning(req, "Search stage 1: execution took %.2f seconds." % (t2 - t1)) # search stage 2: do search for each search unit and verify hit presence: if verbose and of.startswith("h"): t1 = os.times()[4] basic_search_units_hitsets = [] for idx_unit in xrange(len(basic_search_units)): bsu_o, bsu_p, bsu_f, bsu_m = basic_search_units[idx_unit] basic_search_unit_hitset = search_unit(bsu_p, bsu_f, bsu_m) if verbose >= 9 and of.startswith("h"): print_warning(req, "Search stage 1: pattern %s gave hitlist %s" % (cgi.escape(bsu_p), basic_search_unit_hitset)) if len(basic_search_unit_hitset) > 0 or \ ap==0 or \ bsu_o=="|" or \ ((idx_unit+1)<len(basic_search_units) and basic_search_units[idx_unit+1][0]=="|"): # stage 2-1: this basic search unit is retained, since # either the hitset is non-empty, or the approximate # pattern treatment is switched off, or the search unit # was joined by an OR operator to preceding/following # units so we do not require that it exists basic_search_units_hitsets.append(basic_search_unit_hitset) else: # stage 2-2: no hits found for this search unit, try to replace non-alphanumeric chars inside pattern: if re.search(r'[^a-zA-Z0-9\s\:]', bsu_p): if bsu_p.startswith('"') and bsu_p.endswith('"'): # is it ACC query? 
bsu_pn = re.sub(r'[^a-zA-Z0-9\s\:]+', "*", bsu_p) else: # it is WRD query bsu_pn = re.sub(r'[^a-zA-Z0-9\s\:]+', " ", bsu_p) if verbose and of.startswith('h') and req: print_warning(req, "Trying (%s,%s,%s)" % (cgi.escape(bsu_pn), cgi.escape(bsu_f), cgi.escape(bsu_m))) basic_search_unit_hitset = search_pattern(req=None, p=bsu_pn, f=bsu_f, m=bsu_m, of="id", ln=ln) if len(basic_search_unit_hitset) > 0: # we retain the new unit instead if of.startswith('h'): print_warning(req, _("No exact match found for %(x_query1)s, using %(x_query2)s instead...") % \ {'x_query1': "<em>" + cgi.escape(bsu_p) + "</em>", 'x_query2': "<em>" + cgi.escape(bsu_pn) + "</em>"}) basic_search_units[idx_unit][1] = bsu_pn basic_search_units_hitsets.append(basic_search_unit_hitset) else: # stage 2-3: no hits found either, propose nearest indexed terms: if of.startswith('h'): if req: if bsu_f == "recid": print_warning(req, "Requested record does not seem to exist.") else: print_warning(req, create_nearest_terms_box(req.argd, bsu_p, bsu_f, bsu_m, ln=ln)) return hitset_empty else: # stage 2-3: no hits found either, propose nearest indexed terms: if of.startswith('h'): if req: if bsu_f == "recid": print_warning(req, "Requested record does not seem to exist.") else: print_warning(req, create_nearest_terms_box(req.argd, bsu_p, bsu_f, bsu_m, ln=ln)) return hitset_empty if verbose and of.startswith("h"): t2 = os.times()[4] for idx_unit in range(0, len(basic_search_units)): print_warning(req, "Search stage 2: basic search unit %s gave %d hits." % (basic_search_units[idx_unit][1:], len(basic_search_units_hitsets[idx_unit]))) print_warning(req, "Search stage 2: execution took %.2f seconds." 
% (t2 - t1)) # search stage 3: apply boolean query for each search unit: if verbose and of.startswith("h"): t1 = os.times()[4] # let the initial set be the complete universe: hitset_in_any_collection = HitSet(trailing_bits=1) hitset_in_any_collection.discard(0) for idx_unit in xrange(len(basic_search_units)): this_unit_operation = basic_search_units[idx_unit][0] this_unit_hitset = basic_search_units_hitsets[idx_unit] if this_unit_operation == '+': hitset_in_any_collection.intersection_update(this_unit_hitset) elif this_unit_operation == '-': hitset_in_any_collection.difference_update(this_unit_hitset) elif this_unit_operation == '|': hitset_in_any_collection.union_update(this_unit_hitset) else: if of.startswith("h"): print_warning(req, "Invalid set operation %s." % cgi.escape(this_unit_operation), "Error") if len(hitset_in_any_collection) == 0: # no hits found, propose alternative boolean query: if of.startswith('h'): nearestterms = [] for idx_unit in range(0, len(basic_search_units)): bsu_o, bsu_p, bsu_f, bsu_m = basic_search_units[idx_unit] if bsu_p.startswith("%") and bsu_p.endswith("%"): bsu_p = "'" + bsu_p[1:-1] + "'" bsu_nbhits = len(basic_search_units_hitsets[idx_unit]) # create a similar query, but with the basic search unit only argd = {} argd.update(req.argd) argd['p'] = bsu_p argd['f'] = bsu_f nearestterms.append((bsu_p, bsu_nbhits, argd)) text = websearch_templates.tmpl_search_no_boolean_hits( ln=ln, nearestterms=nearestterms) print_warning(req, text) if verbose and of.startswith("h"): t2 = os.times()[4] print_warning(req, "Search stage 3: boolean query gave %d hits." % len(hitset_in_any_collection)) print_warning(req, "Search stage 3: execution took %.2f seconds." % (t2 - t1)) return hitset_in_any_collection def search_pattern_parenthesised(req=None, p=None, f=None, m=None, ap=0, of="id", verbose=0, ln=CFG_SITE_LANG): """Search for complex pattern 'p' containing parenthesis within field 'f' according to matching type 'm'. Return hitset of recIDs. 
For more details on the parameters see 'search_pattern' """ _ = gettext_set_language(ln) # if the pattern uses SPIRES search syntax, convert it to Invenio syntax spires_syntax_converter = SpiresToInvenioSyntaxConverter() p = spires_syntax_converter.convert_query(p) # sanity check: do not call parenthesised parser for search terms # like U(1): if not re_pattern_parens.search(p): return search_pattern(req, p, f, m, ap, of, verbose, ln) # Try searching with parentheses try: parser = SearchQueryParenthesisedParser() # get a hitset with all recids result_hitset = HitSet(trailing_bits=1) # parse the query. The result is a list of [op1, expr1, op2, expr2, ..., opN, exprN] parsing_result = parser.parse_query(p) if verbose and of.startswith("h"): print_warning(req, "Search stage 1: search_pattern_parenthesised() returned %s." % repr(parsing_result)) # go through every pattern # calculate hitset for it # combine pattern's hitset with the result using the corresponding operator for index in xrange(0, len(parsing_result)-1, 2 ): current_operator = parsing_result[index] current_pattern = parsing_result[index+1] # obtain a hitset for the current pattern current_hitset = search_pattern(req, current_pattern, f, m, ap, of, verbose, ln) # combine the current hitset with resulting hitset using the current operator if current_operator == '+': result_hitset = result_hitset & current_hitset elif current_operator == '-': result_hitset = result_hitset - current_hitset elif current_operator == '|': result_hitset = result_hitset | current_hitset else: assert False, "Unknown operator in search_pattern_parenthesised()" return result_hitset # If searching with parentheses fails, perform search ignoring parentheses except InvenioWebSearchQueryParserException: print_warning(req, _("Nested or mismatched parentheses detected. Ignoring all parentheses in the query...")) # remove the parentheses in the query.
Current implementation removes all the parentheses, # but it could be improved to remove only those that are not inside quotes p = p.replace('(', ' ') p = p.replace(')', ' ') return search_pattern(req, p, f, m, ap, of, verbose, ln) def search_unit(p, f=None, m=None): """Search for basic search unit defined by pattern 'p' and field 'f' and matching type 'm'. Return hitset of recIDs. All the parameters are assumed to have been previously washed. 'p' is assumed to be already a ``basic search unit'' so that it is searched as such and is not broken up in any way. Only wildcard and span queries are being detected inside 'p'. This function is suitable as a low-level API. """ ## create empty output results set: set = HitSet() if not p: # sanity checking return set if m == 'a' or m == 'r': # we are doing either phrase search or regexp search index_id = get_index_id_from_field(f) if index_id != 0: set = search_unit_in_idxphrases(p, f, m) else: set = search_unit_in_bibxxx(p, f, m) elif p.startswith("cited:"): # we are doing search by the citation count set = search_unit_by_times_cited(p[6:]) else: # we are doing bibwords search by default set = search_unit_in_bibwords(p, f) return set def search_unit_in_bibwords(word, f, decompress=zlib.decompress): """Searches for 'word' inside bibwordsX table for field 'f' and returns hitset of recIDs.""" set = HitSet() # will hold output result set set_used = 0 # not-yet-used flag, to be able to circumvent set operations # deduce into which bibwordsX table we will search: stemming_language = get_index_stemming_language(get_index_id_from_field("anyfield")) bibwordsX = "idxWORD%02dF" % get_index_id_from_field("anyfield") if f: index_id = get_index_id_from_field(f) if index_id: bibwordsX = "idxWORD%02dF" % index_id stemming_language = get_index_stemming_language(index_id) else: return HitSet() # word index f does not exist # wash 'word' argument and run query: word = string.replace(word, '*', '%') # we now use '*' as the truncation character
words = string.split(word, "->", 1) # check for span query if len(words) == 2: word0 = re_word.sub('', words[0]) word1 = re_word.sub('', words[1]) if stemming_language: word0 = lower_index_term(word0) word1 = lower_index_term(word1) word0 = stem(word0, stemming_language) word1 = stem(word1, stemming_language) res = run_sql("SELECT term,hitlist FROM %s WHERE term BETWEEN %%s AND %%s" % bibwordsX, (wash_index_term(word0), wash_index_term(word1))) else: if f == 'journal': pass # FIXME: quick hack for the journal index else: word = re_word.sub('', word) if stemming_language: word = lower_index_term(word) word = stem(word, stemming_language) if string.find(word, '%') >= 0: # do we have wildcard in the word? if f == 'journal': # FIXME: quick hack for the journal index # FIXME: we can run a sanity check here for all indexes res = () else: res = run_sql("SELECT term,hitlist FROM %s WHERE term LIKE %%s" % bibwordsX, (wash_index_term(word),)) else: res = run_sql("SELECT term,hitlist FROM %s WHERE term=%%s" % bibwordsX, (wash_index_term(word),)) # fill the result set: for word, hitlist in res: hitset_bibwrd = HitSet(hitlist) # add the results: if set_used: set.union_update(hitset_bibwrd) else: set = hitset_bibwrd set_used = 1 # okay, return result set: return set def search_unit_in_idxphrases(p, f, type): """Searches for phrase 'p' inside idxPHRASE*F table for field 'f' and returns hitset of recIDs found. The search type is defined by 'type' (e.g. 
equals to 'r' for a regexp search).""" set = HitSet() # will hold output result set set_used = 0 # not-yet-used flag, to be able to circumvent set operations # deduce in which idxPHRASE table we will search: idxphraseX = "idxPHRASE%02dF" % get_index_id_from_field("anyfield") if f: index_id = get_index_id_from_field(f) if index_id: idxphraseX = "idxPHRASE%02dF" % index_id else: return HitSet() # phrase index f does not exist # detect query type (exact phrase, partial phrase, regexp): if type == 'r': query_addons = "REGEXP %s" query_params = (p,) else: p = string.replace(p, '*', '%') # we now use '*' as the truncation character ps = string.split(p, "->", 1) # check for span query: if len(ps) == 2: query_addons = "BETWEEN %s AND %s" query_params = (ps[0], ps[1]) else: if string.find(p, '%') > -1: query_addons = "LIKE %s" query_params = (ps[0],) else: query_addons = "= %s" query_params = (ps[0],) # perform search: res = run_sql("SELECT term,hitlist FROM %s WHERE term %s" % (idxphraseX, query_addons), query_params) # fill the result set: for word, hitlist in res: hitset_bibphrase = HitSet(hitlist) # add the results: if set_used: set.union_update(hitset_bibphrase) else: set = hitset_bibphrase set_used = 1 # okay, return result set: return set def search_unit_in_bibxxx(p, f, type): """Searches for pattern 'p' inside bibxxx tables for field 'f' and returns hitset of recIDs found. The search type is defined by 'type' (e.g. 
equals to 'r' for a regexp search).""" # FIXME: quick hack for the journal index if f == 'journal': return search_unit_in_bibwords(p, f) p_orig = p # saving for eventual future 'no match' reporting query_addons = "" # will hold additional SQL code for the query query_params = () # will hold parameters for the query (their number may vary depending on TYPE argument) # wash arguments: f = string.replace(f, '*', '%') # replace truncation char '*' in field definition if type == 'r': query_addons = "REGEXP %s" query_params = (p,) else: p = string.replace(p, '*', '%') # we now use '*' as the truncation character ps = string.split(p, "->", 1) # check for span query: if len(ps) == 2: query_addons = "BETWEEN %s AND %s" query_params = (ps[0], ps[1]) else: if string.find(p, '%') > -1: query_addons = "LIKE %s" query_params = (ps[0],) else: query_addons = "= %s" query_params = (ps[0],) # construct 'tl' which defines the tag list (MARC tags) to search in: tl = [] if str(f[0]).isdigit() and str(f[1]).isdigit(): tl.append(f) # 'f' seems to be okay as it starts by two digits else: # convert old ALEPH tag names, if appropriate: (TODO: get rid of this before entering this function) if CFG_WEBSEARCH_FIELDS_CONVERT.has_key(string.lower(f)): f = CFG_WEBSEARCH_FIELDS_CONVERT[string.lower(f)] # deduce desired MARC tags on the basis of chosen 'f' tl = get_field_tags(f) if not tl: # f index does not exist, nevermind pass # okay, start search: l = [] # will hold list of recID that matched for t in tl: # deduce into which bibxxx table we will search: digit1, digit2 = int(t[0]), int(t[1]) bx = "bib%d%dx" % (digit1, digit2) bibx = "bibrec_bib%d%dx" % (digit1, digit2) # construct and run query: if t == "001": res = run_sql("SELECT id FROM bibrec WHERE id %s" % query_addons, query_params) else: query = "SELECT bibx.id_bibrec FROM %s AS bx LEFT JOIN %s AS bibx ON bx.id=bibx.id_bibxxx WHERE bx.value %s" % \ (bx, bibx, query_addons) if len(t) != 6 or t[-1:]=='%': # wildcard query, or only the 
beginning of field 't'
                # is defined, so add wildcard character:
                query += " AND bx.tag LIKE %s"
                res = run_sql(query, query_params + (t + '%',))
            else:
                # exact query for 't':
                query += " AND bx.tag=%s"
                res = run_sql(query, query_params + (t,))
        # fill the result set:
        for id_bibrec in res:
            if id_bibrec[0]:
                l.append(id_bibrec[0])
    # okay, return result set:
    return HitSet(l)

def search_unit_in_bibrec(datetext1, datetext2, type='c'):
    """
    Return hitset of recIDs found that were either created or modified
    (according to 'type' arg being 'c' or 'm') from datetext1 until
    datetext2, inclusive.  Does not pay attention to pattern, collection,
    anything.  Useful to intersect later on with the 'real' query.
    """
    set = HitSet()
    if type.startswith("m"):
        type = "modification_date"
    else:
        type = "creation_date" # by default we are searching for creation dates
    res = run_sql("SELECT id FROM bibrec WHERE %s>=%%s AND %s<=%%s" % (type, type),
                  (datetext1, datetext2))
    for row in res:
        set += row[0]
    return set

def search_unit_by_times_cited(p):
    """
    Return hitset of recIDs found that are cited P times.
    Usually P looks like '10->23'.
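The same pattern-washing logic ('*' mapped to SQL '%', '->' treated as a span) recurs in several search units above. A minimal standalone sketch of that behaviour (the helper name is illustrative, not part of the module):

```python
def wash_pattern(p):
    """Return (sql_condition, params) for pattern p, mirroring the
    truncation/span handling used by the search units above."""
    p = p.replace('*', '%')       # '*' is the user-facing truncation character
    ps = p.split('->', 1)         # check for a span query like '2001->2003'
    if len(ps) == 2:
        return "BETWEEN %s AND %s", (ps[0], ps[1])
    if '%' in p:
        return "LIKE %s", (p,)
    return "= %s", (p,)
```

The returned condition is meant to be interpolated after `WHERE term`, with the params passed separately to `run_sql` so values stay escaped.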
""" numstr = '"'+p+'"' #this is sort of stupid but since we may need to #get the records that do _not_ have cites, we have to #know the ids of all records, too #but this is needed only if bsu_p is 0 or 0 or 0->0 allrecs = [] if p == 0 or p == "0" or \ p.startswith("0->") or p.endswith("->0"): allrecs = HitSet(run_sql_cached("SELECT id FROM bibrec", affected_tables=['bibrec'])) return get_records_with_num_cites(numstr, allrecs) def intersect_results_with_collrecs(req, hitset_in_any_collection, colls, ap=0, of="hb", verbose=0, ln=CFG_SITE_LANG): """Return dict of hitsets given by intersection of hitset with the collection universes.""" _ = gettext_set_language(ln) # search stage 4: intersect with the collection universe: if verbose and of.startswith("h"): t1 = os.times()[4] results = {} results_nbhits = 0 for coll in colls: results[coll] = hitset_in_any_collection & get_collection_reclist(coll) results_nbhits += len(results[coll]) if results_nbhits == 0: # no hits found, try to search in Home: results_in_Home = hitset_in_any_collection & get_collection_reclist(CFG_SITE_NAME) if len(results_in_Home) > 0: # some hits found in Home, so propose this search: if of.startswith("h"): url = websearch_templates.build_search_url(req.argd, cc=CFG_SITE_NAME, c=[]) print_warning(req, _("No match found in collection %(x_collection)s. Other public collections gave %(x_url_open)s%(x_nb_hits)d hits%(x_url_close)s.") %\ {'x_collection': '<em>' + string.join([get_coll_i18nname(coll, ln, False) for coll in colls], ', ') + '</em>', 'x_url_open': '<a class="nearestterms" href="%s">' % (url), 'x_nb_hits': len(results_in_Home), 'x_url_close': '</a>'}) results = {} else: # no hits found in Home, recommend different search terms: if of.startswith("h"): print_warning(req, _("No public collection matched your query. 
" "If you were looking for a non-public document, please choose " "the desired restricted collection first.")) results = {} if verbose and of.startswith("h"): t2 = os.times()[4] print_warning(req, "Search stage 4: intersecting with collection universe gave %d hits." % results_nbhits) print_warning(req, "Search stage 4: execution took %.2f seconds." % (t2 - t1)) return results def intersect_results_with_hitset(req, results, hitset, ap=0, aptext="", of="hb"): """Return intersection of search 'results' (a dict of hitsets with collection as key) with the 'hitset', i.e. apply 'hitset' intersection to each collection within search 'results'. If the final 'results' set is to be empty, and 'ap' (approximate pattern) is true, and then print the `warningtext' and return the original 'results' set unchanged. If 'ap' is false, then return empty results set. """ if ap: results_ap = copy.deepcopy(results) else: results_ap = {} # will return empty dict in case of no hits found nb_total = 0 for coll in results.keys(): results[coll].intersection_update(hitset) nb_total += len(results[coll]) if nb_total == 0: if of.startswith("h"): print_warning(req, aptext) results = results_ap return results def create_similarly_named_authors_link_box(author_name, ln=CFG_SITE_LANG): """Return a box similar to ``Not satisfied...'' one by proposing author searches for similar names. Namely, take AUTHOR_NAME and the first initial of the firstame (after comma) and look into author index whether authors with e.g. middle names exist. Useful mainly for CERN Library that sometimes contains name forms like Ellis-N, Ellis-Nick, Ellis-Nicolas all denoting the same person. The box isn't proposed if no similarly named authors are found to exist. 
""" # return nothing if not configured: if CFG_WEBSEARCH_CREATE_SIMILARLY_NAMED_AUTHORS_LINK_BOX == 0: return "" # return empty box if there is no initial: if re.match(r'[^ ,]+, [^ ]', author_name) is None: return "" # firstly find name comma initial: author_name_to_search = re.sub(r'^([^ ,]+, +[^ ,]).*$', '\\1', author_name) # secondly search for similar name forms: similar_author_names = {} for name in author_name_to_search, strip_accents(author_name_to_search): for tag in get_field_tags("author"): # deduce into which bibxxx table we will search: digit1, digit2 = int(tag[0]), int(tag[1]) bx = "bib%d%dx" % (digit1, digit2) bibx = "bibrec_bib%d%dx" % (digit1, digit2) if len(tag) != 6 or tag[-1:]=='%': # only the beginning of field 't' is defined, so add wildcard character: res = run_sql("""SELECT bx.value FROM %s AS bx WHERE bx.value LIKE %%s AND bx.tag LIKE %%s""" % bx, (name + "%", tag + "%")) else: res = run_sql("""SELECT bx.value FROM %s AS bx WHERE bx.value LIKE %%s AND bx.tag=%%s""" % bx, (name + "%", tag)) for row in res: similar_author_names[row[0]] = 1 # remove the original name and sort the list: try: del similar_author_names[author_name] except KeyError: pass # thirdly print the box: out = "" if similar_author_names: out_authors = similar_author_names.keys() out_authors.sort() tmp_authors = [] for out_author in out_authors: nbhits = get_nbhits_in_bibxxx(out_author, "author") if nbhits: tmp_authors.append((out_author, nbhits)) out += websearch_templates.tmpl_similar_author_names( authors=tmp_authors, ln=ln) return out def create_nearest_terms_box(urlargd, p, f, t='w', n=5, ln=CFG_SITE_LANG, intro_text_p=True): """Return text box containing list of 'n' nearest terms above/below 'p' for the field 'f' for matching type 't' (words/phrases) in language 'ln'. Propose new searches according to `urlargs' with the new words. If `intro_text_p' is true, then display the introductory message, otherwise print only the nearest terms in the box content. 
""" # load the right message language _ = gettext_set_language(ln) out = "" nearest_terms = [] if not p: # sanity check p = "." index_id = get_index_id_from_field(f) # look for nearest terms: if t == 'w': nearest_terms = get_nearest_terms_in_bibwords(p, f, n, n) if not nearest_terms: return _("No word index is available for %s.") % \ ('<em>' + cgi.escape(get_field_i18nname(get_field_name(f) or f, ln, False)) + '</em>') else: nearest_terms = [] if index_id: nearest_terms = get_nearest_terms_in_idxphrase(p, index_id, n, n) if not nearest_terms: nearest_terms = get_nearest_terms_in_bibxxx(p, f, n, n) if not nearest_terms: return _("No phrase index is available for %s.") % \ ('<em>' + cgi.escape(get_field_i18nname(get_field_name(f) or f, ln, False)) + '</em>') terminfo = [] for term in nearest_terms: if t == 'w': hits = get_nbhits_in_bibwords(term, f) else: if index_id: hits = get_nbhits_in_idxphrases(term, f) else: hits = get_nbhits_in_bibxxx(term, f) argd = {} argd.update(urlargd) # check which fields contained the requested parameter, and replace it. for (px, fx) in ('p', 'f'), ('p1', 'f1'), ('p2', 'f2'), ('p3', 'f3'): if px in argd: argd_px = argd[px] if t == 'w': # p was stripped of accents, to do the same: argd_px = strip_accents(argd_px) if f == argd[fx] or f == "anyfield" or f == "": if string.find(argd_px, p) > -1: argd[px] = string.replace(argd_px, p, term) break else: if string.find(argd_px, f+':'+p) > -1: argd[px] = string.replace(argd_px, f+':'+p, f+':'+term) break elif string.find(argd_px, f+':"'+p+'"') > -1: argd[px] = string.replace(argd_px, f+':"'+p+'"', f+':"'+term+'"') break terminfo.append((term, hits, argd)) intro = "" if intro_text_p: # add full leading introductory text if f: intro = _("Search term %(x_term)s inside index %(x_index)s did not match any record. 
Nearest terms in any collection are:") % \ {'x_term': "<em>" + cgi.escape(p.startswith("%") and p.endswith("%") and p[1:-1] or p) + "</em>", 'x_index': "<em>" + cgi.escape(get_field_i18nname(get_field_name(f) or f, ln, False)) + "</em>"} else: intro = _("Search term %s did not match any record. Nearest terms in any collection are:") % \ ("<em>" + cgi.escape(p.startswith("%") and p.endswith("%") and p[1:-1] or p) + "</em>") return websearch_templates.tmpl_nearest_term_box(p=p, ln=ln, f=f, terminfo=terminfo, intro=intro) def get_nearest_terms_in_bibwords(p, f, n_below, n_above): """Return list of +n -n nearest terms to word `p' in index for field `f'.""" nearest_words = [] # will hold the (sorted) list of nearest words to return # deduce into which bibwordsX table we will search: bibwordsX = "idxWORD%02dF" % get_index_id_from_field("anyfield") if f: index_id = get_index_id_from_field(f) if index_id: bibwordsX = "idxWORD%02dF" % index_id else: return nearest_words # firstly try to get `n' closest words above `p': res = run_sql("SELECT term FROM %s WHERE term<%%s ORDER BY term DESC LIMIT %%s" % bibwordsX, (p, n_above)) for row in res: nearest_words.append(row[0]) nearest_words.reverse() # secondly insert given word `p': nearest_words.append(p) # finally try to get `n' closest words below `p': res = run_sql("SELECT term FROM %s WHERE term>%%s ORDER BY term ASC LIMIT %%s" % bibwordsX, (p, n_below)) for row in res: nearest_words.append(row[0]) return nearest_words def get_nearest_terms_in_idxphrase(p, index_id, n_below, n_above): """Browse (-n_above, +n_below) closest bibliographic phrases for the given pattern p in the given field idxPHRASE table, regardless of collection. Return list of [phrase1, phrase2, ... 
, phrase_n].""" idxphraseX = "idxPHRASE%02dF" % index_id res_above = run_sql("SELECT term FROM %s WHERE term<%%s ORDER BY term DESC LIMIT %%s" % idxphraseX, (p, n_above)) res_above = map(lambda x: x[0], res_above) res_above.reverse() res_below = run_sql("SELECT term FROM %s WHERE term>=%%s ORDER BY term ASC LIMIT %%s" % idxphraseX, (p, n_below)) res_below = map(lambda x: x[0], res_below) return res_above + res_below def get_nearest_terms_in_idxphrase_with_collection(p, index_id, n_below, n_above, collection): """Browse (-n_above, +n_below) closest bibliographic phrases for the given pattern p in the given field idxPHRASE table, considering the collection (HitSet). Return list of [(phrase1, hitset), (phrase2, hitset), ... , (phrase_n, hitset)].""" idxphraseX = "idxPHRASE%02dF" % index_id res_above = run_sql("SELECT term,hitlist FROM %s WHERE term<%%s ORDER BY term DESC LIMIT %%s" % idxphraseX, (p, n_above * 3)) res_above = [(term, HitSet(hitlist) & collection) for term, hitlist in res_above] res_above = [(term, len(hitlist)) for term, hitlist in res_above if hitlist] res_below = run_sql("SELECT term,hitlist FROM %s WHERE term>=%%s ORDER BY term ASC LIMIT %%s" % idxphraseX, (p, n_below * 3)) res_below = [(term, HitSet(hitlist) & collection) for term, hitlist in res_below] res_below = [(term, len(hitlist)) for term, hitlist in res_below if hitlist] res_above.reverse() return res_above[-n_above:] + res_below[:n_below] def get_nearest_terms_in_bibxxx(p, f, n_below, n_above): """Browse (-n_above, +n_below) closest bibliographic phrases for the given pattern p in the given field f, regardless of collection. Return list of [phrase1, phrase2, ... , phrase_n].""" ## determine browse field: if not f and string.find(p, ":") > 0: # does 'p' contain ':'? 
        f, p = string.split(p, ":", 1)

    # FIXME: quick hack for the journal index
    if f == 'journal':
        return get_nearest_terms_in_bibwords(p, f, n_below, n_above)

    ## We are going to take max(n_below, n_above) as the number of
    ## values to fetch from bibXXx.  This is needed to work around
    ## MySQL UTF-8 sorting troubles in 4.0.x.  Proper solution is to
    ## use MySQL 4.1.x or our own idxPHRASE in the future.

    index_id = get_index_id_from_field(f)
    if index_id:
        return get_nearest_terms_in_idxphrase(p, index_id, n_below, n_above)

    n_fetch = 2*max(n_below, n_above)
    ## construct 'tl' which defines the tag list (MARC tags) to search in:
    tl = []
    if str(f[0]).isdigit() and str(f[1]).isdigit():
        tl.append(f) # 'f' seems to be okay as it starts by two digits
    else:
        # deduce desired MARC tags on the basis of chosen 'f'
        tl = get_field_tags(f)
    ## start browsing to fetch list of hits:
    browsed_phrases = {} # will hold {phrase1: 1, phrase2: 1, ..., phraseN: 1} dict of browsed phrases (to make them unique)
    # always add self to the results set:
    browsed_phrases[p.startswith("%") and p.endswith("%") and p[1:-1] or p] = 1
    for t in tl:
        # deduce into which bibxxx table we will search:
        digit1, digit2 = int(t[0]), int(t[1])
        bx = "bib%d%dx" % (digit1, digit2)
        bibx = "bibrec_bib%d%dx" % (digit1, digit2)
        # firstly try to get `n' closest phrases above `p':
        if len(t) != 6 or t[-1:] == '%':
            # only the beginning of field 't' is defined, so add wildcard character:
            res = run_sql("""SELECT bx.value FROM %s AS bx
                              WHERE bx.value<%%s AND bx.tag LIKE %%s
                              ORDER BY bx.value DESC LIMIT %%s""" % bx,
                          (p, t + "%", n_fetch))
        else:
            res = run_sql("""SELECT bx.value FROM %s AS bx
                              WHERE bx.value<%%s AND bx.tag=%%s
                              ORDER BY bx.value DESC LIMIT %%s""" % bx,
                          (p, t, n_fetch))
        for row in res:
            browsed_phrases[row[0]] = 1
        # secondly try to get `n' closest phrases equal to or below `p':
        if len(t) != 6 or t[-1:] == '%':
            # only the beginning of field 't' is defined, so add wildcard character:
            res = run_sql("""SELECT bx.value FROM %s AS bx WHERE
bx.value>=%%s AND bx.tag LIKE %%s ORDER BY bx.value ASC LIMIT %%s""" % bx, (p, t + "%", n_fetch)) else: res = run_sql("""SELECT bx.value FROM %s AS bx WHERE bx.value>=%%s AND bx.tag=%%s ORDER BY bx.value ASC LIMIT %%s""" % bx, (p, t, n_fetch)) for row in res: browsed_phrases[row[0]] = 1 # select first n words only: (this is needed as we were searching # in many different tables and so aren't sure we have more than n # words right; this of course won't be needed when we shall have # one ACC table only for given field): phrases_out = browsed_phrases.keys() phrases_out.sort(lambda x, y: cmp(string.lower(strip_accents(x)), string.lower(strip_accents(y)))) # find position of self: try: idx_p = phrases_out.index(p) except: idx_p = len(phrases_out)/2 # return n_above and n_below: return phrases_out[max(0, idx_p-n_above):idx_p+n_below] def get_nbhits_in_bibwords(word, f): """Return number of hits for word 'word' inside words index for field 'f'.""" out = 0 # deduce into which bibwordsX table we will search: bibwordsX = "idxWORD%02dF" % get_index_id_from_field("anyfield") if f: index_id = get_index_id_from_field(f) if index_id: bibwordsX = "idxWORD%02dF" % index_id else: return 0 if word: res = run_sql("SELECT hitlist FROM %s WHERE term=%%s" % bibwordsX, (word,)) for hitlist in res: out += len(HitSet(hitlist[0])) return out def get_nbhits_in_idxphrases(word, f): """Return number of hits for word 'word' inside phrase index for field 'f'.""" out = 0 # deduce into which bibwordsX table we will search: idxphraseX = "idxPHRASE%02dF" % get_index_id_from_field("anyfield") if f: index_id = get_index_id_from_field(f) if index_id: idxphraseX = "idxPHRASE%02dF" % index_id else: return 0 if word: res = run_sql("SELECT hitlist FROM %s WHERE term=%%s" % idxphraseX, (word,)) for hitlist in res: out += len(HitSet(hitlist[0])) return out def get_nbhits_in_bibxxx(p, f): """Return number of hits for word 'word' inside words index for field 'f'.""" ## determine browse field: if not f and 
string.find(p, ":") > 0: # does 'p' contain ':'? f, p = string.split(p, ":", 1) # FIXME: quick hack for the journal index if f == 'journal': return get_nbhits_in_bibwords(p, f) ## construct 'tl' which defines the tag list (MARC tags) to search in: tl = [] if str(f[0]).isdigit() and str(f[1]).isdigit(): tl.append(f) # 'f' seems to be okay as it starts by two digits else: # deduce desired MARC tags on the basis of chosen 'f' tl = get_field_tags(f) # start searching: recIDs = {} # will hold dict of {recID1: 1, recID2: 1, ..., } (unique recIDs, therefore) for t in tl: # deduce into which bibxxx table we will search: digit1, digit2 = int(t[0]), int(t[1]) bx = "bib%d%dx" % (digit1, digit2) bibx = "bibrec_bib%d%dx" % (digit1, digit2) if len(t) != 6 or t[-1:]=='%': # only the beginning of field 't' is defined, so add wildcard character: res = run_sql("""SELECT bibx.id_bibrec FROM %s AS bibx, %s AS bx WHERE bx.value=%%s AND bx.tag LIKE %%s AND bibx.id_bibxxx=bx.id""" % (bibx, bx), (p, t + "%")) else: res = run_sql("""SELECT bibx.id_bibrec FROM %s AS bibx, %s AS bx WHERE bx.value=%%s AND bx.tag=%%s AND bibx.id_bibxxx=bx.id""" % (bibx, bx), (p, t)) for row in res: recIDs[row[0]] = 1 return len(recIDs) def get_mysql_recid_from_aleph_sysno(sysno): """Returns DB's recID for ALEPH sysno passed in the argument (e.g. "002379334CER"). Returns None in case of failure.""" out = None res = run_sql("""SELECT bb.id_bibrec FROM bibrec_bib97x AS bb, bib97x AS b WHERE b.value=%s AND b.tag='970__a' AND bb.id_bibxxx=b.id""", (sysno,)) if res: out = res[0][0] return out def guess_primary_collection_of_a_record(recID): """Return primary collection name a record recid belongs to, by testing 980 identifier. May lead to bad guesses when a collection is defined dynamically via dbquery. 
In that case, return 'CFG_SITE_NAME'.""" out = CFG_SITE_NAME dbcollids = get_fieldvalues(recID, "980__a") if dbcollids: dbquery = "collection:" + dbcollids[0] res = run_sql("SELECT name FROM collection WHERE dbquery=%s", (dbquery,)) if res: out = res[0][0] return out _re_collection_url = re.compile('/collection/(.+)') def guess_collection_of_a_record(recID, referer=None): """Return collection name a record recid belongs to, by first testing the referer URL if provided and otherwise returning the primary collection.""" if referer: dummy, hostname, path, dummy, query, dummy = urlparse.urlparse(referer) g = _re_collection_url.match(path) if g: name = urllib.unquote_plus(g.group(1)) if recID in get_collection_reclist(name): return name elif path.startswith('/search'): query = cgi.parse_qs(query) for name in query.get('cc', []) + query.get('c', []): if recID in get_collection_reclist(name): return name return guess_primary_collection_of_a_record(recID) def get_all_collections_of_a_record(recID): """Return all the collection names a record belongs to. Note this function is O(n_collections).""" ret = [] for name in collection_reclist_cache.cache.keys(): if recID in get_collection_reclist(name): ret.append(name) return ret def get_tag_name(tag_value, prolog="", epilog=""): """Return tag name from the known tag value, by looking up the 'tag' table. Return empty string in case of failure. Example: input='100__%', output=first author'.""" out = "" res = run_sql_cached("SELECT name FROM tag WHERE value=%s", (tag_value,), affected_tables=['tag',]) if res: out = prolog + res[0][0] + epilog return out def get_fieldcodes(): """Returns a list of field codes that may have been passed as 'search options' in URL. Example: output=['subject','division'].""" out = [] res = run_sql_cached("SELECT DISTINCT(code) FROM field", affected_tables=['field',]) for row in res: out.append(row[0]) return out def get_field_name(code): """Return the corresponding field_name given the field code. e.g. 
reportnumber -> report number.""" res = run_sql_cached("SELECT name FROM field WHERE code=%s", (code, ), affected_tables=['field',]) if res: return res[0][0] else: return "" def get_field_tags(field): """Returns a list of MARC tags for the field code 'field'. Returns empty list in case of error. Example: field='author', output=['100__%','700__%'].""" out = [] query = """SELECT t.value FROM tag AS t, field_tag AS ft, field AS f WHERE f.code=%s AND ft.id_field=f.id AND t.id=ft.id_tag ORDER BY ft.score DESC""" res = run_sql(query, (field, )) for val in res: out.append(val[0]) return out def get_fieldvalues(recIDs, tag, repetitive_values=True): """ Return list of field values for field TAG for the given record ID or list of record IDs. (RECIDS can be both an integer or a list of integers.) If REPETITIVE_VALUES is set to True, then return all values even if they are doubled. If set to False, then return unique values only. """ out = [] if isinstance(recIDs, (int, long)): recIDs =[recIDs,] if not isinstance(recIDs, (list, tuple)): return [] if len(recIDs) == 0: return [] if tag == "001___": # we have asked for tag 001 (=recID) that is not stored in bibXXx tables out = [str(recID) for recID in recIDs] else: # we are going to look inside bibXXx tables digits = tag[0:2] try: intdigits = int(digits) if intdigits < 0 or intdigits > 99: raise ValueError except ValueError: # invalid tag value asked for return [] bx = "bib%sx" % digits bibx = "bibrec_bib%sx" % digits queryparam = [] for recID in recIDs: queryparam.append(recID) if not repetitive_values: queryselect = "DISTINCT(bx.value)" else: queryselect = "bx.value" query = "SELECT %s FROM %s AS bx, %s AS bibx WHERE bibx.id_bibrec IN (%s) " \ " AND bx.id=bibx.id_bibxxx AND bx.tag LIKE %%s " \ " ORDER BY bibx.field_number, bx.tag ASC" % \ (queryselect, bx, bibx, ("%s,"*len(queryparam))[:-1]) res = run_sql(query, tuple(queryparam) + (tag,)) for row in res: out.append(row[0]) return out def get_fieldvalues_alephseq_like(recID, 
tags_in):
    """Return buffer of ALEPH sequential-like textual format with fields found
    in the list TAGS_IN for record RECID."""
    out = ""
    if type(tags_in) is not list:
        tags_in = [tags_in, ]
    if len(tags_in) == 1 and len(tags_in[0]) == 6:
        ## case A: one concrete subfield asked, so print its value if found
        ##         (use with care: can fool you if field has multiple occurrences)
        out += string.join(get_fieldvalues(recID, tags_in[0]), "\n")
    else:
        ## case B: print our "text MARC" format; works safely all the time
        # find out which tags to output:
        dict_of_tags_out = {}
        if not tags_in:
            for i in range(0, 10):
                for j in range(0, 10):
                    dict_of_tags_out["%d%d%%" % (i, j)] = 1
        else:
            for tag in tags_in:
                if len(tag) == 0:
                    for i in range(0, 10):
                        for j in range(0, 10):
                            dict_of_tags_out["%d%d%%" % (i, j)] = 1
                elif len(tag) == 1:
                    for j in range(0, 10):
                        dict_of_tags_out["%s%d%%" % (tag, j)] = 1
                elif len(tag) < 5:
                    dict_of_tags_out["%s%%" % tag] = 1
                else: # len(tag) >= 5, so keep the tag proper only
                    dict_of_tags_out[tag[0:5]] = 1
        tags_out = dict_of_tags_out.keys()
        tags_out.sort()
        # search all bibXXx tables as needed:
        for tag in tags_out:
            digits = tag[0:2]
            try:
                intdigits = int(digits)
                if intdigits < 0 or intdigits > 99:
                    raise ValueError
            except ValueError:
                # invalid tag value asked for
                continue
            if tag.startswith("001") or tag.startswith("00%"):
                if out:
                    out += "\n"
                out += "%09d %s %d" % (recID, "001__", recID)
            bx = "bib%sx" % digits
            bibx = "bibrec_bib%sx" % digits
            query = "SELECT b.tag,b.value,bb.field_number FROM %s AS b, %s AS bb "\
                    "WHERE bb.id_bibrec=%%s AND b.id=bb.id_bibxxx AND b.tag LIKE %%s "\
                    "ORDER BY bb.field_number, b.tag ASC" % (bx, bibx)
            res = run_sql(query, (recID, str(tag)+'%'))
            # go through fields:
            field_number_old = -999
            field_old = ""
            for row in res:
                field, value, field_number = row[0], row[1], row[2]
                ind1, ind2 = field[3], field[4]
                if ind1 == "_":
                    ind1 = ""
                if ind2 == "_":
                    ind2 = ""
                # print field tag
                if field_number != field_number_old or field[:-1] != field_old[:-1]:
                    if out:
                        out += "\n"
                    out += "%09d %s " % (recID,
field[:5]) field_number_old = field_number field_old = field # print subfield value if field[0:2] == "00" and field[-1:] == "_": out += value else: out += "$$%s%s" % (field[-1:], value) return out def record_exists(recID): """Return 1 if record RECID exists. Return 0 if it doesn't exist. Return -1 if it exists but is marked as deleted.""" out = 0 res = run_sql("SELECT id FROM bibrec WHERE id=%s", (recID,), 1) if res: recID = int(recID) # record exists; now check whether it isn't marked as deleted: dbcollids = get_fieldvalues(recID, "980__%") if ("DELETED" in dbcollids) or (CFG_CERN_SITE and "DUMMY" in dbcollids): out = -1 # exists, but marked as deleted else: out = 1 # exists fine return out def record_public_p(recID): """Return 1 if the record is public, i.e. if it can be found in the Home collection. Return 0 otherwise. """ return recID in get_collection_reclist(CFG_SITE_NAME) def get_creation_date(recID, fmt="%Y-%m-%d"): "Returns the creation date of the record 'recID'." out = "" res = run_sql("SELECT DATE_FORMAT(creation_date,%s) FROM bibrec WHERE id=%s", (fmt, recID), 1) if res: out = res[0][0] return out def get_modification_date(recID, fmt="%Y-%m-%d"): "Returns the date of last modification for the record 'recID'." out = "" res = run_sql("SELECT DATE_FORMAT(modification_date,%s) FROM bibrec WHERE id=%s", (fmt, recID), 1) if res: out = res[0][0] return out def print_warning(req, msg, type='', prologue='<br />', epilogue='<br />'): "Prints warning message and flushes output." 
    if req and msg:
        req.write(websearch_templates.tmpl_print_warning(
                   msg = msg,
                   type = type,
                   prologue = prologue,
                   epilogue = epilogue,
                 ))
    return

def print_search_info(p, f, sf, so, sp, rm, of, ot, collection=CFG_SITE_NAME,
                      nb_found=-1, jrec=1, rg=10, aas=0, ln=CFG_SITE_LANG,
                      p1="", p2="", p3="", f1="", f2="", f3="",
                      m1="", m2="", m3="", op1="", op2="", sc=1,
                      pl_in_url="", d1y=0, d1m=0, d1d=0, d2y=0, d2m=0, d2d=0,
                      dt="", cpu_time=-1, middle_only=0):
    """Prints stripe with the information on 'collection' and 'nb_found'
    results and CPU time.  Also prints navigation links (beg/next/prev/end)
    inside the results set.  If middle_only is set to 1, it will only print
    the middle box information (beg/next/prev/end/etc) links.  This is
    suitable for displaying navigation links at the bottom of the search
    results page."""
    out = ""
    # sanity check:
    if jrec < 1:
        jrec = 1
    if jrec > nb_found:
        jrec = max(nb_found-rg+1, 1)
    return websearch_templates.tmpl_print_search_info(
             ln = ln,
             collection = collection,
             aas = aas,
             collection_name = get_coll_i18nname(collection, ln, False),
             collection_id = get_colID(collection),
             middle_only = middle_only,
             rg = rg,
             nb_found = nb_found,
             sf = sf,
             so = so,
             rm = rm,
             of = of,
             ot = ot,
             p = p,
             f = f,
             p1 = p1,
             p2 = p2,
             p3 = p3,
             f1 = f1,
             f2 = f2,
             f3 = f3,
             m1 = m1,
             m2 = m2,
             m3 = m3,
             op1 = op1,
             op2 = op2,
             pl_in_url = pl_in_url,
             d1y = d1y,
             d1m = d1m,
             d1d = d1d,
             d2y = d2y,
             d2m = d2m,
             d2d = d2d,
             dt = dt,
             jrec = jrec,
             sc = sc,
             sp = sp,
             all_fieldcodes = get_fieldcodes(),
             cpu_time = cpu_time,
           )

def print_results_overview(req, colls, results_final_nb_total, results_final_nb,
                           cpu_time, ln=CFG_SITE_LANG, ec=[]):
    """Prints results overview box with links to particular collections below."""
    out = ""
    new_colls = []
    for coll in colls:
        new_colls.append({
                           'id': get_colID(coll),
                           'code': coll,
                           'name': get_coll_i18nname(coll, ln, False),
                         })
    return websearch_templates.tmpl_print_results_overview(
             ln = ln,
             results_final_nb_total = results_final_nb_total,
             results_final_nb =
results_final_nb,
             cpu_time = cpu_time,
             colls = new_colls,
             ec = ec,
           )

def sort_records(req, recIDs, sort_field='', sort_order='d', sort_pattern='',
                 verbose=0, of='hb', ln=CFG_SITE_LANG):
    """Sort records in 'recIDs' list according to sort field 'sort_field'
    in order 'sort_order'.  If more than one instance of 'sort_field' is
    found for a given record, try to choose the one given by 'sort_pattern',
    for example "sort by report number that starts by CERN-PS".  Note that
    'sort_field' can be a field code like 'author' or a MARC tag like
    '100__a' directly."""
    _ = gettext_set_language(ln)

    ## check arguments:
    if not sort_field:
        return recIDs
    if len(recIDs) > CFG_WEBSEARCH_NB_RECORDS_TO_SORT:
        if of.startswith('h'):
            print_warning(req, _("Sorry, sorting is allowed on sets of up to %d records only. Using default sort order.") % CFG_WEBSEARCH_NB_RECORDS_TO_SORT, "Warning")
        return recIDs

    sort_fields = string.split(sort_field, ",")
    recIDs_dict = {}
    recIDs_out = []

    ## first deduce sorting MARC tag out of the 'sort_field' argument:
    tags = []
    for sort_field in sort_fields:
        if sort_field and str(sort_field[0:2]).isdigit():
            # sort_field starts by two digits, so this is probably a MARC tag already
            tags.append(sort_field)
        else:
            # let us check the 'field' table
            query = """SELECT DISTINCT(t.value) FROM tag AS t, field_tag AS ft, field AS f
                        WHERE f.code=%s AND ft.id_field=f.id AND t.id=ft.id_tag
                        ORDER BY ft.score DESC"""
            res = run_sql(query, (sort_field, ))
            if res:
                for row in res:
                    tags.append(row[0])
            else:
                if of.startswith('h'):
                    print_warning(req, _("Sorry, %s does not seem to be a valid sort option. Choosing title sort instead.") % cgi.escape(sort_field), "Error")
                tags.append("245__a")
    if verbose >= 3:
        print_warning(req, "Sorting by tags %s." % cgi.escape(repr(tags)))
        if sort_pattern:
            print_warning(req, "Sorting preferentially by %s."
% cgi.escape(sort_pattern)) ## check if we have sorting tag defined: if tags: # fetch the necessary field values: for recID in recIDs: val = "" # will hold value for recID according to which sort vals = [] # will hold all values found in sorting tag for recID for tag in tags: vals.extend(get_fieldvalues(recID, tag)) if sort_pattern: # try to pick that tag value that corresponds to sort pattern bingo = 0 for v in vals: if v.lower().startswith(sort_pattern.lower()): # bingo! bingo = 1 val = v break if not bingo: # sort_pattern not present, so add other vals after spaces val = sort_pattern + " " + string.join(vals) else: # no sort pattern defined, so join them all together val = string.join(vals) val = strip_accents(val.lower()) # sort values regardless of accents and case if recIDs_dict.has_key(val): recIDs_dict[val].append(recID) else: recIDs_dict[val] = [recID] # sort them: recIDs_dict_keys = recIDs_dict.keys() recIDs_dict_keys.sort() # now that keys are sorted, create output array: for k in recIDs_dict_keys: for s in recIDs_dict[k]: recIDs_out.append(s) # ascending or descending? if sort_order == 'a': recIDs_out.reverse() # okay, we are done return recIDs_out else: # good, no sort needed return recIDs def print_records(req, recIDs, jrec=1, rg=10, format='hb', ot='', ln=CFG_SITE_LANG, relevances=[], relevances_prologue="(", relevances_epilogue="%%)", decompress=zlib.decompress, search_pattern='', print_records_prologue_p=True, print_records_epilogue_p=True, verbose=0, tab=''): """ Prints list of records 'recIDs' formatted according to 'format' in groups of 'rg' starting from 'jrec'. Assumes that the input list 'recIDs' is sorted in reverse order, so it counts records from tail to head. A value of 'rg=-9999' means to print all records: to be used with care. Print also list of RELEVANCES for each record (if defined), in between RELEVANCE_PROLOGUE and RELEVANCE_EPILOGUE. 
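`print_records` below maps the 1-based paging arguments `jrec` (first record to show) and `rg` (records per group) onto indices of the reverse-sorted `recIDs` list, printing from `irec_max` down to `irec_min` exclusive. A self-contained sketch of that arithmetic (the function name is mine):

```python
def record_window(nb_found, jrec, rg):
    """Return (irec_max, irec_min) for a reverse-sorted list of nb_found
    records, mirroring the sanity checks and index math used below."""
    if jrec < 1:                       # sanity checks on jrec
        jrec = 1
    if jrec > nb_found:
        jrec = max(nb_found - rg + 1, 1)
    irec_max = nb_found - jrec         # records printed from tail to head
    irec_min = nb_found - jrec - rg
    if irec_min < 0:
        irec_min = -1                  # -1 so range(irec_max, irec_min, -1) reaches index 0
    if irec_max >= nb_found:
        irec_max = nb_found - 1
    return irec_max, irec_min
```

For example, with 100 hits, `jrec=1, rg=10` yields indices 99 down to 90, i.e. the ten most relevant records of the reverse-ordered list.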
Print prologue and/or epilogue specific to 'format' if 'print_records_prologue_p' and/or print_records_epilogue_p' are True. """ # load the right message language _ = gettext_set_language(ln) # sanity checking: if req is None: return # get user_info (for formatting based on user) user_info = collect_user_info(req) if len(recIDs): nb_found = len(recIDs) if rg == -9999: # print all records rg = nb_found else: rg = abs(rg) if jrec < 1: # sanity checks jrec = 1 if jrec > nb_found: jrec = max(nb_found-rg+1, 1) # will print records from irec_max to irec_min excluded: irec_max = nb_found - jrec irec_min = nb_found - jrec - rg if irec_min < 0: irec_min = -1 if irec_max >= nb_found: irec_max = nb_found - 1 #req.write("%s:%d-%d" % (recIDs, irec_min, irec_max)) if format.startswith('x'): # print header if needed if print_records_prologue_p: print_records_prologue(req, format) # print records recIDs_to_print = [recIDs[x] for x in range(irec_max, irec_min, -1)] format_records(recIDs_to_print, format, ln=ln, search_pattern=search_pattern, record_separator="\n", user_info=user_info, req=req) # print footer if needed if print_records_epilogue_p: print_records_epilogue(req, format) elif format.startswith('t') or str(format[0:3]).isdigit(): # we are doing plain text output: for irec in range(irec_max, irec_min, -1): x = print_record(recIDs[irec], format, ot, ln, search_pattern=search_pattern, user_info=user_info, verbose=verbose) req.write(x) if x: req.write('\n') elif format == 'excel': recIDs_to_print = [recIDs[x] for x in range(irec_max, irec_min, -1)] create_excel(recIDs=recIDs_to_print, req=req, ln=ln, ot=ot) else: # we are doing HTML output: if format == 'hp' or format.startswith("hb_") or format.startswith("hd_"): # portfolio and on-the-fly formats: for irec in range(irec_max, irec_min, -1): req.write(print_record(recIDs[irec], format, ot, ln, search_pattern=search_pattern, user_info=user_info, verbose=verbose)) elif format.startswith("hb"): # HTML brief format: 
req.write(websearch_templates.tmpl_record_format_htmlbrief_header( ln = ln)) for irec in range(irec_max, irec_min, -1): row_number = jrec+irec_max-irec recid = recIDs[irec] if relevances and relevances[irec]: relevance = relevances[irec] else: relevance = '' record = print_record(recIDs[irec], format, ot, ln, search_pattern=search_pattern, user_info=user_info, verbose=verbose) req.write(websearch_templates.tmpl_record_format_htmlbrief_body( ln = ln, recid = recid, row_number = row_number, relevance = relevance, record = record, relevances_prologue = relevances_prologue, relevances_epilogue = relevances_epilogue, )) req.write(websearch_templates.tmpl_record_format_htmlbrief_footer( ln = ln)) elif format.startswith("hd"): # HTML detailed format: for irec in range(irec_max, irec_min, -1): unordered_tabs = get_detailed_page_tabs(get_colID(guess_primary_collection_of_a_record(recIDs[irec])), recIDs[irec], ln=ln) ordered_tabs_id = [(tab_id, values['order']) for (tab_id, values) in unordered_tabs.iteritems()] ordered_tabs_id.sort(lambda x,y: cmp(x[1],y[1])) link_ln = '' if ln != CFG_SITE_LANG: link_ln = '?ln=%s' % ln if CFG_WEBSEARCH_USE_ALEPH_SYSNOS: recid_to_display = get_fieldvalues(recIDs[irec], CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG)[0] else: recid_to_display = recIDs[irec] tabs = [(unordered_tabs[tab_id]['label'], \ '%s/record/%s/%s%s' % (CFG_SITE_URL, recid_to_display, tab_id, link_ln), \ tab_id == tab, unordered_tabs[tab_id]['enabled']) \ for (tab_id, order) in ordered_tabs_id if unordered_tabs[tab_id]['visible'] == True] content = '' # load content if tab == 'usage': req.write(webstyle_templates.detailed_record_container_top(recIDs[irec], tabs, ln)) r = calculate_reading_similarity_list(recIDs[irec], "downloads") downloadsimilarity = None downloadhistory = None #if r: # downloadsimilarity = r if CFG_BIBRANK_SHOW_DOWNLOAD_GRAPHS: downloadhistory = create_download_history_graph_and_box(recIDs[irec], ln) r = calculate_reading_similarity_list(recIDs[irec], "pageviews") 
viewsimilarity = None if r: viewsimilarity = r content = websearch_templates.tmpl_detailed_record_statistics(recIDs[irec], ln, downloadsimilarity=downloadsimilarity, downloadhistory=downloadhistory, viewsimilarity=viewsimilarity) req.write(content) req.write(webstyle_templates.detailed_record_container_bottom(recIDs[irec], tabs, ln)) elif tab == 'citations': recid = recIDs[irec] req.write(webstyle_templates.detailed_record_container_top(recid, tabs, ln)) req.write(websearch_templates.tmpl_detailed_record_citations_prologue(recid, ln)) # Citing citinglist = [] r = calculate_cited_by_list(recid) if r: citinglist = r req.write(websearch_templates.tmpl_detailed_record_citations_citing_list(recid, ln, citinglist=citinglist)) # Self-cited selfcited = get_self_cited_by(recid) req.write(websearch_templates.tmpl_detailed_record_citations_self_cited(recid, ln, selfcited=selfcited, citinglist=citinglist)) # Co-cited s = calculate_co_cited_with_list(recid) cociting = None if s: cociting = s req.write(websearch_templates.tmpl_detailed_record_citations_co_citing(recid, ln, cociting=cociting)) # Citation history citationhistory = None if r: citationhistory = create_citation_history_graph_and_box(recid, ln) #debug if verbose > 3: print_warning(req, "Citation graph debug: "+str(len(citationhistory))) req.write(websearch_templates.tmpl_detailed_record_citations_citation_history(recid, ln, citationhistory)) req.write(websearch_templates.tmpl_detailed_record_citations_epilogue(recid, ln)) req.write(webstyle_templates.detailed_record_container_bottom(recid, tabs, ln)) elif tab == 'references': req.write(webstyle_templates.detailed_record_container_top(recIDs[irec], tabs, ln)) req.write(format_record(recIDs[irec], 'HDREF', ln=ln, user_info=user_info, verbose=verbose)) req.write(webstyle_templates.detailed_record_container_bottom(recIDs[irec], tabs, ln)) elif tab == 'holdings': req.write(webstyle_templates.detailed_record_container_top(recIDs[irec], tabs, ln)) 
req.write(format_record(recIDs[irec], 'HDHOLD', ln=ln, user_info=user_info, verbose=verbose)) req.write(webstyle_templates.detailed_record_container_bottom(recIDs[irec], tabs, ln)) else: # Metadata tab req.write(webstyle_templates.detailed_record_container_top(recIDs[irec], tabs, ln, show_short_rec_p=False)) creationdate = None modificationdate = None if record_exists(recIDs[irec]) == 1: creationdate = get_creation_date(recIDs[irec]) modificationdate = get_modification_date(recIDs[irec]) content = print_record(recIDs[irec], format, ot, ln, search_pattern=search_pattern, user_info=user_info, verbose=verbose) content = websearch_templates.tmpl_detailed_record_metadata( recID = recIDs[irec], ln = ln, format = format, creationdate = creationdate, modificationdate = modificationdate, content = content) req.write(content) req.write(webstyle_templates.detailed_record_container_bottom(recIDs[irec], tabs, ln, creationdate=creationdate, modificationdate=modificationdate, show_short_rec_p=False)) if len(tabs) > 0: # Add the mini box at bottom of the page if CFG_WEBCOMMENT_ALLOW_REVIEWS: from invenio.webcomment import get_mini_reviews reviews = get_mini_reviews(recid = recIDs[irec], ln=ln) else: reviews = '' actions = format_record(recIDs[irec], 'HDACT', ln=ln, user_info=user_info, verbose=verbose) files = format_record(recIDs[irec], 'HDFILE', ln=ln, user_info=user_info, verbose=verbose) req.write(webstyle_templates.detailed_record_mini_panel(recIDs[irec], ln, format, files=files, reviews=reviews, actions=actions)) else: # Other formats for irec in range(irec_max, irec_min, -1): req.write(print_record(recIDs[irec], format, ot, ln, search_pattern=search_pattern, user_info=user_info, verbose=verbose)) else: print_warning(req, _("Use different search terms.")) def print_records_prologue(req, format): """ Print the appropriate prologue for list of records in the given format. 
""" prologue = "" # no prologue needed for HTML or Text formats if format.startswith('xm'): prologue = websearch_templates.tmpl_xml_marc_prologue() elif format.startswith('xn'): prologue = websearch_templates.tmpl_xml_nlm_prologue() elif format.startswith('xw'): prologue = websearch_templates.tmpl_xml_refworks_prologue() elif format.startswith('xr'): prologue = websearch_templates.tmpl_xml_rss_prologue() elif format.startswith('xe'): prologue = websearch_templates.tmpl_xml_endnote_prologue() elif format.startswith('xo'): prologue = websearch_templates.tmpl_xml_mods_prologue() elif format.startswith('x'): prologue = websearch_templates.tmpl_xml_default_prologue() req.write(prologue) def print_records_epilogue(req, format): """ Print the appropriate epilogue for list of records in the given format. """ epilogue = "" # no epilogue needed for HTML or Text formats if format.startswith('xm'): epilogue = websearch_templates.tmpl_xml_marc_epilogue() elif format.startswith('xn'): epilogue = websearch_templates.tmpl_xml_nlm_epilogue() elif format.startswith('xw'): epilogue = websearch_templates.tmpl_xml_refworks_epilogue() elif format.startswith('xr'): epilogue = websearch_templates.tmpl_xml_rss_epilogue() elif format.startswith('xe'): epilogue = websearch_templates.tmpl_xml_endnote_epilogue() elif format.startswith('xo'): epilogue = websearch_templates.tmpl_xml_mods_epilogue() elif format.startswith('x'): epilogue = websearch_templates.tmpl_xml_default_epilogue() req.write(epilogue) def get_record(recid): """Directly the record object corresponding to the recid.""" from marshal import loads, dumps from zlib import compress, decompress if CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE: value = run_sql('SELECT value FROM bibfmt WHERE id_bibrec=%s AND FORMAT=\'recstruct\'', (recid, )) if value: try: return loads(decompress(value[0][0])) except: ### In case of corruption, let's rebuild it! 
pass return create_record(print_record(recid, 'xm'))[0] def print_record(recID, format='hb', ot='', ln=CFG_SITE_LANG, decompress=zlib.decompress, search_pattern=None, user_info=None, verbose=0): """Prints record 'recID' formatted according to 'format'.""" if format == 'recstruct': return get_record(recID) _ = gettext_set_language(ln) out = "" # sanity check: record_exist_p = record_exists(recID) if record_exist_p == 0: # doesn't exist return out # New Python BibFormat procedure for formatting # Old procedure follows further below # We must still check some special formats, but these # should disappear when BibFormat improves. if not (CFG_BIBFORMAT_USE_OLD_BIBFORMAT \ or format.lower().startswith('t') \ or format.lower().startswith('hm') \ or str(format[0:3]).isdigit() \ or ot): # Unspecified format is hd if format == '': format = 'hd' if record_exist_p == -1 and get_output_format_content_type(format) == 'text/html': # HTML output displays a default value for deleted records. # Other formats have to deal with it.
out += _("The record has been deleted.") else: out += call_bibformat(recID, format, ln, search_pattern=search_pattern, user_info=user_info, verbose=verbose) # at the end of HTML brief mode, print the "Detailed record" functionality: if format.lower().startswith('hb') and \ format.lower() != 'hb_p': out += websearch_templates.tmpl_print_record_brief_links( ln = ln, recID = recID, ) return out # Old PHP BibFormat procedure for formatting # print record opening tags, if needed: if format == "marcxml" or format == "oai_dc": out += " <record>\n" out += " <header>\n" for oai_id in get_fieldvalues(recID, CFG_OAI_ID_FIELD): out += " <identifier>%s</identifier>\n" % oai_id out += " <datestamp>%s</datestamp>\n" % get_modification_date(recID) out += " </header>\n" out += " <metadata>\n" if format.startswith("xm") or format == "marcxml": # look for detailed format existence: query = "SELECT value FROM bibfmt WHERE id_bibrec=%s AND format=%s" res = run_sql(query, (recID, format), 1) if res and record_exist_p == 1: # record 'recID' is formatted in 'format', so print it out += "%s" % decompress(res[0][0]) else: # record 'recID' is not formatted in 'format' -- they are not in "bibfmt" table; so fetch all the data from "bibXXx" tables: if format == "marcxml": out += """ <record xmlns="http://www.loc.gov/MARC21/slim">\n""" out += " <controlfield tag=\"001\">%d</controlfield>\n" % int(recID) elif format.startswith("xm"): out += """ <record>\n""" out += " <controlfield tag=\"001\">%d</controlfield>\n" % int(recID) if record_exist_p == -1: # deleted record, so display only OAI ID and 980: oai_ids = get_fieldvalues(recID, CFG_OAI_ID_FIELD) if oai_ids: out += "<datafield tag=\"%s\" ind1=\"%s\" ind2=\"%s\"><subfield code=\"%s\">%s</subfield></datafield>\n" % \ (CFG_OAI_ID_FIELD[0:3], CFG_OAI_ID_FIELD[3:4], CFG_OAI_ID_FIELD[4:5], CFG_OAI_ID_FIELD[5:6], oai_ids[0]) out += "<datafield tag=\"980\" ind1=\"\" ind2=\"\"><subfield code=\"c\">DELETED</subfield></datafield>\n" else: # controlfields 
query = "SELECT b.tag,b.value,bb.field_number FROM bib00x AS b, bibrec_bib00x AS bb "\ "WHERE bb.id_bibrec=%s AND b.id=bb.id_bibxxx AND b.tag LIKE '00%%' "\ "ORDER BY bb.field_number, b.tag ASC" res = run_sql(query, (recID, )) for row in res: field, value = row[0], row[1] value = encode_for_xml(value) out += """ <controlfield tag="%s" >%s</controlfield>\n""" % \ (encode_for_xml(field[0:3]), value) # datafields i = 1 # Do not process bib00x and bibrec_bib00x, as # they are controlfields. So start at bib01x and # bibrec_bib00x (and set i = 0 at the end of # first loop) for digit1 in range(0, 10): for digit2 in range(i, 10): bx = "bib%d%dx" % (digit1, digit2) bibx = "bibrec_bib%d%dx" % (digit1, digit2) query = "SELECT b.tag,b.value,bb.field_number FROM %s AS b, %s AS bb "\ "WHERE bb.id_bibrec=%%s AND b.id=bb.id_bibxxx AND b.tag LIKE %%s"\ "ORDER BY bb.field_number, b.tag ASC" % (bx, bibx) res = run_sql(query, (recID, str(digit1)+str(digit2)+'%')) field_number_old = -999 field_old = "" for row in res: field, value, field_number = row[0], row[1], row[2] ind1, ind2 = field[3], field[4] if ind1 == "_" or ind1 == "": ind1 = " " if ind2 == "_" or ind2 == "": ind2 = " " # print field tag if field_number != field_number_old or field[:-1] != field_old[:-1]: if field_number_old != -999: out += """ </datafield>\n""" out += """ <datafield tag="%s" ind1="%s" ind2="%s">\n""" % \ (encode_for_xml(field[0:3]), encode_for_xml(ind1), encode_for_xml(ind2)) field_number_old = field_number field_old = field # print subfield value value = encode_for_xml(value) out += """ <subfield code="%s">%s</subfield>\n""" % \ (encode_for_xml(field[-1:]), value) # all fields/subfields printed in this run, so close the tag: if field_number_old != -999: out += """ </datafield>\n""" i = 0 # Next loop should start looking at bib%0 and bibrec_bib00x # we are at the end of printing the record: out += " </record>\n" elif format == "xd" or format == "oai_dc": # XML Dublin Core format, possibly OAI -- select only 
some bibXXx fields: out += """ <dc xmlns="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://purl.org/dc/elements/1.1/ http://www.openarchives.org/OAI/1.1/dc.xsd">\n""" if record_exist_p == -1: out += "" else: for f in get_fieldvalues(recID, "041__a"): out += " <language>%s</language>\n" % f for f in get_fieldvalues(recID, "100__a"): out += " <creator>%s</creator>\n" % encode_for_xml(f) for f in get_fieldvalues(recID, "700__a"): out += " <creator>%s</creator>\n" % encode_for_xml(f) for f in get_fieldvalues(recID, "245__a"): out += " <title>%s</title>\n" % encode_for_xml(f) for f in get_fieldvalues(recID, "65017a"): out += " <subject>%s</subject>\n" % encode_for_xml(f) for f in get_fieldvalues(recID, "8564_u"): out += " <identifier>%s</identifier>\n" % encode_for_xml(f) for f in get_fieldvalues(recID, "520__a"): out += " <description>%s</description>\n" % encode_for_xml(f) out += " <date>%s</date>\n" % get_creation_date(recID) out += " </dc>\n" elif len(format) == 6 and str(format[0:3]).isdigit(): # user has asked to print some fields only if format == "001": out += "<!--%s-begin-->%s<!--%s-end-->\n" % (format, recID, format) else: vals = get_fieldvalues(recID, format) for val in vals: out += "<!--%s-begin-->%s<!--%s-end-->\n" % (format, val, format) elif format.startswith('t'): ## user directly asked for some tags to be displayed only if record_exist_p == -1: out += get_fieldvalues_alephseq_like(recID, ["001", CFG_OAI_ID_FIELD, "980"]) else: out += get_fieldvalues_alephseq_like(recID, ot) elif format == "hm": if record_exist_p == -1: out += "<pre>" + cgi.escape(get_fieldvalues_alephseq_like(recID, ["001", CFG_OAI_ID_FIELD, "980"])) + "</pre>" else: out += "<pre>" + cgi.escape(get_fieldvalues_alephseq_like(recID, ot)) + "</pre>" elif format.startswith("h") and ot: ## user directly asked for some tags to be displayed only if record_exist_p == -1: out += "<pre>" + get_fieldvalues_alephseq_like(recID, 
["001", CFG_OAI_ID_FIELD, "980"]) + "</pre>" else: out += "<pre>" + get_fieldvalues_alephseq_like(recID, ot) + "</pre>" elif format == "hd": # HTML detailed format if record_exist_p == -1: out += _("The record has been deleted.") else: # look for detailed format existence: query = "SELECT value FROM bibfmt WHERE id_bibrec=%s AND format=%s" res = run_sql(query, (recID, format), 1) if res: # record 'recID' is formatted in 'format', so print it out += "%s" % decompress(res[0][0]) else: # record 'recID' is not formatted in 'format', so try to call BibFormat on the fly or use default format: out_record_in_format = call_bibformat(recID, format, ln, search_pattern=search_pattern, user_info=user_info, verbose=verbose) if out_record_in_format: out += out_record_in_format else: out += websearch_templates.tmpl_print_record_detailed( ln = ln, recID = recID, ) elif format.startswith("hb_") or format.startswith("hd_"): # underscore means that HTML brief/detailed formats should be called on-the-fly; suitable for testing formats if record_exist_p == -1: out += _("The record has been deleted.") else: out += call_bibformat(recID, format, ln, search_pattern=search_pattern, user_info=user_info, verbose=verbose) elif format.startswith("hx"): # BibTeX format, called on the fly: if record_exist_p == -1: out += _("The record has been deleted.") else: out += call_bibformat(recID, format, ln, search_pattern=search_pattern, user_info=user_info, verbose=verbose) elif format.startswith("hs"): # for citation/download similarity navigation links: if record_exist_p == -1: out += _("The record has been deleted.") else: out += '<a href="%s">' % websearch_templates.build_search_url(recid=recID, ln=ln) # firstly, title: titles = get_fieldvalues(recID, "245__a") if titles: for title in titles: out += "<strong>%s</strong>" % title else: # usual title not found, try conference title: titles = get_fieldvalues(recID, "111__a") if titles: for title in titles: out += "<strong>%s</strong>" % title else: # 
just print record ID: out += "<strong>%s %d</strong>" % (get_field_i18nname("record ID", ln, False), recID) out += "</a>" # secondly, authors: authors = get_fieldvalues(recID, "100__a") + get_fieldvalues(recID, "700__a") if authors: out += " - %s" % authors[0] if len(authors) > 1: out += " <em>et al</em>" # thirdly publication info: publinfos = get_fieldvalues(recID, "773__s") if not publinfos: publinfos = get_fieldvalues(recID, "909C4s") if not publinfos: publinfos = get_fieldvalues(recID, "037__a") if not publinfos: publinfos = get_fieldvalues(recID, "088__a") if publinfos: out += " - %s" % publinfos[0] else: # fourthly publication year (if not publication info): years = get_fieldvalues(recID, "773__y") if not years: years = get_fieldvalues(recID, "909C4y") if not years: years = get_fieldvalues(recID, "260__c") if years: out += " (%s)" % years[0] else: # HTML brief format by default if record_exist_p == -1: out += _("The record has been deleted.") else: query = "SELECT value FROM bibfmt WHERE id_bibrec=%s AND format=%s" res = run_sql(query, (recID, format)) if res: # record 'recID' is formatted in 'format', so print it out += "%s" % decompress(res[0][0]) else: # record 'recID' is not formatted in 'format', so try to call BibFormat on the fly: or use default format: if CFG_WEBSEARCH_CALL_BIBFORMAT: out_record_in_format = call_bibformat(recID, format, ln, search_pattern=search_pattern, user_info=user_info, verbose=verbose) if out_record_in_format: out += out_record_in_format else: out += websearch_templates.tmpl_print_record_brief( ln = ln, recID = recID, ) else: out += websearch_templates.tmpl_print_record_brief( ln = ln, recID = recID, ) # at the end of HTML brief mode, print the "Detailed record" functionality: if format == 'hp' or format.startswith("hb_") or format.startswith("hd_"): pass # do nothing for portfolio and on-the-fly formats else: out += websearch_templates.tmpl_print_record_brief_links( ln = ln, recID = recID, ) # print record closing tags, if 
needed: if format == "marcxml" or format == "oai_dc": out += " </metadata>\n" out += " </record>\n" return out def call_bibformat(recID, format="HD", ln=CFG_SITE_LANG, search_pattern=None, user_info=None, verbose=0): """ Calls BibFormat and returns formatted record. BibFormat will decide by itself if old or new BibFormat must be used. """ keywords = [] if search_pattern is not None: units = create_basic_search_units(None, str(search_pattern), None) keywords = [unit[1] for unit in units if unit[0] != '-'] return format_record(recID, of=format, ln=ln, search_pattern=keywords, user_info=user_info, verbose=verbose) def log_query(hostname, query_args, uid=-1): """ Log query into the query and user_query tables. Return id_query or None in case of problems. """ id_query = None if uid >= 0: # log the query only if uid is reasonable res = run_sql("SELECT id FROM query WHERE urlargs=%s", (query_args,), 1) try: id_query = res[0][0] except: id_query = run_sql("INSERT INTO query (type, urlargs) VALUES ('r', %s)", (query_args,)) if id_query: run_sql("INSERT INTO user_query (id_user, id_query, hostname, date) VALUES (%s, %s, %s, %s)", (uid, id_query, hostname, time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))) return id_query def log_query_info(action, p, f, colls, nb_records_found_total=-1): """Write some info to the log file for later analysis.""" try: log = open(CFG_LOGDIR + "/search.log", "a") log.write(time.strftime("%Y%m%d%H%M%S#", time.localtime())) log.write(action+"#") log.write(p+"#") log.write(f+"#") for coll in colls[:-1]: log.write("%s," % coll) log.write("%s#" % colls[-1]) log.write("%d" % nb_records_found_total) log.write("\n") log.close() except: pass return def wash_url_argument(var, new_type): """Wash list argument into 'new_type', that can be 'list', 'str', or 'int'. 
Useful for washing mod_python-passed arguments, which are all lists of strings (URL args may be multiple), but we sometimes want only to take the first value, and sometimes to represent it as string or numerical value.""" out = [] if new_type == 'list': # return list if type(var) is list: out = var else: out = [var] elif new_type == 'str': # return str if type(var) is list: try: out = "%s" % var[0] except: out = "" elif type(var) is str: out = var else: out = "%s" % var elif new_type == 'int': # return int if type(var) is list: try: out = string.atoi(var[0]) except: out = 0 elif type(var) is int: out = var elif type(var) is str: try: out = string.atoi(var) except: out = 0 else: out = 0 return out ### CALLABLES def perform_request_search(req=None, cc=CFG_SITE_NAME, c=None, p="", f="", rg=10, sf="", so="d", sp="", rm="", of="id", ot="", aas=0, p1="", f1="", m1="", op1="", p2="", f2="", m2="", op2="", p3="", f3="", m3="", sc=0, jrec=0, recid=-1, recidb=-1, sysno="", id=-1, idb=-1, sysnb="", action="", d1="", d1y=0, d1m=0, d1d=0, d2="", d2y=0, d2m=0, d2d=0, dt="", verbose=0, ap=0, ln=CFG_SITE_LANG, ec=None, tab=""): """Perform search or browse request, without checking for authentication. Return list of recIDs found, if of=id. Otherwise create web page. The arguments are as follows: req - mod_python Request class instance. cc - current collection (e.g. "ATLAS"). The collection the user started to search/browse from. c - collection list (e.g. ["Theses", "Books"]). The collections the user may have selected/deselected when starting to search from 'cc'. p - pattern to search for (e.g. "ellis and muon or kaon"). f - field to search within (e.g. "author"). rg - records in groups of (e.g. "10"). Defines how many hits per collection in the search results page are displayed. sf - sort field (e.g. "title"). so - sort order ("a"=ascending, "d"=descending). sp - sort pattern (e.g.
"CERN-") -- in case there are more values in a sort field, this argument tells which one to prefer rm - ranking method (e.g. "jif"). Defines whether results should be ranked by some known ranking method. of - output format (e.g. "hb"). Usually starting "h" means HTML output (and "hb" for HTML brief, "hd" for HTML detailed), "x" means XML output, "t" means plain text output, "id" means no output at all but to return list of recIDs found. (Suitable for high-level API.) ot - output only these MARC tags (e.g. "100,700,909C0b"). Useful if only some fields are to be shown in the output, e.g. for library to control some fields. aas - advanced search ("0" means no, "1" means yes). Whether search was called from within the advanced search interface. p1 - first pattern to search for in the advanced search interface. Much like 'p'. f1 - first field to search within in the advanced search interface. Much like 'f'. m1 - first matching type in the advanced search interface. ("a" all of the words, "o" any of the words, "e" exact phrase, "p" partial phrase, "r" regular expression). op1 - first operator, to join the first and the second unit in the advanced search interface. ("a" add, "o" or, "n" not). p2 - second pattern to search for in the advanced search interface. Much like 'p'. f2 - second field to search within in the advanced search interface. Much like 'f'. m2 - second matching type in the advanced search interface. ("a" all of the words, "o" any of the words, "e" exact phrase, "p" partial phrase, "r" regular expression). op2 - second operator, to join the second and the third unit in the advanced search interface. ("a" add, "o" or, "n" not). p3 - third pattern to search for in the advanced search interface. Much like 'p'. f3 - third field to search within in the advanced search interface. Much like 'f'. m3 - third matching type in the advanced search interface. ("a" all of the words, "o" any of the words, "e" exact phrase, "p" partial phrase, "r" regular expression). 
sc - split by collection ("0" no, "1" yes). Governs whether we want to present the results in a single huge list, or split by collection. jrec - jump to record (e.g. "234"). Used for navigation inside the search results. recid - display record ID (e.g. "20000"). Do not search/browse but go straight away to the Detailed record page for the given recID. recidb - display record ID bis (e.g. "20010"). If greater than 'recid', then display records from recid to recidb. Useful for example for dumping records from the database for reformatting. sysno - display old system SYS number (e.g. ""). If you migrate to CDS Invenio from another system, and store your old SYS call numbers, you can use them instead of recid if you wish. id - the same as recid, in case recid is not set. For backwards compatibility. idb - the same as recidb, in case recidb is not set. For backwards compatibility. sysnb - the same as sysno, in case sysno is not set. For backwards compatibility. action - action to do. "SEARCH" for searching, "Browse" for browsing. Default is to search. d1 - first datetime in full YYYY-mm-dd HH:MM:SS format (e.g. "1998-08-23 12:34:56"). Useful for search limits on creation/modification date (see 'dt' argument below). Note that 'd1' takes precedence over d1y, d1m, d1d if these are defined. d1y - first date's year (e.g. "1998"). Useful for search limits on creation/modification date. d1m - first date's month (e.g. "08"). Useful for search limits on creation/modification date. d1d - first date's day (e.g. "23"). Useful for search limits on creation/modification date. d2 - second datetime in full YYYY-mm-dd HH:MM:SS format (e.g. "1998-09-02 12:34:56"). Useful for search limits on creation/modification date (see 'dt' argument below). Note that 'd2' takes precedence over d2y, d2m, d2d if these are defined. d2y - second date's year (e.g. "1998"). Useful for search limits on creation/modification date. d2m - second date's month (e.g. "09").
Useful for search limits on creation/modification date. d2d - second date's day (e.g. "02"). Useful for search limits on creation/modification date. dt - first and second date's type (e.g. "c"). Specifies whether to search in creation dates ("c") or in modification dates ("m"). When dt is not set and d1* and d2* are set, the default is "c". verbose - verbose level (0=min, 9=max). Useful to print some internal information on the searching process in case something goes wrong. ap - alternative patterns (0=no, 1=yes). In case no exact match is found, the search engine can try alternative patterns, e.g. replacing non-alphanumeric characters in the pattern by a boolean query. 'ap' defines whether this is wanted. ln - language of the search interface (e.g. "en"). Useful for internationalization. ec - list of external search engines to search as well (e.g. "SPIRES HEP"). """ selected_external_collections_infos = None # wash output format: of = wash_output_format(of) # for every search engine request asking for an HTML output, we # first regenerate cache of collection and field I18N names if # needed; so that later we won't bother checking timestamps for # I18N names at all: if of.startswith("h"): collection_i18nname_cache.recreate_cache_if_needed() field_i18nname_cache.recreate_cache_if_needed() # wash all arguments requiring special care try: (cc, colls_to_display, colls_to_search) = wash_colls(cc, c, sc) # which colls to search and to display?
except InvenioWebSearchUnknownCollectionError, exc: colname = exc.colname if of.startswith("h"): page_start(req, of, cc, aas, ln, getUid(req), websearch_templates.tmpl_collection_not_found_page_title(colname, ln)) req.write(websearch_templates.tmpl_collection_not_found_page_body(colname, ln)) return page_end(req, of, ln) elif of == "id": return [] elif of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) return page_end(req, of, ln) else: return page_end(req, of, ln) p = wash_pattern(p) f = wash_field(f) p1 = wash_pattern(p1) f1 = wash_field(f1) p2 = wash_pattern(p2) f2 = wash_field(f2) p3 = wash_pattern(p3) f3 = wash_field(f3) datetext1, datetext2 = wash_dates(d1, d1y, d1m, d1d, d2, d2y, d2m, d2d) # wash ranking method: if not is_method_valid(None, rm): rm = "" _ = gettext_set_language(ln) # backwards compatibility: id, idb, sysnb -> recid, recidb, sysno (if applicable) if sysnb != "" and sysno == "": sysno = sysnb if id > 0 and recid == -1: recid = id if idb > 0 and recidb == -1: recidb = idb # TODO deduce passed search limiting criteria (if applicable) pl, pl_in_url = "", "" # no limits by default if action != "browse" and req and req.args: # we do not want to add options while browsing or while calling via command-line fieldargs = cgi.parse_qs(req.args) for fieldcode in get_fieldcodes(): if fieldargs.has_key(fieldcode): for val in fieldargs[fieldcode]: pl += "+%s:\"%s\" " % (fieldcode, val) pl_in_url += "&%s=%s" % (urllib.quote(fieldcode), urllib.quote(val)) # deduce recid from sysno argument (if applicable): if sysno: # ALEPH SYS number was passed, so deduce DB recID for the record: recid = get_mysql_recid_from_aleph_sysno(sysno) if recid is None: recid = 0 # use recid 0 to indicate that this sysno does not exist # deduce collection we are in (if applicable): if recid > 0: referer = None if req: referer = req.headers_in.get('Referer') cc = guess_collection_of_a_record(recid, referer) # deduce user id
(if applicable): try: uid = getUid(req) except: uid = 0 ## 0 - start output if recid >= 0: # recid can be 0 if deduced from sysno and if such sysno does not exist ## 1 - detailed record display title, description, keywords = \ websearch_templates.tmpl_record_page_header_content(req, recid, ln) if req is not None and not req.header_only: page_start(req, of, cc, aas, ln, uid, title, description, keywords, recid, tab) # Default format is hb but we are in detailed -> change 'of' if of == "hb": of = "hd" if record_exists(recid): if recidb <= recid: # sanity check recidb = recid + 1 if of == "id": return [recidx for recidx in range(recid, recidb) if record_exists(recidx)] else: print_records(req, range(recid, recidb), -1, -9999, of, ot, ln, search_pattern=p, verbose=verbose, tab=tab) if req and of.startswith("h"): # register detailed record page view event - client_ip_address = str(req.get_remote_host(apache.REMOTE_NOLOOKUP)) + client_ip_address = str(req.remote_ip) register_page_view_event(recid, uid, client_ip_address) else: # record does not exist if of == "id": return [] elif of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) elif of.startswith("h"): if req.header_only: raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND else: print_warning(req, _("Requested record does not seem to exist.")) elif action == "browse": ## 2 - browse needed of = 'hb' page_start(req, of, cc, aas, ln, uid, _("Browse"), p=create_page_title_search_pattern_info(p, p1, p2, p3)) req.write(create_search_box(cc, colls_to_display, p, f, rg, sf, so, sp, rm, of, ot, aas, ln, p1, f1, m1, op1, p2, f2, m2, op2, p3, f3, m3, sc, pl, d1y, d1m, d1d, d2y, d2m, d2d, dt, jrec, ec, action)) try: if aas == 1 or (p1 or p2 or p3): browse_pattern(req, colls_to_search, p1, f1, rg, ln) browse_pattern(req, colls_to_search, p2, f2, rg, ln) browse_pattern(req, colls_to_search, p3, f3, rg, ln) else: browse_pattern(req, colls_to_search, p, f, rg, ln) except: 
register_exception(req=req, alert_admin=True) if of.startswith("h"): req.write(create_error_box(req, verbose=verbose, ln=ln)) elif of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) return page_end(req, of, ln) elif rm and p.startswith("recid:"): ## 3-ter - similarity search or citation search needed if not req.header_only: page_start(req, of, cc, aas, ln, uid, _("Search Results"), p=create_page_title_search_pattern_info(p, p1, p2, p3)) if of.startswith("h"): req.write(create_search_box(cc, colls_to_display, p, f, rg, sf, so, sp, rm, of, ot, aas, ln, p1, f1, m1, op1, p2, f2, m2, op2, p3, f3, m3, sc, pl, d1y, d1m, d1d, d2y, d2m, d2d, dt, jrec, ec, action)) if record_exists(p[6:]) != 1: # record does not exist if of.startswith("h"): if req.header_only: raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND else: print_warning(req, "Requested record does not seem to exist.") if of == "id": return [] elif of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) else: # record well exists, so find similar ones to it t1 = os.times()[4] results_similar_recIDs, results_similar_relevances, results_similar_relevances_prologue, results_similar_relevances_epilogue, results_similar_comments = \ rank_records(rm, 0, get_collection_reclist(cc), string.split(p), verbose) if results_similar_recIDs: t2 = os.times()[4] cpu_time = t2 - t1 if of.startswith("h"): req.write(print_search_info(p, f, sf, so, sp, rm, of, ot, cc, len(results_similar_recIDs), jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2, sc, pl_in_url, d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time)) print_warning(req, results_similar_comments) print_records(req, results_similar_recIDs, jrec, rg, of, ot, ln, results_similar_relevances, results_similar_relevances_prologue, results_similar_relevances_epilogue, search_pattern=p, verbose=verbose) elif of=="id": return results_similar_recIDs elif 
of.startswith("x"): print_records(req, results_similar_recIDs, jrec, rg, of, ot, ln, results_similar_relevances, results_similar_relevances_prologue, results_similar_relevances_epilogue, search_pattern=p, verbose=verbose) else: # rank_records failed and returned some error message to display: if of.startswith("h"): print_warning(req, results_similar_relevances_prologue) print_warning(req, results_similar_relevances_epilogue) print_warning(req, results_similar_comments) if of == "id": return [] elif of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) elif p.startswith("cocitedwith:"): #WAS EXPERIMENTAL ## 3-terter - cited by search needed page_start(req, of, cc, aas, ln, uid, _("Search Results"), p=create_page_title_search_pattern_info(p, p1, p2, p3)) if of.startswith("h"): req.write(create_search_box(cc, colls_to_display, p, f, rg, sf, so, sp, rm, of, ot, aas, ln, p1, f1, m1, op1, p2, f2, m2, op2, p3, f3, m3, sc, pl, d1y, d1m, d1d, d2y, d2m, d2d, dt, jrec, ec, action)) recID = p[12:] if record_exists(recID) != 1: # record does not exist if of.startswith("h"): print_warning(req, "Requested record does not seem to exist.") if of == "id": return [] elif of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) else: # record well exists, so find co-cited ones: t1 = os.times()[4] results_cocited_recIDs = map(lambda x: x[0], calculate_co_cited_with_list(int(recID))) if results_cocited_recIDs: t2 = os.times()[4] cpu_time = t2 - t1 if of.startswith("h"): req.write(print_search_info(p, f, sf, so, sp, rm, of, ot, CFG_SITE_NAME, len(results_cocited_recIDs), jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2, sc, pl_in_url, d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time)) print_records(req, results_cocited_recIDs, jrec, rg, of, ot, ln, search_pattern=p, verbose=verbose) elif of=="id": return results_cocited_recIDs elif of.startswith("x"): 
print_records(req, results_cocited_recIDs, jrec, rg, of, ot, ln, search_pattern=p, verbose=verbose) else: # cited rank_records failed and returned some error message to display: if of.startswith("h"): print_warning(req, "nothing found") if of == "id": return [] elif of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) else: ## 3 - common search needed query_in_cache = False query_representation_in_cache = repr((p,f,colls_to_search)) page_start(req, of, cc, aas, ln, uid, p=create_page_title_search_pattern_info(p, p1, p2, p3)) if of.startswith("h"): req.write(create_search_box(cc, colls_to_display, p, f, rg, sf, so, sp, rm, of, ot, aas, ln, p1, f1, m1, op1, p2, f2, m2, op2, p3, f3, m3, sc, pl, d1y, d1m, d1d, d2y, d2m, d2d, dt, jrec, ec, action)) t1 = os.times()[4] results_in_any_collection = HitSet() if aas == 1 or (p1 or p2 or p3): ## 3A - advanced search try: results_in_any_collection = search_pattern_parenthesised(req, p1, f1, m1, ap=ap, of=of, verbose=verbose, ln=ln) if len(results_in_any_collection) == 0: if of.startswith("h"): perform_external_collection_search(req, cc, [p, p1, p2, p3], f, ec, verbose, ln, selected_external_collections_infos) elif of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) return page_end(req, of, ln) if p2: results_tmp = search_pattern_parenthesised(req, p2, f2, m2, ap=ap, of=of, verbose=verbose, ln=ln) if op1 == "a": # add results_in_any_collection.intersection_update(results_tmp) elif op1 == "o": # or results_in_any_collection.union_update(results_tmp) elif op1 == "n": # not results_in_any_collection.difference_update(results_tmp) else: if of.startswith("h"): print_warning(req, "Invalid set operation %s." 
% cgi.escape(op1), "Error") if len(results_in_any_collection) == 0: if of.startswith("h"): perform_external_collection_search(req, cc, [p, p1, p2, p3], f, ec, verbose, ln, selected_external_collections_infos) elif of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) return page_end(req, of, ln) if p3: results_tmp = search_pattern_parenthesised(req, p3, f3, m3, ap=ap, of=of, verbose=verbose, ln=ln) if op2 == "a": # add results_in_any_collection.intersection_update(results_tmp) elif op2 == "o": # or results_in_any_collection.union_update(results_tmp) elif op2 == "n": # not results_in_any_collection.difference_update(results_tmp) else: if of.startswith("h"): print_warning(req, "Invalid set operation %s." % cgi.escape(op2), "Error") except: register_exception(req=req, alert_admin=True) if of.startswith("h"): req.write(create_error_box(req, verbose=verbose, ln=ln)) perform_external_collection_search(req, cc, [p, p1, p2, p3], f, ec, verbose, ln, selected_external_collections_infos) elif of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) return page_end(req, of, ln) else: ## 3B - simple search if search_results_cache.cache.has_key(query_representation_in_cache): # query is in the cache already, so reuse the cached results: query_in_cache = True results_in_any_collection = search_results_cache.cache[query_representation_in_cache] if verbose and of.startswith("h"): print_warning(req, "Search stage 0: query found in cache, reusing cached results.") else: try: results_in_any_collection = search_pattern_parenthesised(req, p, f, ap=ap, of=of, verbose=verbose, ln=ln) except: register_exception(req=req, alert_admin=True) if of.startswith("h"): req.write(create_error_box(req, verbose=verbose, ln=ln)) perform_external_collection_search(req, cc, [p, p1, p2, p3], f, ec, verbose, ln, selected_external_collections_infos) return page_end(req, of, ln) if
len(results_in_any_collection) == 0: if of.startswith("h"): perform_external_collection_search(req, cc, [p, p1, p2, p3], f, ec, verbose, ln, selected_external_collections_infos) elif of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) return page_end(req, of, ln) # store this search query results into search results cache if needed: if CFG_WEBSEARCH_SEARCH_CACHE_SIZE and not query_in_cache: if len(search_results_cache.cache) > CFG_WEBSEARCH_SEARCH_CACHE_SIZE: search_results_cache.clear() search_results_cache.cache[query_representation_in_cache] = results_in_any_collection if verbose and of.startswith("h"): print_warning(req, "Search stage 3: storing query results in cache.") # search stage 4: intersection with collection universe: try: results_final = intersect_results_with_collrecs(req, results_in_any_collection, colls_to_search, ap, of, verbose, ln) except: register_exception(req=req, alert_admin=True) if of.startswith("h"): req.write(create_error_box(req, verbose=verbose, ln=ln)) perform_external_collection_search(req, cc, [p, p1, p2, p3], f, ec, verbose, ln, selected_external_collections_infos) return page_end(req, of, ln) if results_final == {}: if of.startswith("h"): perform_external_collection_search(req, cc, [p, p1, p2, p3], f, ec, verbose, ln, selected_external_collections_infos) if of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) return page_end(req, of, ln) # search stage 5: apply search option limits and restrictions: if datetext1 != "": if verbose and of.startswith("h"): print_warning(req, "Search stage 5: applying time etc limits, from %s until %s..." 
% (datetext1, datetext2)) try: results_final = intersect_results_with_hitset(req, results_final, search_unit_in_bibrec(datetext1, datetext2, dt), ap, aptext= _("No match within your time limits, " "discarding this condition..."), of=of) except: register_exception(req=req, alert_admin=True) if of.startswith("h"): req.write(create_error_box(req, verbose=verbose, ln=ln)) perform_external_collection_search(req, cc, [p, p1, p2, p3], f, ec, verbose, ln, selected_external_collections_infos) return page_end(req, of, ln) if results_final == {}: if of.startswith("h"): perform_external_collection_search(req, cc, [p, p1, p2, p3], f, ec, verbose, ln, selected_external_collections_infos) return page_end(req, of, ln) if pl: pl = wash_pattern(pl) if verbose and of.startswith("h"): print_warning(req, "Search stage 5: applying search pattern limit %s..." % cgi.escape(pl)) try: results_final = intersect_results_with_hitset(req, results_final, search_pattern_parenthesised(req, pl, ap=0, ln=ln), ap, aptext=_("No match within your search limits, " "discarding this condition..."), of=of) except: register_exception(req=req, alert_admin=True) if of.startswith("h"): req.write(create_error_box(req, verbose=verbose, ln=ln)) perform_external_collection_search(req, cc, [p, p1, p2, p3], f, ec, verbose, ln, selected_external_collections_infos) return page_end(req, of, ln) if results_final == {}: if of.startswith("h"): perform_external_collection_search(req, cc, [p, p1, p2, p3], f, ec, verbose, ln, selected_external_collections_infos) if of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) return page_end(req, of, ln) t2 = os.times()[4] cpu_time = t2 - t1 ## search stage 6: display results: results_final_nb_total = 0 results_final_nb = {} # will hold number of records found in each collection # (in simple dict to display overview more easily) for coll in results_final.keys(): results_final_nb[coll] = len(results_final[coll]) 
#results_final_nb_total += results_final_nb[coll] # Now let us calculate results_final_nb_total more precisely, # in order to get the total number of "distinct" hits across # searched collections; this is useful because a record might # have been attributed to more than one primary collection; so # we have to avoid counting it multiple times. The price to # pay for this accuracy of results_final_nb_total is somewhat # increased CPU time. if len(results_final.keys()) == 1: # only one collection; no need to union them results_final_for_all_selected_colls = results_final.values()[0] results_final_nb_total = results_final_nb.values()[0] else: # okay, some work ahead to union hits across collections: results_final_for_all_selected_colls = HitSet() for coll in results_final.keys(): results_final_for_all_selected_colls.union_update(results_final[coll]) results_final_nb_total = len(results_final_for_all_selected_colls) if results_final_nb_total == 0: if of.startswith('h'): print_warning(req, "No match found, please enter different search terms.") elif of.startswith("x"): # Print empty, but valid XML print_records_prologue(req, of) print_records_epilogue(req, of) else: # yes, some hits found: good! # collection list may have changed due to not-exact-match-found policy so check it out: for coll in results_final.keys(): if coll not in colls_to_search: colls_to_search.append(coll) # print results overview: if of == "id": # we have been asked to return list of recIDs recIDs = list(results_final_for_all_selected_colls) if sf: # do we have to sort? recIDs = sort_records(req, recIDs, sf, so, sp, verbose, of) elif rm: # do we have to rank?
results_final_for_all_colls_rank_records_output = rank_records(rm, 0, results_final_for_all_selected_colls, string.split(p) + string.split(p1) + string.split(p2) + string.split(p3), verbose) if results_final_for_all_colls_rank_records_output[0]: recIDs = results_final_for_all_colls_rank_records_output[0] return recIDs elif of.startswith("h"): if of not in ['hcs']: req.write(print_results_overview(req, colls_to_search, results_final_nb_total, results_final_nb, cpu_time, ln, ec)) selected_external_collections_infos = print_external_results_overview(req, cc, [p, p1, p2, p3], f, ec, verbose, ln) # print number of hits found for XML outputs: if of.startswith("x"): req.write("<!-- Search-Engine-Total-Number-Of-Results: %s -->\n" % results_final_nb_total) # print records: if of in ['hcs']: # feed the current search to be summarized: from invenio.search_engine_summarizer import summarize_records summarize_records(results_final_for_all_selected_colls, 'hcs', ln, p, f, req) else: if len(colls_to_search)>1: cpu_time = -1 # we do not want to have search time printed on each collection print_records_prologue(req, of) for coll in colls_to_search: if results_final.has_key(coll) and len(results_final[coll]): if of.startswith("h"): req.write(print_search_info(p, f, sf, so, sp, rm, of, ot, coll, results_final_nb[coll], jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2, sc, pl_in_url, d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time)) results_final_recIDs = list(results_final[coll]) results_final_relevances = [] results_final_relevances_prologue = "" results_final_relevances_epilogue = "" if sf: # do we have to sort? results_final_recIDs = sort_records(req, results_final_recIDs, sf, so, sp, verbose, of) elif rm: # do we have to rank? 
results_final_recIDs_ranked, results_final_relevances, results_final_relevances_prologue, results_final_relevances_epilogue, results_final_comments = \ rank_records(rm, 0, results_final[coll], string.split(p) + string.split(p1) + string.split(p2) + string.split(p3), verbose) if of.startswith("h"): print_warning(req, results_final_comments) if results_final_recIDs_ranked: results_final_recIDs = results_final_recIDs_ranked else: # rank_records failed and returned some error message to display: print_warning(req, results_final_relevances_prologue) print_warning(req, results_final_relevances_epilogue) print_records(req, results_final_recIDs, jrec, rg, of, ot, ln, results_final_relevances, results_final_relevances_prologue, results_final_relevances_epilogue, search_pattern=p, print_records_prologue_p=False, print_records_epilogue_p=False, verbose=verbose) if of.startswith("h"): req.write(print_search_info(p, f, sf, so, sp, rm, of, ot, coll, results_final_nb[coll], jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2, sc, pl_in_url, d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time, 1)) print_records_epilogue(req, of) if f == "author" and of.startswith("h"): req.write(create_similarly_named_authors_link_box(p, ln)) # log query: try: - id_query = log_query(req.get_remote_host(), req.args, uid) + id_query = log_query(req.remote_host, req.args, uid) if of.startswith("h") and id_query: if not of in ['hcs']: # display alert/RSS teaser for non-summary formats: req.write(websearch_templates.tmpl_alert_rss_teaser_box_for_query(id_query, ln=ln)) except: # do not log query if req is None (used by CLI interface) pass log_query_info("ss", p, f, colls_to_search, results_final_nb_total) # External searches if of.startswith("h"): if not of in ['hcs']: perform_external_collection_search(req, cc, [p, p1, p2, p3], f, ec, verbose, ln, selected_external_collections_infos) return page_end(req, of, ln) def perform_request_cache(req, action="show"): """Manipulates the search engine 
cache.""" req.content_type = "text/html" req.send_http_header() req.write("<html>") out = "" out += "<h1>Search Cache</h1>" # clear cache if requested: if action == "clear": search_results_cache.clear() req.write(out) # show collection reclist cache: out = "<h3>Collection reclist cache</h3>" out += "- collection table last updated: %s" % get_table_update_time('collection') out += "<br />- reclist cache timestamp: %s" % collection_reclist_cache.timestamp out += "<br />- reclist cache contents:" out += "<blockquote>" for coll in collection_reclist_cache.cache.keys(): if collection_reclist_cache.cache[coll]: out += "%s (%d)<br />" % (coll, len(collection_reclist_cache.cache[coll])) out += "</blockquote>" req.write(out) # show search results cache: out = "<h3>Search Cache</h3>" out += "- search cache usage: %d queries cached (max. ~%d)" % \ (len(search_results_cache.cache), CFG_WEBSEARCH_SEARCH_CACHE_SIZE) if len(search_results_cache.cache): out += "<br />- search cache contents:" out += "<blockquote>" for query, hitset in search_results_cache.cache.items(): out += "<br />%s ... 
%s" % (query, hitset) out += """<p><a href="%s/search/cache?action=clear">clear search results cache</a>""" % CFG_SITE_URL out += "</blockquote>" req.write(out) # show field i18nname cache: out = "<h3>Field I18N names cache</h3>" out += "- fieldname table last updated: %s" % get_table_update_time('fieldname') out += "<br />- i18nname cache timestamp: %s" % field_i18nname_cache.timestamp out += "<br />- i18nname cache contents:" out += "<blockquote>" for field in field_i18nname_cache.cache.keys(): for ln in field_i18nname_cache.cache[field].keys(): out += "%s, %s = %s<br />" % (field, ln, field_i18nname_cache.cache[field][ln]) out += "</blockquote>" req.write(out) # show collection i18nname cache: out = "<h3>Collection I18N names cache</h3>" out += "- collectionname table last updated: %s" % get_table_update_time('collectionname') out += "<br />- i18nname cache timestamp: %s" % collection_i18nname_cache.timestamp out += "<br />- i18nname cache contents:" out += "<blockquote>" for coll in collection_i18nname_cache.cache.keys(): for ln in collection_i18nname_cache.cache[coll].keys(): out += "%s, %s = %s<br />" % (coll, ln, collection_i18nname_cache.cache[coll][ln]) out += "</blockquote>" req.write(out) req.write("</html>") return "\n" def perform_request_log(req, date=""): """Display search log information for given date.""" req.content_type = "text/html" req.send_http_header() req.write("<html>") req.write("<h1>Search Log</h1>") if date: # case A: display stats for a day yyyymmdd = string.atoi(date) req.write("<p><big><strong>Date: %d</strong></big><p>" % yyyymmdd) req.write("""<table border="1">""") req.write("<tr><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td></tr>" % ("No.", "Time", "Pattern", "Field", "Collection", "Number of Hits")) # read file: p = os.popen("grep ^%d %s/search.log" % (yyyymmdd, CFG_LOGDIR), 'r') lines = p.readlines() 
p.close() # process lines: i = 0 for line in lines: try: datetime, aas, p, f, c, nbhits = string.split(line,"#") i += 1 req.write("<tr><td align=\"right\">#%d</td><td>%s:%s:%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td></tr>" \ % (i, datetime[8:10], datetime[10:12], datetime[12:], p, f, c, nbhits)) except: pass # ignore any malformed log lines req.write("</table>") else: # case B: display summary stats per day yyyymm01 = int(time.strftime("%Y%m01", time.localtime())) yyyymmdd = int(time.strftime("%Y%m%d", time.localtime())) req.write("""<table border="1">""") req.write("<tr><td><strong>%s</strong></td><td><strong>%s</strong></td></tr>" % ("Day", "Number of Queries")) for day in range(yyyymm01, yyyymmdd + 1): p = os.popen("grep -c ^%d %s/search.log" % (day, CFG_LOGDIR), 'r') for line in p.readlines(): req.write("""<tr><td>%s</td><td align="right"><a href="%s/search/log?date=%d">%s</a></td></tr>""" % \ (day, CFG_SITE_URL, day, line)) p.close() req.write("</table>") req.write("</html>") return "\n" def get_most_popular_field_values(recids, tags, exclude_values=None, count_repetitive_values=True): """ Analyze RECIDS, look for TAGS, and return the most popular values together with the frequency with which they occur, sorted by descending frequency. If a value is found in EXCLUDE_VALUES, then do not count it. If COUNT_REPETITIVE_VALUES is True, then we count every occurrence of value in the tags. If False, then we count the value only once regardless of the number of times it may appear in a record. (But, if the same value occurs in another record, we count it, of course.) Example: >>> get_most_popular_field_values(range(11,20), '980__a') (('PREPRINT', 10), ('THESIS', 7), ...) >>> get_most_popular_field_values(range(11,20), ('100__a', '700__a')) (('Ellis, J', 10), ('Ellis, N', 7), ...) >>> get_most_popular_field_values(range(11,20), ('100__a', '700__a'), ('Ellis, J',)) (('Ellis, N', 7), ...)
""" def _get_most_popular_field_values_helper_sorter(val1, val2): "Compare VAL1 and VAL2 according to, firstly, frequency, then secondly, alphabetically." compared_via_frequencies = cmp(valuefreqdict[val2], valuefreqdict[val1]) if compared_via_frequencies == 0: return cmp(val1.lower(), val2.lower()) else: return compared_via_frequencies valuefreqdict = {} ## sanity check: if not exclude_values: exclude_values = [] if isinstance(tags, str): tags = (tags,) ## find values to count: vals_to_count = [] if count_repetitive_values: # counting technique A: can look up many records at once: (very fast) for tag in tags: vals_to_count.extend(get_fieldvalues(recids, tag)) else: # counting technique B: must count record-by-record: (slow) for recid in recids: vals_in_rec = [] for tag in tags: for val in get_fieldvalues(recid, tag, False): vals_in_rec.append(val) # do not count repetitive values within this record # (even across various tags, so need to unify again): dtmp = {} for val in vals_in_rec: dtmp[val] = 1 vals_in_rec = dtmp.keys() vals_to_count.extend(vals_in_rec) ## are we to exclude some of found values? 
for val in vals_to_count: if val not in exclude_values: if valuefreqdict.has_key(val): valuefreqdict[val] += 1 else: valuefreqdict[val] = 1 ## sort by descending frequency of values: out = () vals = valuefreqdict.keys() vals.sort(_get_most_popular_field_values_helper_sorter) for val in vals: out += (val, valuefreqdict[val]), return out def profile(p="", f="", c=CFG_SITE_NAME): """Profile search time.""" import profile import pstats profile.run("perform_request_search(p='%s',f='%s', c='%s')" % (p, f, c), "perform_request_search_profile") p = pstats.Stats("perform_request_search_profile") p.strip_dirs().sort_stats("cumulative").print_stats() return 0 ## test cases: #print wash_colls(CFG_SITE_NAME,"Library Catalogue", 0) #print wash_colls("Periodicals & Progress Reports",["Periodicals","Progress Reports"], 0) #print wash_field("wau") #print print_record(20,"tm","001,245") #print create_opft_search_units(None, "PHE-87-13","reportnumber") #print ":"+wash_pattern("* and % doo * %")+":\n" #print ":"+wash_pattern("*")+":\n" #print ":"+wash_pattern("ellis* ell* e*%")+":\n" #print run_sql("SELECT name,dbquery from collection") #print get_index_id("author") #print get_coll_ancestors("Theses") #print get_coll_sons("Articles & Preprints") #print get_coll_real_descendants("Articles & Preprints") #print get_collection_reclist("Theses") #print log(sys.stdin) #print search_unit_in_bibrec('2002-12-01','2002-12-12') #print type(wash_url_argument("-1",'int')) #print get_nearest_terms_in_bibxxx("ellis", "author", 5, 5) #print call_bibformat(68, "HB_FLY") #print get_fieldvalues(10, "980__a") #print get_fieldvalues_alephseq_like(10,"001___") #print get_fieldvalues_alephseq_like(10,"980__a") #print get_fieldvalues_alephseq_like(10,"foo") #print get_fieldvalues_alephseq_like(10,"-1") #print get_fieldvalues_alephseq_like(10,"99") #print get_fieldvalues_alephseq_like(10,["001", "980"]) ## profiling: #profile("of the this") #print perform_request_search(p="ellis") diff --git 
a/modules/websearch/lib/websearch_webinterface.py b/modules/websearch/lib/websearch_webinterface.py index 47501688d..84ed177af 100644 --- a/modules/websearch/lib/websearch_webinterface.py +++ b/modules/websearch/lib/websearch_webinterface.py @@ -1,1204 +1,1201 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
"""WebSearch URL handler.""" __revision__ = "$Id$" import cgi import os import datetime import time import sys from urllib import quote -try: - from mod_python import apache -except ImportError: - pass +from invenio import webinterface_handler_wsgi_utils as apache #maximum number of collaborating authors etc shown in GUI MAX_COLLAB_LIST = 10 MAX_KEYWORD_LIST = 10 MAX_VENUE_LIST = 10 #tag constants AUTHOR_TAG = "100__a" COAUTHOR_TAG = "700__a" AUTHOR_INST_TAG = "100__u" VENUE_TAG = "909C4p" KEYWORD_TAG = "6531_a" if sys.hexversion < 0x2040000: # pylint: disable-msg=W0622 from sets import Set as set # pylint: enable-msg=W0622 from invenio.config import \ CFG_SITE_URL, \ CFG_SITE_NAME, \ CFG_CACHEDIR, \ CFG_SITE_LANG, \ CFG_SITE_ADMIN_EMAIL, \ CFG_SITE_SECURE_URL, \ CFG_WEBSEARCH_INSTANT_BROWSE_RSS, \ CFG_WEBSEARCH_RSS_TTL, \ CFG_WEBSEARCH_RSS_MAX_CACHED_REQUESTS, \ CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE, \ CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES, \ CFG_WEBDIR, \ CFG_WEBSEARCH_USE_JSMATH_FOR_FORMATS, \ CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS, \ CFG_WEBSEARCH_PERMITTED_RESTRICTED_COLLECTIONS_LEVEL, \ CFG_WEBSEARCH_USE_ALEPH_SYSNOS from invenio.dbquery import Error from invenio.webinterface_handler import wash_urlargd, WebInterfaceDirectory from invenio.urlutils import redirect_to_url, make_canonical_urlargd, drop_default_urlargd, create_html_link from invenio.webuser import getUid, page_not_authorized, get_user_preferences, \ collect_user_info, http_check_credentials, logoutUser, isUserSuperAdmin, \ session_param_get from invenio import search_engine from invenio.websubmit_webinterface import WebInterfaceFilesPages from invenio.bibclassify_webinterface import WebInterfaceKeywordsPages from invenio.webcomment_webinterface import WebInterfaceCommentsPages from invenio.bibcirculation_webinterface import WebInterfaceHoldingsPages from invenio.webpage import page, create_error_box from invenio.messages import gettext_set_language from invenio.search_engine import get_colID, 
get_coll_i18nname, \ check_user_can_view_record, collection_restricted_p, restricted_collection_cache, \ get_fieldvalues, get_most_popular_field_values, get_mysql_recid_from_aleph_sysno from invenio.access_control_engine import acc_authorize_action from invenio.access_control_config import VIEWRESTRCOLL from invenio.access_control_mailcookie import mail_cookie_create_authorize_action from invenio.bibformat import format_records from invenio.bibformat_engine import get_output_formats from invenio.websearch_webcoll import mymkdir, get_collection from invenio.intbitset import intbitset from invenio.bibupload import find_record_from_sysno from invenio.bibrank_citation_searcher import get_author_cited_by, get_cited_by_list from invenio.bibrank_downloads_indexer import get_download_weight_total from invenio.search_engine_summarizer import summarize_records from invenio.errorlib import register_exception from invenio.bibedit_webinterface import WebInterfaceEditPages from invenio.bibeditmulti_webinterface import WebInterfaceMultiEditPages from invenio.bibmerge_webinterface import WebInterfaceMergePages import invenio.template websearch_templates = invenio.template.load('websearch') search_results_default_urlargd = websearch_templates.search_results_default_urlargd search_interface_default_urlargd = websearch_templates.search_interface_default_urlargd try: output_formats = [output_format['attrs']['code'].lower() for output_format in \ get_output_formats(with_attributes=True).values()] except KeyError: output_formats = ['xd', 'xm', 'hd', 'hb', 'hs', 'hx'] output_formats.extend(['hm', 't', 'h']) def wash_search_urlargd(form): """ Create canonical search arguments from those passed via web form. """ argd = wash_urlargd(form, search_results_default_urlargd) if argd.has_key('as'): argd['aas'] = argd['as'] del argd['as'] # Sometimes, users pass ot=245,700 instead of # ot=245&ot=700. Normalize that. 
ots = [] for ot in argd['ot']: ots += ot.split(',') argd['ot'] = ots # We can either get the mode of function as # action=<browse|search>, or by setting action_browse or # action_search. if argd['action_browse']: argd['action'] = 'browse' elif argd['action_search']: argd['action'] = 'search' else: if argd['action'] not in ('browse', 'search'): argd['action'] = 'search' del argd['action_browse'] del argd['action_search'] return argd class WebInterfaceUnAPIPages(WebInterfaceDirectory): """ Handle /unapi set of pages.""" _exports = [''] def __call__(self, req, form): argd = wash_urlargd(form, { 'id' : (int, 0), 'format' : (str, '')}) formats_dict = get_output_formats(True) formats = {} for format in formats_dict.values(): if format['attrs']['visibility']: formats[format['attrs']['code'].lower()] = format['attrs']['content_type'] del formats_dict if argd['id'] and argd['format']: ## Translate back common format names format = { 'nlm' : 'xn', 'marcxml' : 'xm', 'dc' : 'xd', 'endnote' : 'xe', 'mods' : 'xo' }.get(argd['format'], argd['format']) if format in formats: redirect_to_url(req, '%s/record/%s/export/%s' % (CFG_SITE_URL, argd['id'], format)) else: raise apache.SERVER_RETURN, apache.HTTP_NOT_ACCEPTABLE elif argd['id']: return websearch_templates.tmpl_unapi(formats, identifier=argd['id']) else: return websearch_templates.tmpl_unapi(formats) index = __call__ class WebInterfaceAuthorPages(WebInterfaceDirectory): """ Handle /author/Doe%2C+John etc set of pages.""" _exports = ['author'] def __init__(self, authorname=''): """Constructor.""" self.authorname = authorname def _lookup(self, component, path): """This handler parses dynamic URLs (/author/John+Doe).""" return WebInterfaceAuthorPages(component), path def __call__(self, req, form): """Serve the page in the given language.""" argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG), 'verbose': (int, 0) }) ln = argd['ln'] verbose = argd['verbose'] req.argd = argd #needed since perform_req_search # start page 
req.content_type = "text/html" req.send_http_header() uid = getUid(req) search_engine.page_start(req, "hb", "", "", ln, uid) #wants to check it in case of no results self.authorname = self.authorname.replace("+"," ") if not self.authorname: return websearch_templates.tmpl_author_information(req, {}, self.authorname, 0, {}, {}, {}, {}, {}, ln) #let's see what takes time.. time1 = time.time() genstart = time1 citelist = get_author_cited_by(self.authorname) time2 = time.time() if verbose == 9: req.write("<br/>citelist generation took: "+str(time2-time1)+"<br/>") #search the publications by this author pubs = search_engine.perform_request_search(req=req, p=self.authorname, f="author") #get most frequent authors of these pubs popular_author_tuples = search_engine.get_most_popular_field_values(pubs, (AUTHOR_TAG, COAUTHOR_TAG)) authors= [] for (auth, frequency) in popular_author_tuples: if len(authors) < MAX_COLLAB_LIST: authors.append(auth) time1 = time.time() if verbose == 9: req.write("<br/>popularized authors: "+str(time1-time2)+"<br/>") #and publication venues venuetuples = search_engine.get_most_popular_field_values(pubs, (VENUE_TAG)) time2 = time.time() if verbose == 9: req.write("<br/>venues: "+str(time2-time1)+"<br/>") #and keywords kwtuples = search_engine.get_most_popular_field_values(pubs, (KEYWORD_TAG)) time1 = time.time() if verbose == 9: req.write("<br/>keywords: "+str(time1-time2)+"<br/>") #construct a simple list of tuples that contains keywords that appear more than once #moreover, limit the length of the list to MAX_KEYWORD_LIST kwtuples = kwtuples[0:MAX_KEYWORD_LIST] vtuples = venuetuples[0:MAX_VENUE_LIST] #remove the author in question from authors: they are associates if (authors.count(self.authorname) > 0): authors.remove(self.authorname) authors = authors[0:MAX_COLLAB_LIST] #cut extra time2 = time.time() if verbose == 9: req.write("<br/>misc: "+str(time2-time1)+"<br/>") #a dict. 
keys: affiliations, values: lists of publications author_aff_pubs = self.get_institute_pub_dict(pubs) authoraffs = author_aff_pubs.keys() time1 = time.time() if verbose == 9: req.write("<br/>affiliations: "+str(time1-time2)+"<br/>") #find out how many times these records have been downloaded recsloads = {} recsloads = get_download_weight_total(recsloads, pubs) #sum up totaldownloads = 0 for k in recsloads.keys(): totaldownloads = totaldownloads + recsloads[k] #get cited by.. citedbylist = get_cited_by_list(pubs) time1 = time.time() if verbose == 9: req.write("<br/>citedby: "+str(time1-time2)+"<br/>") #finally, everything is ready: call the template websearch_templates.tmpl_author_information(req, pubs, self.authorname, totaldownloads, author_aff_pubs, citedbylist, kwtuples, authors, vtuples, ln) time1 = time.time() #cited-by summary out = summarize_records(intbitset(pubs), 'hcs', ln, self.authorname, 'author', req) time2 = time.time() if verbose == 9: req.write("<br/>summarizer: "+str(time2-time1)+"<br/>") req.write(out) simauthbox = search_engine.create_similarly_named_authors_link_box(self.authorname) req.write(simauthbox) if verbose == 9: req.write("<br/>all: "+str(time.time()-genstart)+"<br/>") return search_engine.page_end(req, 'hb', ln) def get_institute_pub_dict(self, recids): #return a dictionary consisting of institute -> list of publications affus = [] #list of insts from the record author_aff_pubs = {} #the dict to be built for recid in recids: #iterate over all records so that we get the first author's institute #if this is the first author OR #"his" institute if he is an affiliated author mainauthors = get_fieldvalues(recid, AUTHOR_TAG) mainauthor = " " if mainauthors: mainauthor = mainauthors[0] if (mainauthor == self.authorname): affus = get_fieldvalues(recid, AUTHOR_INST_TAG) #if this is empty, add a dummy " " value if (affus == []): affus = [" "] for a in affus: #add in author_aff_pubs if (author_aff_pubs.has_key(a)): tmp = author_aff_pubs[a] tmp.append(recid)
author_aff_pubs[a] = tmp else: author_aff_pubs[a] = [recid] return author_aff_pubs index = __call__ class WebInterfaceRecordPages(WebInterfaceDirectory): """ Handling of a /record/<recid> URL fragment """ _exports = ['', 'files', 'reviews', 'comments', 'usage', 'references', 'export', 'citations', 'holdings', 'edit', 'keywords', 'multiedit', 'merge'] #_exports.extend(output_formats) def __init__(self, recid, tab, format=None): self.recid = recid self.tab = tab self.format = format self.export = self self.files = WebInterfaceFilesPages(self.recid) self.reviews = WebInterfaceCommentsPages(self.recid, reviews=1) self.comments = WebInterfaceCommentsPages(self.recid) self.usage = self self.references = self self.holdings = WebInterfaceHoldingsPages(self.recid) self.keywords = WebInterfaceKeywordsPages(self.recid) self.citations = self self.export = WebInterfaceRecordExport(self.recid, self.format) self.edit = WebInterfaceEditPages(self.recid) self.merge = WebInterfaceMergePages(self.recid) return def __call__(self, req, form): argd = wash_search_urlargd(form) argd['recid'] = self.recid argd['tab'] = self.tab if self.format is not None: argd['of'] = self.format req.argd = argd uid = getUid(req) if uid == -1: return page_not_authorized(req, "../", text="You are not authorized to view this record.", navmenuid='search') elif uid > 0: pref = get_user_preferences(uid) try: if not form.has_key('rg'): # fetch user rg preference only if not overridden via URL argd['rg'] = int(pref['websearch_group_records']) except (KeyError, ValueError): pass user_info = collect_user_info(req) (auth_code, auth_msg) = check_user_can_view_record(user_info, self.recid) if argd['rg'] > CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS and not isUserSuperAdmin(user_info): argd['rg'] = CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS if auth_code and user_info['email'] == 'guest' and not user_info['apache_user']: cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : 
search_engine.guess_primary_collection_of_a_record(self.recid)}) target = CFG_SITE_SECURE_URL + '/youraccount/login' + \ make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : CFG_SITE_URL + req.unparsed_uri}, {}) return redirect_to_url(req, target) elif auth_code: return page_not_authorized(req, "../", \ text = auth_msg,\ navmenuid='search') # mod_python does not like to return [] when of=id: out = search_engine.perform_request_search(req, **argd) if out == []: return str(out) else: return out # Return the same page whether we ask for /record/123 or /record/123/ index = __call__ class WebInterfaceRecordRestrictedPages(WebInterfaceDirectory): """ Handling of a /record-restricted/<recid> URL fragment """ _exports = ['', 'files', 'reviews', 'comments', 'usage', 'references', 'export', 'citations', 'holdings', 'edit', 'keywords', 'multiedit', 'merge'] #_exports.extend(output_formats) def __init__(self, recid, tab, format=None): self.recid = recid self.tab = tab self.format = format self.files = WebInterfaceFilesPages(self.recid) self.reviews = WebInterfaceCommentsPages(self.recid, reviews=1) self.comments = WebInterfaceCommentsPages(self.recid) self.usage = self self.references = self self.keywords = WebInterfaceKeywordsPages(self.recid) self.holdings = WebInterfaceHoldingsPages(self.recid) self.citations = self self.export = WebInterfaceRecordExport(self.recid, self.format) self.edit = WebInterfaceEditPages(self.recid) self.merge = WebInterfaceMergePages(self.recid) return def __call__(self, req, form): argd = wash_search_urlargd(form) argd['recid'] = self.recid if self.format is not None: argd['of'] = self.format req.argd = argd uid = getUid(req) user_info = collect_user_info(req) if uid == -1: return page_not_authorized(req, "../", text="You are not authorized to view this record.", navmenuid='search') elif uid > 0: pref = get_user_preferences(uid) try: if not form.has_key('rg'): # fetch user rg preference only if not overridden via URL
argd['rg'] = int(pref['websearch_group_records']) except (KeyError, ValueError): pass if argd['rg'] > CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS and not isUserSuperAdmin(user_info): argd['rg'] = CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS record_primary_collection = search_engine.guess_primary_collection_of_a_record(self.recid) if collection_restricted_p(record_primary_collection): (auth_code, dummy) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=record_primary_collection) if auth_code: return page_not_authorized(req, "../", text="You are not authorized to view this record.", navmenuid='search') # Keep all the arguments, they might be reused in the # record page itself to derive other queries req.argd = argd # mod_python does not like to return [] when of=id: out = search_engine.perform_request_search(req, **argd) if out == []: return str(out) else: return out # Return the same page whether we ask for /record/123 or /record/123/ index = __call__ class WebInterfaceSearchResultsPages(WebInterfaceDirectory): """ Handling of the /search URL and its sub-pages. """ _exports = ['', 'authenticate', 'cache', 'log'] def __call__(self, req, form): """ Perform a search. """ argd = wash_search_urlargd(form) _ = gettext_set_language(argd['ln']) if req.method == 'POST': raise apache.SERVER_RETURN, apache.HTTP_METHOD_NOT_ALLOWED uid = getUid(req) user_info = collect_user_info(req) if uid == -1: return page_not_authorized(req, "../", text = _("You are not authorized to view this area."), navmenuid='search') elif uid > 0: pref = get_user_preferences(uid) try: if not form.has_key('rg'): # fetch user rg preference only if not overridden via URL argd['rg'] = int(pref['websearch_group_records']) except (KeyError, ValueError): pass if CFG_WEBSEARCH_PERMITTED_RESTRICTED_COLLECTIONS_LEVEL == 2: ## Let's update the current collections list with all ## the restricted collections the user has rights to view.
try: restricted_collections = user_info['precached_permitted_restricted_collections'] argd_collections = set(argd['c']) argd_collections.update(restricted_collections) argd['c'] = list(argd_collections) except KeyError: pass if argd['rg'] > CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS and not isUserSuperAdmin(user_info): argd['rg'] = CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS involved_collections = set() involved_collections.update(argd['c']) involved_collections.add(argd['cc']) if argd['id'] > 0: argd['recid'] = argd['id'] if argd['idb'] > 0: argd['recidb'] = argd['idb'] if argd['sysno']: tmp_recid = find_record_from_sysno(argd['sysno']) if tmp_recid: argd['recid'] = tmp_recid if argd['sysnb']: tmp_recid = find_record_from_sysno(argd['sysnb']) if tmp_recid: argd['recidb'] = tmp_recid if argd['recid'] > 0: if argd['recidb'] > argd['recid']: # Hack to check if among the restricted collections # at least a record of the range is there and # then if the user is not authorized for that # collection. recids = intbitset(xrange(argd['recid'], argd['recidb'])) restricted_collection_cache.recreate_cache_if_needed() for collname in restricted_collection_cache.cache: (auth_code, auth_msg) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=collname) if auth_code and user_info['email'] == 'guest' and not user_info['apache_user']: coll_recids = get_collection(collname).reclist if coll_recids & recids: cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : collname}) target = CFG_SITE_SECURE_URL + '/youraccount/login' + \ make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : CFG_SITE_URL + req.unparsed_uri}, {}) return redirect_to_url(req, target) elif auth_code: return page_not_authorized(req, "../", \ text = auth_msg,\ navmenuid='search') else: involved_collections.add(search_engine.guess_primary_collection_of_a_record(argd['recid'])) # If any of the collection requires authentication, redirect # to the authentication form. 
for coll in involved_collections: if collection_restricted_p(coll): (auth_code, auth_msg) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=coll) if auth_code and user_info['email'] == 'guest' and not user_info['apache_user']: cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : coll}) target = CFG_SITE_SECURE_URL + '/youraccount/login' + \ make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : CFG_SITE_URL + req.unparsed_uri}, {}) return redirect_to_url(req, target) elif auth_code: return page_not_authorized(req, "../", \ text = auth_msg,\ navmenuid='search') # Keep all the arguments, they might be reused in the # search_engine itself to derive other queries req.argd = argd # mod_python does not like to return [] when of=id: out = search_engine.perform_request_search(req, **argd) if out == []: return str(out) else: return out def cache(self, req, form): """Search cache page.""" argd = wash_urlargd(form, {'action': (str, 'show')}) return search_engine.perform_request_cache(req, action=argd['action']) def log(self, req, form): """Search log page.""" argd = wash_urlargd(form, {'date': (str, '')}) return search_engine.perform_request_log(req, date=argd['date']) def authenticate(self, req, form): """Restricted search results pages.""" argd = wash_search_urlargd(form) user_info = collect_user_info(req) for coll in argd['c'] + [argd['cc']]: if collection_restricted_p(coll): (auth_code, auth_msg) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=coll) if auth_code and user_info['email'] == 'guest' and not user_info['apache_user']: cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : coll}) target = CFG_SITE_SECURE_URL + '/youraccount/login' + \ make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : CFG_SITE_URL + req.unparsed_uri}, {}) return redirect_to_url(req, target) elif auth_code: return page_not_authorized(req, "../", \ text = auth_msg,\
navmenuid='search') # Keep all the arguments, they might be reused in the # search_engine itself to derive other queries req.argd = argd uid = getUid(req) if uid > 0: pref = get_user_preferences(uid) try: if not form.has_key('rg'): # fetch user rg preference only if not overridden via URL argd['rg'] = int(pref['websearch_group_records']) except (KeyError, ValueError): pass # mod_python does not like to return [] when of=id: out = search_engine.perform_request_search(req, **argd) if out == []: return str(out) else: return out index = __call__ class WebInterfaceLegacySearchPages(WebInterfaceDirectory): """ Handling of the /search.py URL and its sub-pages. """ _exports = ['', ('authenticate', 'index')] def __call__(self, req, form): """ Perform a search. """ argd = wash_search_urlargd(form) # We either jump into the generic search form, or the specific # /record/... display if a recid is requested if argd['recid'] != -1: target = '/record/%d' % argd['recid'] del argd['recid'] else: target = '/search' target += make_canonical_urlargd(argd, search_results_default_urlargd) return redirect_to_url(req, target, apache.HTTP_MOVED_PERMANENTLY) index = __call__ # Parameters for the legacy URLs, of the form /?c=ALEPH legacy_collection_default_urlargd = { 'as': (int, CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE), 'aas': (int, CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE), 'verbose': (int, 0), 'c': (str, CFG_SITE_NAME)} class WebInterfaceSearchInterfacePages(WebInterfaceDirectory): """ Handling of collection navigation.""" _exports = [('index.py', 'legacy_collection'), ('', 'legacy_collection'), ('search.py', 'legacy_search'), 'search', 'openurl', 'testsso', 'logout_SSO_hook'] search = WebInterfaceSearchResultsPages() legacy_search = WebInterfaceLegacySearchPages() def testsso(self, req, form): """ For testing single sign-on """ req.add_common_vars() sso_env = {} for var, value in req.subprocess_env.iteritems(): if var.startswith('HTTP_ADFS_'): sso_env[var] = value out =
"<HTML><HEAD><TITLE>SSO test</TITLE></HEAD>" out += "<BODY><TABLE>" for var, value in sso_env.iteritems(): out += "<TR><TD><STRONG>%s</STRONG></TD><TD>%s</TD></TR>" % (var, value) out += "</TABLE></BODY></HTML>" return out def logout_SSO_hook(self, req, form): """Script triggered by the display of the centralized SSO logout dialog. It logs the user out of CDS Invenio and streams back the expected picture.""" logoutUser(req) req.content_type = 'image/gif' req.encoding = None req.filename = 'wsignout.gif' req.headers_out["Content-Disposition"] = "inline; filename=wsignout.gif" req.set_content_length(os.path.getsize('%s/img/wsignout.gif' % CFG_WEBDIR)) req.send_http_header() req.sendfile('%s/img/wsignout.gif' % CFG_WEBDIR) def _lookup(self, component, path): """ This handler is invoked for the dynamic URLs (for collections and records)""" if component == 'collection': c = '/'.join(path) def answer(req, form): """Accessing collections cached pages.""" # Accessing collections: this is for accessing the # cached page on top of each collection.
argd = wash_urlargd(form, search_interface_default_urlargd) # We simply return the cached page of the collection argd['c'] = c if not argd['c']: # collection argument not present; display # home collection by default argd['c'] = CFG_SITE_NAME # Treat `as' argument specially: if argd.has_key('as'): argd['aas'] = argd['as'] del argd['as'] return display_collection(req, **argd) return answer, [] elif component == 'record' and path and path[0] == 'merge': return WebInterfaceMergePages(), path[1:] elif component == 'record' and path and path[0] == 'edit': return WebInterfaceEditPages(), path[1:] elif component == 'record' and path[0] == 'multiedit': return WebInterfaceMultiEditPages(), path[1:] elif component == 'record' or component == 'record-restricted': try: if CFG_WEBSEARCH_USE_ALEPH_SYSNOS: # let us try to recognize /record/<SYSNO> style of URLs: x = get_mysql_recid_from_aleph_sysno(path[0]) if x: recid = x else: recid = int(path[0]) else: recid = int(path[0]) except IndexError: # display record #1 for URL /record without a number recid = 1 except ValueError: if path[0] == '': # display record #1 for URL /record/ without a number recid = 1 else: # display page not found for URLs like /record/foo return None, [] if recid <= 0: # display page not found for URLs like /record/-5 or /record/0 return None, [] format = None tab = '' try: if path[1] in ['', 'files', 'reviews', 'comments','usage', 'references', 'citations', 'holdings', 'edit', 'keywords', 'multiedit', 'merge']: tab = path[1] elif path[1] == 'export': tab = '' format = path[2] # format = None # elif path[1] in output_formats: # tab = '' # format = path[1] else: # display page not found for URLs like /record/references # for a collection where 'references' tabs is not visible return None, [] except IndexError: # Keep normal url if tabs is not specified pass #if component == 'record-restricted': #return WebInterfaceRecordRestrictedPages(recid, tab, format), path[1:] #else: return 
WebInterfaceRecordPages(recid, tab, format), path[1:] return None, [] def openurl(self, req, form): """ OpenURL Handler.""" argd = wash_urlargd(form, websearch_templates.tmpl_openurl_accepted_args) ret_url = websearch_templates.tmpl_openurl2invenio(argd) if ret_url: return redirect_to_url(req, ret_url) else: return redirect_to_url(req, CFG_SITE_URL) def legacy_collection(self, req, form): """Collection URL backward compatibility handling.""" accepted_args = dict(legacy_collection_default_urlargd) accepted_args.update({'referer' : (str, ''), 'realm' : (str, '')}) argd = wash_urlargd(form, accepted_args) # Apache authentication stuff if argd['realm']: http_check_credentials(req, argd['realm']) return redirect_to_url(req, argd['referer'] or '%s/youraccount/youradminactivities' % CFG_SITE_SECURE_URL) del argd['referer'] del argd['realm'] # Treat `as' argument specially: if argd.has_key('as'): argd['aas'] = argd['as'] del argd['as'] # If we specify no collection, then we don't need to redirect # the user, so that accessing <http://yoursite/> returns the # default collection. if not form.has_key('c'): return display_collection(req, **argd) # make the collection an element of the path, and keep the # other query elements as is. If the collection is CFG_SITE_NAME, # however, redirect to the main URL. 
c = argd['c'] del argd['c'] if c == CFG_SITE_NAME: target = '/' else: target = '/collection/' + quote(c) # Treat `as' argument specially: # We are going to redirect, so replace `aas' by `as' visible argument: if argd.has_key('aas'): argd['as'] = argd['aas'] del argd['aas'] target += make_canonical_urlargd(argd, legacy_collection_default_urlargd) return redirect_to_url(req, target) def display_collection(req, c, aas, verbose, ln): """Display search interface page for collection c by looking in the collection cache.""" _ = gettext_set_language(ln) req.argd = drop_default_urlargd({'aas': aas, 'verbose': verbose, 'ln': ln}, search_interface_default_urlargd) # get user ID: try: uid = getUid(req) user_preferences = {} if uid == -1: return page_not_authorized(req, "../", text="You are not authorized to view this collection", navmenuid='search') elif uid > 0: user_preferences = get_user_preferences(uid) except Error: register_exception(req=req, alert_admin=True) return page(title=_("Internal Error"), body = create_error_box(req, verbose=verbose, ln=ln), description="%s - Internal Error" % CFG_SITE_NAME, keywords="%s, Internal Error" % CFG_SITE_NAME, language=ln, req=req, navmenuid='search') # start display: req.content_type = "text/html" req.send_http_header() # deduce collection id: colID = get_colID(c) if type(colID) is not int: page_body = '<p>' + (_("Sorry, collection %s does not seem to exist.") % ('<strong>' + str(c) + '</strong>')) + '</p>' page_body += '<p>' + (_("You may want to start browsing from %s.") % ('<a href="' + CFG_SITE_URL + '?ln=' + ln + '">' + get_coll_i18nname(CFG_SITE_NAME, ln) + '</a>')) + '</p>' if req.header_only: raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND return page(title=_("Collection %s Not Found") % cgi.escape(c), body=page_body, description=(CFG_SITE_NAME + ' - ' + _("Not found") + ': ' + cgi.escape(str(c))), keywords="%s" % CFG_SITE_NAME, uid=uid, language=ln, req=req, navmenuid='search') # wash `aas' argument: if not
os.path.exists("%s/collections/%d/body-as=%d-ln=%s.html" % \ (CFG_CACHEDIR, colID, aas, ln)): # nonexistent `aas' asked for, fall back to Simple Search: aas = 0 # display collection interface page: try: filedesc = open("%s/collections/%d/navtrail-as=%d-ln=%s.html" % \ (CFG_CACHEDIR, colID, aas, ln), "r") c_navtrail = filedesc.read() filedesc.close() except: c_navtrail = "" try: filedesc = open("%s/collections/%d/body-as=%d-ln=%s.html" % \ (CFG_CACHEDIR, colID, aas, ln), "r") c_body = filedesc.read() filedesc.close() except: c_body = "" try: filedesc = open("%s/collections/%d/portalbox-tp-ln=%s.html" % \ (CFG_CACHEDIR, colID, ln), "r") c_portalbox_tp = filedesc.read() filedesc.close() except: c_portalbox_tp = "" try: filedesc = open("%s/collections/%d/portalbox-te-ln=%s.html" % \ (CFG_CACHEDIR, colID, ln), "r") c_portalbox_te = filedesc.read() filedesc.close() except: c_portalbox_te = "" try: filedesc = open("%s/collections/%d/portalbox-lt-ln=%s.html" % \ (CFG_CACHEDIR, colID, ln), "r") c_portalbox_lt = filedesc.read() filedesc.close() except: c_portalbox_lt = "" try: # show help boxes (usually located in "tr", "top right") # if users have not banned them in their preferences: c_portalbox_rt = "" if user_preferences.get('websearch_helpbox', 1) > 0: filedesc = open("%s/collections/%d/portalbox-rt-ln=%s.html" % \ (CFG_CACHEDIR, colID, ln), "r") c_portalbox_rt = filedesc.read() filedesc.close() except: c_portalbox_rt = "" try: filedesc = open("%s/collections/%d/last-updated-ln=%s.html" % \ (CFG_CACHEDIR, colID, ln), "r") c_last_updated = filedesc.read() filedesc.close() except: c_last_updated = "" try: title = get_coll_i18nname(c, ln) # if there is only one collection defined, do not print its # title on the page as it would be displayed repetitively. 
if len(search_engine.collection_reclist_cache.cache.keys()) == 1: title = "" except: title = "" # RSS: rssurl = CFG_SITE_URL + '/rss' if c != CFG_SITE_NAME: rssurl += '?cc=' + quote(c) if 'hb' in CFG_WEBSEARCH_USE_JSMATH_FOR_FORMATS: metaheaderadd = """ <script type='text/javascript'> jsMath = { Controls: {cookie: {printwarn: 0}} }; </script> <script src='/jsMath/easy/invenio-jsmath.js' type='text/javascript'></script> """ else: metaheaderadd = '' return page(title=title, body=c_body, navtrail=c_navtrail, description="%s - %s" % (CFG_SITE_NAME, c), keywords="%s, %s" % (CFG_SITE_NAME, c), metaheaderadd=metaheaderadd, uid=uid, language=ln, req=req, cdspageboxlefttopadd=c_portalbox_lt, cdspageboxrighttopadd=c_portalbox_rt, titleprologue=c_portalbox_tp, titleepilogue=c_portalbox_te, lastupdated=c_last_updated, navmenuid='search', rssurl=rssurl, show_title_p=-1 not in CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES) class WebInterfaceRSSFeedServicePages(WebInterfaceDirectory): """RSS 2.0 feed service pages.""" def __call__(self, req, form): """RSS 2.0 feed service.""" # Keep only interesting parameters for the search default_params = websearch_templates.rss_default_urlargd # We need to keep 'jrec' and 'rg' here in order to have # 'multi-page' RSS. These parameters are not kept by default # as we don't want to consider them when building RSS links # from search and browse pages.
default_params.update({'jrec':(int, 1), 'rg': (int, CFG_WEBSEARCH_INSTANT_BROWSE_RSS)}) argd = wash_urlargd(form, default_params) for coll in argd['c'] + [argd['cc']]: if collection_restricted_p(coll): user_info = collect_user_info(req) (auth_code, auth_msg) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=coll) if auth_code and user_info['email'] == 'guest' and not user_info['apache_user']: cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : coll}) target = CFG_SITE_SECURE_URL + '/youraccount/login' + \ make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : CFG_SITE_URL + req.unparsed_uri}, {}) return redirect_to_url(req, target) elif auth_code: return page_not_authorized(req, "../", \ text = auth_msg,\ navmenuid='search') # Create a standard filename with these parameters current_url = websearch_templates.build_rss_url(argd) cache_filename = current_url.split('/')[-1] # In the same way as previously, add 'jrec' & 'rg' req.content_type = "application/rss+xml" req.send_http_header() try: # Try to read from cache path = "%s/rss/%s.xml" % (CFG_CACHEDIR, cache_filename) # Check if cache needs refresh filedesc = open(path, "r") last_update_time = datetime.datetime.fromtimestamp(os.stat(os.path.abspath(path)).st_mtime) assert(datetime.datetime.now() < last_update_time + datetime.timedelta(minutes=CFG_WEBSEARCH_RSS_TTL)) c_rss = filedesc.read() filedesc.close() req.write(c_rss) return except Exception, e: # do it live and cache previous_url = None if argd['jrec'] > 1: prev_jrec = argd['jrec'] - argd['rg'] if prev_jrec < 1: prev_jrec = 1 previous_url = websearch_templates.build_rss_url(argd, jrec=prev_jrec) recIDs = search_engine.perform_request_search(req, of="id", c=argd['c'], cc=argd['cc'], p=argd['p'], f=argd['f'], p1=argd['p1'], f1=argd['f1'], m1=argd['m1'], op1=argd['op1'], p2=argd['p2'], f2=argd['f2'], m2=argd['m2'], op2=argd['op2'], p3=argd['p3'], f3=argd['f3'], m3=argd['m3']) next_url = None if len(recIDs) >= 
argd['jrec'] + argd['rg']: next_url = websearch_templates.build_rss_url(argd, jrec=(argd['jrec'] + argd['rg'])) recIDs = recIDs[-argd['jrec']:(-argd['rg']-argd['jrec']):-1] rss_prologue = '<?xml version="1.0" encoding="UTF-8"?>\n' + \ websearch_templates.tmpl_xml_rss_prologue(current_url=current_url, previous_url=previous_url, next_url=next_url) + '\n' req.write(rss_prologue) rss_body = format_records(recIDs, of='xr', record_separator="\n", req=req, epilogue="\n") rss_epilogue = websearch_templates.tmpl_xml_rss_epilogue() + '\n' req.write(rss_epilogue) # update cache dirname = "%s/rss" % (CFG_CACHEDIR) mymkdir(dirname) fullfilename = "%s/rss/%s.xml" % (CFG_CACHEDIR, cache_filename) try: # Remove the file just in case it already existed # so that a bit of space is created os.remove(fullfilename) except OSError: pass # Check if there's enough space to cache the request. if len(os.listdir(dirname)) < CFG_WEBSEARCH_RSS_MAX_CACHED_REQUESTS: try: os.umask(022) f = open(fullfilename, "w") f.write(rss_prologue + rss_body + rss_epilogue) f.close() except IOError, v: if v[0] == 36: # URL was too long. 
Never mind, don't cache pass else: raise index = __call__ class WebInterfaceRecordExport(WebInterfaceDirectory): """ Handling of a /record/<recid>/export/<format> URL fragment """ _exports = output_formats def __init__(self, recid, format=None): self.recid = recid self.format = format for output_format in output_formats: self.__dict__[output_format] = self return def __call__(self, req, form): argd = wash_search_urlargd(form) argd['recid'] = self.recid if self.format is not None: argd['of'] = self.format req.argd = argd uid = getUid(req) if uid == -1: return page_not_authorized(req, "../", text="You are not authorized to view this record.", navmenuid='search') elif uid > 0: pref = get_user_preferences(uid) try: if not form.has_key('rg'): # fetch user rg preference only if not overridden via URL argd['rg'] = int(pref['websearch_group_records']) except (KeyError, ValueError): pass # Check if the record belongs to a restricted primary # collection. If yes, redirect to the authenticated URL.
user_info = collect_user_info(req) (auth_code, auth_msg) = check_user_can_view_record(user_info, self.recid) if argd['rg'] > CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS and not isUserSuperAdmin(user_info): argd['rg'] = CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS if auth_code and user_info['email'] == 'guest' and not user_info['apache_user']: cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : search_engine.guess_primary_collection_of_a_record(self.recid)}) target = CFG_SITE_SECURE_URL + '/youraccount/login' + \ make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : CFG_SITE_URL + req.unparsed_uri}, {}) return redirect_to_url(req, target) elif auth_code: return page_not_authorized(req, "../", \ text = auth_msg,\ navmenuid='search') # mod_python does not like to return [] when of=id: out = search_engine.perform_request_search(req, **argd) if out == []: return str(out) else: return out # Return the same page whether we ask for /record/123/export/xm or /record/123/export/xm/ index = __call__ diff --git a/modules/websession/lib/session.py b/modules/websession/lib/session.py index 81fd36944..c996d2204 100644 --- a/modules/websession/lib/session.py +++ b/modules/websession/lib/session.py @@ -1,521 +1,488 @@ # -*- coding: utf-8 -*- ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details.
## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """ Session management adapted from mod_python Session class. Just use L{get_session} to obtain a session object (with a dictionary interface, which will let you store permanent information). """ -try: - from mod_python.Cookie import add_cookie, Cookie, get_cookie - from mod_python.apache import APLOG_NOTICE, AP_MPMQ_IS_THREADED, \ - AP_MPMQ_MAX_SPARE_THREADS, _apache - CFG_IN_APACHE = True -except ImportError: - CFG_IN_APACHE = False +from invenio.webinterface_handler_wsgi_utils import add_cookie, Cookie, get_cookie import cPickle import time import random import re import sys import os +from thread import get_ident if sys.hexversion < 0x2060000: from md5 import md5 else: from hashlib import md5 from invenio.dbquery import run_sql, blob_to_string from invenio.config import CFG_WEBSESSION_EXPIRY_LIMIT_REMEMBER, \ CFG_WEBSESSION_EXPIRY_LIMIT_DEFAULT from invenio.websession_config import CFG_WEBSESSION_COOKIE_NAME, \ CFG_WEBSESSION_ONE_DAY, CFG_WEBSESSION_CLEANUP_CHANCE, \ CFG_WEBSESSION_ENABLE_LOCKING def get_session(req, sid=None): """ Obtain a session. If the session has already been created for the current request, returns the already existing session. @param req: the mod_python request object. @type req: mod_python request object @param sid: the session identifier of an already existing session. @type sid: 32 hexadecimal string @return: the session. @rtype: InvenioSession @raise ValueError: if C{sid} is provided and it doesn't correspond to a valid session. @note: if the session has already been retrieved once within the current request handling, the same session object will be returned and, if provided, the C{sid} parameter will be ignored. 
""" if not hasattr(req, '_session'): req._session = InvenioSession(req, sid) return req._session class InvenioSession(dict): """ This class implements a Session handling based on MySQL. @param req: the mod_python request object. @type req: mod_python request object @param sid: the session identifier if already known @type sid: 32 hexadecimal string @ivar _remember_me: if the session cookie should last one day or until the browser is closed. @type _remember_me: bool @note: The code is heavily based on ModPython 3.3.1 DBMSession implementation. @note: This class implements IP verification to prevent basic cookie stealing. @raise ValueError: if C{sid} is provided and correspond to a broken session. """ def __init__(self, req, sid=None): self._remember_me = False self._req, self._sid, self._secret = req, sid, None self._lock = CFG_WEBSESSION_ENABLE_LOCKING self._new = 1 self._created = 0 self._accessed = 0 self._timeout = 0 self._locked = 0 self._invalid = 0 self._http_ip = None self._https_ip = None dict.__init__(self) if not self._sid: # check to see if cookie exists cookie = get_cookie(req, CFG_WEBSESSION_COOKIE_NAME) if cookie: self._sid = cookie.value if self._sid: if not _check_sid(self._sid): if sid: # Supplied explicitly by user of the class, # raise an exception and make the user code # deal with it. raise ValueError("Invalid Session ID: sid=%s" % sid) else: # Derived from the cookie sent by browser, # wipe it out so it gets replaced with a # correct value. 
self._sid = None if self._sid: # attempt to load ourselves self.lock() if self.load(): self._new = 0 if self._new: # make a new session if self._sid: self.unlock() # unlock old sid self._sid = _new_sid(self._req) self.lock() # lock new sid - remote_ip = self._req.connection.remote_ip + remote_ip = self._req.remote_ip if self._req.is_https(): self._https_ip = remote_ip else: self._http_ip = remote_ip add_cookie(self._req, self.make_cookie()) self._created = time.time() self._timeout = CFG_WEBSESSION_EXPIRY_LIMIT_DEFAULT * \ CFG_WEBSESSION_ONE_DAY self._accessed = time.time() # need cleanup? if random.randint(1, CFG_WEBSESSION_CLEANUP_CHANCE) == 1: self.cleanup() def set_remember_me(self, remember_me=True): """ Set/Unset the L{_remember_me} flag. @param remember_me: True if the session cookie should last one day or until the browser is closed. @type remember_me: bool """ self._remember_me = remember_me if remember_me: self.set_timeout(CFG_WEBSESSION_EXPIRY_LIMIT_REMEMBER * CFG_WEBSESSION_ONE_DAY) else: self.set_timeout(CFG_WEBSESSION_EXPIRY_LIMIT_DEFAULT * CFG_WEBSESSION_ONE_DAY) add_cookie(self._req, self.make_cookie()) def load(self): """ Load the session from the database. @return: 1 in case of success, 0 otherwise. 
@rtype: integer """ session_dict = None invalid = False - if CFG_WEBSESSION_ENABLE_LOCKING: - _apache._global_lock(self._req.server, None, 0) - try: - res = run_sql("SELECT session_object FROM session " - "WHERE session_key=%s", (self._sid, )) - if res: - session_dict = cPickle.loads(blob_to_string(res[0][0])) - remote_ip = self._req.connection.remote_ip - if self._req.is_https(): - if session_dict['_https_ip'] is not None and \ - session_dict['_https_ip'] != remote_ip: - invalid = True - else: - session_dict['_https_ip'] = remote_ip + res = run_sql("SELECT session_object FROM session " + "WHERE session_key=%s", (self._sid, )) + if res: + session_dict = cPickle.loads(blob_to_string(res[0][0])) + remote_ip = self._req.remote_ip + if self._req.is_https(): + if session_dict['_https_ip'] is not None and \ + session_dict['_https_ip'] != remote_ip: + invalid = True + else: + session_dict['_https_ip'] = remote_ip + else: + if session_dict['_http_ip'] is not None and \ + session_dict['_http_ip'] != remote_ip: + invalid = True else: - if session_dict['_http_ip'] is not None and \ - session_dict['_http_ip'] != remote_ip: - invalid = True - else: - session_dict['_http_ip'] = remote_ip - finally: - if CFG_WEBSESSION_ENABLE_LOCKING: - _apache._global_unlock(self._req.server, None, 0) + session_dict['_http_ip'] = remote_ip if session_dict is None: return 0 if invalid: return 0 if (time.time() - session_dict["_accessed"]) > \ session_dict["_timeout"]: return 0 self._created = session_dict["_created"] self._accessed = session_dict["_accessed"] self._timeout = session_dict["_timeout"] self._remember_me = session_dict["_remember_me"] self.update(session_dict["_data"]) return 1 def save(self): """ Save the session to the database. 
""" if not self._invalid: session_dict = {"_data" : self.copy(), "_created" : self._created, "_accessed": self._accessed, "_timeout" : self._timeout, "_http_ip" : self._http_ip, "_https_ip" : self._https_ip, "_remember_me" : self._remember_me } - if CFG_WEBSESSION_ENABLE_LOCKING: - _apache._global_lock(self._req.server, None, 0) - try: - session_key = self._sid - session_object = cPickle.dumps(session_dict, -1) - session_expiry = time.time() + self._timeout + \ - CFG_WEBSESSION_ONE_DAY - uid = self.get('uid', -1) - run_sql(""" - INSERT session( - session_key, - session_expiry, - session_object, - uid - ) VALUE(%s, - %s, - %s, - %s - ) ON DUPLICATE KEY UPDATE - session_expiry=%s, - session_object=%s, - uid=%s - """, (session_key, session_expiry, session_object, uid, - session_expiry, session_object, uid)) - finally: - if CFG_WEBSESSION_ENABLE_LOCKING: - _apache._global_unlock(self._req.server, None, 0) + session_key = self._sid + session_object = cPickle.dumps(session_dict, -1) + session_expiry = time.time() + self._timeout + \ + CFG_WEBSESSION_ONE_DAY + uid = self.get('uid', -1) + run_sql(""" + INSERT session( + session_key, + session_expiry, + session_object, + uid + ) VALUE(%s, + %s, + %s, + %s + ) ON DUPLICATE KEY UPDATE + session_expiry=%s, + session_object=%s, + uid=%s + """, (session_key, session_expiry, session_object, uid, + session_expiry, session_object, uid)) def delete(self): """ Delete the session. """ - if CFG_WEBSESSION_ENABLE_LOCKING: - _apache._global_lock(self._req.server, None, 0) - try: - run_sql("DELETE FROM session WHERE session_key=%s", (self._sid, )) - finally: - if CFG_WEBSESSION_ENABLE_LOCKING: - _apache._global_unlock(self._req.server, None, 0) + run_sql("DELETE FROM session WHERE session_key=%s", (self._sid, )) self.clear() def invalidate(self): """ Declare the session as invalid. 
""" cookie = self.make_cookie() cookie.expires = 0 add_cookie(self._req, cookie) self.delete() self._invalid = 1 if hasattr(self._req, '_session'): delattr(self._req, '_session') def make_cookie(self): """ Reimplementation of L{BaseSession.make_cookie} method, that also consider the L{_remember_me} flag @return: a session cookie. @rtpye: {mod_python.Cookie.Cookie} """ cookie = Cookie(CFG_WEBSESSION_COOKIE_NAME, self._sid) cookie.path = '/' if self._remember_me: cookie.expires = time.time() + self._timeout return cookie def initial_http_ip(self): """ @return: the initial ip addressed for the HTTP protocol for which this session was issued. @rtype: string @note: it returns None if this session has always been used through HTTPS requests. """ return self._http_ip def initial_https_ip(self): """ @return: the initial ip addressed for the HTTPS protocol for which this session was issued. @rtype: string @note: it returns None if this session has always been used through HTTP requests. """ return self._https_ip def lock(self): """ Lock the session. """ if self._lock: - if CFG_WEBSESSION_ENABLE_LOCKING: - _apache._global_lock(self._req.server, self._sid) - self._req.register_cleanup(_unlock_session_cleanup, self) self._locked = 1 def unlock(self): """ Unlock the session. """ if self._lock and self._locked: - if CFG_WEBSESSION_ENABLE_LOCKING: - _apache._global_unlock(self._req.server, self._sid) self._locked = 0 def is_new(self): """ @return: True if the session has just been created. @rtype: bool """ return not not self._new def sid(self): """ @return: the session identifier. @rtype: 32 hexadecimal string """ return self._sid def created(self): """ @return: the UNIX timestamp for when the session has been created. @rtype: double """ return self._created def last_accessed(self): """ @return: the UNIX timestamp for when the session has been last accessed. 
@rtype: double """ return self._accessed def timeout(self): """ @return: the number of seconds from the last accessed timestamp, after which the session is invalid. @rtype: double """ return self._timeout def set_timeout(self, secs): """ Set the number of seconds from the last accessed timestamp, after which the session is invalid. @param secs: the number of seconds. @type secs: double """ self._timeout = secs def cleanup(self): """ Perform the database session cleanup. """ def session_cleanup(): """ Session cleanup procedure to be executed at the end of the request handling. """ run_sql(""" DELETE FROM session WHERE session_expiry<=UNIX_TIMESTAMP() """) self._req.register_cleanup(session_cleanup) self._req.log_error("InvenioSession: registered database cleanup.", APLOG_NOTICE) def __del__(self): self.unlock() def _unlock_session_cleanup(session): """ Auxiliary function to unlock a session. """ session.unlock() def _init_rnd(): """ Initialize random number generators. This is important in multithreaded environments; see the Python docs for the random module. @return: the generators. @rtype: list of generators """ # query max number of threads - - if _apache.mpm_query(AP_MPMQ_IS_THREADED): - gennum = _apache.mpm_query(AP_MPMQ_MAX_SPARE_THREADS) - else: - gennum = 10 + gennum = 10 # make generators # this bit is from Python lib reference random_generator = random.Random(time.time()) result = [random_generator] for dummy in range(gennum - 1): laststate = random_generator.getstate() random_generator = random.Random() random_generator.setstate(laststate) random_generator.jumpahead(1000000) result.append(random_generator) return result -if CFG_IN_APACHE: - _RANDOM_GENERATORS = _init_rnd() - _RANDOM_ITERATOR = iter(_RANDOM_GENERATORS) +_RANDOM_GENERATORS = _init_rnd() +_RANDOM_ITERATOR = iter(_RANDOM_GENERATORS) def _get_generator(): """ Get rnd_iter.next(), or start over if we reached the end of it. @return: the next random number generator.
@rtype: random.Random """ global _RANDOM_ITERATOR try: return _RANDOM_ITERATOR.next() except StopIteration: # the small potential for two threads doing this # does not seem to warrant use of a lock _RANDOM_ITERATOR = iter(_RANDOM_GENERATORS) return _RANDOM_ITERATOR.next() _RE_VALIDATE_SID = re.compile('[0-9a-f]{32}$') def _check_sid(sid): """ Check the validity of the session identifier. The sid must be 32 characters long and consist only of the characters 0-9 and a-f. The sid may be passed in a cookie from the client and as such should not be trusted. This is particularly important in FileSession, where the session filename is derived from the sid. A sid containing '/' or '.' characters could result in a directory traversal attack. @param sid: the session identifier. @type sid: string @return: True if the session identifier is valid. @rtype: bool """ return not not _RE_VALIDATE_SID.match(sid) def _new_sid(req): """ Make a number based on current time, pid, remote ip and two random ints, then hash with md5. This should be fairly unique and very difficult to guess. @param req: the mod_python request object. @type req: mod_python request object. @return: the session identifier. @rtype: 32 hexadecimal string @warning: The current implementation of _new_sid returns an md5 hexdigest string. To avoid a possible directory traversal attack in FileSession the sid is validated using the _check_sid() method and the compiled regex validate_sid_re. The sid will be accepted only if len(sid) == 32 and it only contains the characters 0-9 and a-f. If you change this implementation of _new_sid, make sure to also change the validation scheme, as well as the test_Session_illegal_sid() unit test in test/test.py.
""" the_time = long(time.time()*10000) pid = os.getpid() random_generator = _get_generator() rnd1 = random_generator.randint(0, 999999999) rnd2 = random_generator.randint(0, 999999999) - remote_ip = req.connection.remote_ip + remote_ip = req.remote_ip return md5("%d%d%d%d%s" % ( the_time, pid, rnd1, rnd2, remote_ip) ).hexdigest() diff --git a/modules/websession/lib/websession_webinterface.py b/modules/websession/lib/websession_webinterface.py index 9da3b197d..cc04afaab 100644 --- a/modules/websession/lib/websession_webinterface.py +++ b/modules/websession/lib/websession_webinterface.py @@ -1,1291 +1,1291 @@ # -*- coding: utf-8 -*- ## ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
"""CDS Invenio ACCOUNT HANDLING""" __revision__ = "$Id$" __lastupdated__ = """$Date$""" import cgi import os from datetime import timedelta from invenio.config import \ CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS, \ CFG_ACCESS_CONTROL_LEVEL_SITE, \ CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_NEW_ACCOUNT, \ CFG_SITE_NAME, \ CFG_SITE_NAME_INTL, \ CFG_SITE_SUPPORT_EMAIL, \ CFG_SITE_SECURE_URL, \ CFG_SITE_URL, \ CFG_CERN_SITE, \ CFG_WEBSESSION_RESET_PASSWORD_EXPIRE_IN_DAYS from invenio import webuser from invenio.webpage import page from invenio import webaccount from invenio import webbasket from invenio import webalert from invenio.dbquery import run_sql from invenio.webmessage import account_new_mail from invenio.access_control_engine import make_apache_message, make_list_apache_firerole, acc_authorize_action from invenio.webinterface_handler import wash_urlargd, WebInterfaceDirectory from invenio.urlutils import redirect_to_url, make_canonical_urlargd from invenio import webgroup from invenio import bibcatalog_system from invenio import webgroup_dblayer from invenio.messages import gettext_set_language, wash_language from invenio.mailutils import send_email from invenio.access_control_mailcookie import mail_cookie_retrieve_kind, \ mail_cookie_check_pw_reset, mail_cookie_delete_cookie, \ mail_cookie_create_pw_reset, mail_cookie_check_role, \ mail_cookie_check_mail_activation, InvenioWebAccessMailCookieError, \ InvenioWebAccessMailCookieDeletedError, mail_cookie_check_authorize_action from invenio.access_control_config import CFG_WEBACCESS_WARNING_MSGS, \ CFG_EXTERNAL_AUTH_USING_SSO, CFG_EXTERNAL_AUTH_LOGOUT_SSO, \ CFG_EXTERNAL_AUTHENTICATION import invenio.template websession_templates = invenio.template.load('websession') bibcatalog_templates = invenio.template.load('bibcatalog') class WebInterfaceYourAccountPages(WebInterfaceDirectory): _exports = ['', 'edit', 'change', 'lost', 'display', 'send_email', 'youradminactivities', 'access', 'delete', 'logout', 'login', 'register', 
'resetpassword'] _force_https = True def index(self, req, form): redirect_to_url(req, '%s/youraccount/display' % CFG_SITE_SECURE_URL) def access(self, req, form): args = wash_urlargd(form, {'mailcookie' : (str, '')}) _ = gettext_set_language(args['ln']) title = _("Mail Cookie Service") try: kind = mail_cookie_retrieve_kind(args['mailcookie']) if kind == 'pw_reset': redirect_to_url(req, '%s/youraccount/resetpassword?k=%s&ln=%s' % (CFG_SITE_SECURE_URL, args['mailcookie'], args['ln'])) elif kind == 'role': uid = webuser.getUid(req) try: (role_name, expiration) = mail_cookie_check_role(args['mailcookie'], uid) except InvenioWebAccessMailCookieDeletedError: return page(title=_("Role authorization request"), req=req, body=_("This request for an authorization has already been authorized."), uid=webuser.getUid(req), navmenuid='youraccount', language=args['ln']) return page(title=title, body=webaccount.perform_back( _("You have successfully obtained an authorization as %(x_role)s! " "This authorization will last until %(x_expiration)s and until " "you close your browser if you are a guest user.") % {'x_role' : '<strong>%s</strong>' % role_name, 'x_expiration' : '<em>%s</em>' % expiration.strftime("%Y-%m-%d %H:%M:%S")}, '/youraccount/display?ln=%s' % args['ln'], _('login'), args['ln']), req=req, uid=webuser.getUid(req), language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') elif kind == 'mail_activation': try: email = mail_cookie_check_mail_activation(args['mailcookie']) if not email: raise StandardError webuser.confirm_email(email) body = "<p>" + _("You have confirmed the validity of your email" " address!") + "</p>" if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS == 1: body += "<p>" + _("Please, wait for the administrator to " "enable your account.") + "</p>" else: uid = webuser.update_Uid(req, email) body += "<p>" + _("You can now go to %(x_url_open)syour account page%(x_url_close)s.") % {'x_url_open' : '<a href="/youraccount/display?ln=%s">' % args['ln'], 
'x_url_close' : '</a>'} + "</p>" return page(title=_("Email address successfully activated"), body=body, req=req, language=args['ln'], uid=webuser.getUid(req), lastupdated=__lastupdated__, navmenuid='youraccount') except InvenioWebAccessMailCookieDeletedError, e: body = "<p>" + _("You have already confirmed the validity of your email address!") + "</p>" if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS == 1: body += "<p>" + _("Please, wait for the administrator to " "enable your account.") + "</p>" else: body += "<p>" + _("You can now go to %(x_url_open)syour account page%(x_url_close)s.") % {'x_url_open' : '<a href="/youraccount/display?ln=%s">' % args['ln'], 'x_url_close' : '</a>'} + "</p>" return page(title=_("Email address successfully activated"), body=body, req=req, language=args['ln'], uid=webuser.getUid(req), lastupdated=__lastupdated__, navmenuid='youraccount') return webuser.page_not_authorized(req, "../youraccount/access", text=_("This request for confirmation of an email " "address is not valid or" " is expired."), navmenuid='youraccount') except InvenioWebAccessMailCookieError: return webuser.page_not_authorized(req, "../youraccount/access", text=_("This request for an authorization is not valid or" " is expired."), navmenuid='youraccount') def resetpassword(self, req, form): args = wash_urlargd(form, { 'k' : (str, ''), 'reset' : (int, 0), 'password' : (str, ''), 'password2' : (str, '') }) _ = gettext_set_language(args['ln']) email = mail_cookie_check_pw_reset(args['k']) reset_key = args['k'] title = _('Reset password') if email is None or CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 3: return webuser.page_not_authorized(req, "../youraccount/resetpassword", text=_("This request for resetting the password is not valid or" " is expired."), navmenuid='youraccount') if not args['reset']: return page(title=title, body=webaccount.perform_reset_password(args['ln'], email, reset_key), req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, 
navmenuid='youraccount') elif args['password'] != args['password2']: msg = _('The two provided passwords aren\'t equal.') return page(title=title, body=webaccount.perform_reset_password(args['ln'], email, reset_key, msg), req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') run_sql('UPDATE user SET password=AES_ENCRYPT(email,%s) WHERE email=%s', (args['password'], email)) mail_cookie_delete_cookie(reset_key) return page(title=title, body=webaccount.perform_back( _("The password was successfully set! " "You can now proceed with the login."), '/youraccount/login?ln=%s' % args['ln'], _('login'), args['ln']), req=req, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') def display(self, req, form): args = wash_urlargd(form, {}) uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(args['ln']) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../youraccount/display", navmenuid='youraccount') if webuser.isGuestUser(uid): return page(title=_("Your Account"), body=webaccount.perform_info(req, args['ln']), description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') username = webuser.get_nickname_or_email(uid) user_info = webuser.collect_user_info(req) bask = user_info['precached_usebaskets'] and webbasket.account_list_baskets(uid, ln=args['ln']) or '' aler = user_info['precached_usealerts'] and webalert.account_list_alerts(uid, ln=args['ln']) or '' sear = webalert.account_list_searches(uid, ln=args['ln']) msgs = user_info['precached_usemessages'] and account_new_mail(uid, ln=args['ln']) or '' grps = user_info['precached_usegroups'] and webgroup.account_group(uid, ln=args['ln']) or '' appr = 
user_info['precached_useapprove'] sbms = user_info['precached_viewsubmissions'] loan = '' admn = webaccount.perform_youradminactivities(user_info, args['ln']) return page(title=_("Your Account"), body=webaccount.perform_display_account(req, username, bask, aler, sear, msgs, loan, grps, sbms, appr, admn, args['ln']), description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') def edit(self, req, form): args = wash_urlargd(form, {"verbose" : (int, 0)}) uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(args['ln']) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../youraccount/edit", navmenuid='youraccount') if webuser.isGuestUser(uid): return webuser.page_not_authorized(req, "../youraccount/edit", text=_("This functionality is forbidden to guest users."), navmenuid='youraccount') body = '' user_info = webuser.collect_user_info(req) if args['verbose'] == 9: keys = user_info.keys() keys.sort() for key in keys: body += "<b>%s</b>:%s<br />" % (key, user_info[key]) #check if the user should see bibcatalog user name / passwd in the settings can_config_bibcatalog = (acc_authorize_action(user_info, 'runbibedit')[0] == 0) return page(title= _("Your Settings"), body=body+webaccount.perform_set(webuser.get_email(uid), args['ln'], can_config_bibcatalog, verbose=args['verbose']), navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""", description=_("%s Personalize, Your Settings") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], 
lastupdated=__lastupdated__, navmenuid='youraccount') def change(self, req, form): args = wash_urlargd(form, { 'nickname': (str, None), 'email': (str, None), 'old_password': (str, None), 'password': (str, None), 'password2': (str, None), 'login_method': (str, ""), 'group_records' : (int, None), 'latestbox' : (int, None), 'helpbox' : (int, None), 'lang' : (str, None), 'bibcatalog_username' : (str, None), 'bibcatalog_password' : (str, None), }) ## Wash arguments: args['login_method'] = wash_login_method(args['login_method']) if args['email']: args['email'] = args['email'].lower() ## Load the right message language: _ = gettext_set_language(args['ln']) ## Identify user and load old preferences: uid = webuser.getUid(req) prefs = webuser.get_user_preferences(uid) ## Check rights: if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../youraccount/change", navmenuid='youraccount') # FIXME: the branching below is far from optimal. Should be # based on the submitted form name ids, to know precisely on # which form the user clicked. Not on the passed values, as # is the case now. The function body is too big and in bad # need of refactoring anyway. 
## Will hold the output messages: mess = '' ## Change login method if needed: if args['login_method'] and CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS < 4 \ and args['login_method'] in CFG_EXTERNAL_AUTHENTICATION.keys(): title = _("Settings edited") act = "/youraccount/display?ln=%s" % args['ln'] linkname = _("Show account") if prefs['login_method'] != args['login_method']: if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 4: mess += '<p>' + _("Unable to change login method.") elif not CFG_EXTERNAL_AUTHENTICATION[args['login_method']][0]: # Switching to internal authentication: we drop any external datas p_email = webuser.get_email(uid) webuser.drop_external_settings(uid) webgroup_dblayer.drop_external_groups(uid) prefs['login_method'] = args['login_method'] webuser.set_user_preferences(uid, prefs) mess += "<p>" + _("Switched to internal login method.") + " " mess += _("Please note that if this is the first time that you are using this account " "with the internal login method then the system has set for you " "a randomly generated password. 
Please click the " "following button to obtain a password reset request " "link sent to you via email:") + '</p>' mess += """<p><form method="post" action="../youraccount/send_email"> <input type="hidden" name="p_email" value="%s"> <input class="formbutton" type="submit" value="%s"> </form></p>""" % (p_email, _("Send Password")) else: query = """SELECT email FROM user WHERE id = %i""" res = run_sql(query % uid) if res: email = res[0][0] else: email = None if not email: mess += '<p>' + _("Unable to switch to external login method %s, because your email address is unknown.") % cgi.escape(args['login_method']) else: try: if not CFG_EXTERNAL_AUTHENTICATION[args['login_method']][0].user_exists(email): mess += '<p>' + _("Unable to switch to external login method %s, because your email address is unknown to the external login system.") % cgi.escape(args['login_method']) else: prefs['login_method'] = args['login_method'] webuser.set_user_preferences(uid, prefs) mess += '<p>' + _("Login method successfully selected.") except AttributeError: mess += '<p>' + _("The external login method %s does not support email address based logins. 
Please contact the site administrators.") % cgi.escape(args['login_method']) ## Change email or nickname: if args['email'] or args['nickname']: uid2 = webuser.emailUnique(args['email']) uid_with_the_same_nickname = webuser.nicknameUnique(args['nickname']) if (CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 2 or (CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS <= 1 and \ webuser.email_valid_p(args['email']))) \ and (args['nickname'] is None or webuser.nickname_valid_p(args['nickname'])) \ and uid2 != -1 and (uid2 == uid or uid2 == 0) \ and uid_with_the_same_nickname != -1 and (uid_with_the_same_nickname == uid or uid_with_the_same_nickname == 0): if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS < 3: change = webuser.updateDataUser(uid, args['email'], args['nickname']) else: return webuser.page_not_authorized(req, "../youraccount/change", navmenuid='youraccount') if change: mess += '<p>' + _("Settings successfully edited.") mess += '<p>' + _("Note that if you have changed your email address, " "you will have to %(x_url_open)sreset your password%(x_url_close)s anew.") % \ {'x_url_open': '<a href="%s">' % (CFG_SITE_SECURE_URL + '/youraccount/lost?ln=%s' % args['ln']), 'x_url_close': '</a>'} act = "/youraccount/display?ln=%s" % args['ln'] linkname = _("Show account") title = _("Settings edited") elif args['nickname'] is not None and not webuser.nickname_valid_p(args['nickname']): mess += '<p>' + _("Desired nickname %s is invalid.") % cgi.escape(args['nickname']) mess += " " + _("Please try again.") act = "/youraccount/edit?ln=%s" % args['ln'] linkname = _("Edit settings") title = _("Editing settings failed") elif not webuser.email_valid_p(args['email']): mess += '<p>' + _("Supplied email address %s is invalid.") % cgi.escape(args['email']) mess += " " + _("Please try again.") act = "/youraccount/edit?ln=%s" % args['ln'] linkname = _("Edit settings") title = _("Editing settings failed") elif uid2 == -1 or uid2 != uid and not uid2 == 0: mess += '<p>' + _("Supplied email address %s already exists in the 
database.") % cgi.escape(args['email']) mess += " " + websession_templates.tmpl_lost_your_password_teaser(args['ln']) mess += " " + _("Or please try again.") act = "/youraccount/edit?ln=%s" % args['ln'] linkname = _("Edit settings") title = _("Editing settings failed") elif uid_with_the_same_nickname == -1 or uid_with_the_same_nickname != uid and not uid_with_the_same_nickname == 0: mess += '<p>' + _("Desired nickname %s is already in use.") % cgi.escape(args['nickname']) mess += " " + _("Please try again.") act = "/youraccount/edit?ln=%s" % args['ln'] linkname = _("Edit settings") title = _("Editing settings failed") ## Change passwords: if args['old_password'] or args['password'] or args['password2']: if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 3: mess += '<p>' + _("Users cannot edit passwords on this site.") else: res = run_sql("SELECT id FROM user " "WHERE AES_ENCRYPT(email,%s)=password AND id=%s", (args['old_password'], uid)) if res: if args['password'] == args['password2']: webuser.updatePasswordUser(uid, args['password']) mess += '<p>' + _("Password successfully edited.") act = "/youraccount/display?ln=%s" % args['ln'] linkname = _("Show account") title = _("Password edited") else: mess += '<p>' + _("Both passwords must match.") mess += " " + _("Please try again.") act = "/youraccount/edit?ln=%s" % args['ln'] linkname = _("Edit settings") title = _("Editing password failed") else: mess += '<p>' + _("Wrong old password inserted.") mess += " " + _("Please try again.") act = "/youraccount/edit?ln=%s" % args['ln'] linkname = _("Edit settings") title = _("Editing password failed") ## Change search-related settings: if args['group_records']: prefs = webuser.get_user_preferences(uid) prefs['websearch_group_records'] = args['group_records'] prefs['websearch_latestbox'] = args['latestbox'] prefs['websearch_helpbox'] = args['helpbox'] webuser.set_user_preferences(uid, prefs) title = _("Settings edited") act = "/youraccount/display?ln=%s" % args['ln'] linkname = _("Show 
account") mess += '<p>' + _("User settings saved correctly.") ## Change language-related settings: if args['lang']: lang = wash_language(args['lang']) prefs = webuser.get_user_preferences(uid) prefs['language'] = lang args['ln'] = lang _ = gettext_set_language(lang) webuser.set_user_preferences(uid, prefs) title = _("Settings edited") act = "/youraccount/display?ln=%s" % args['ln'] linkname = _("Show account") mess += '<p>' + _("User settings saved correctly.") ## Edit cataloging-related settings: if args['bibcatalog_username'] or args['bibcatalog_password']: act = "/youraccount/display?ln=%s" % args['ln'] linkname = _("Show account") if ((len(args['bibcatalog_username']) == 0) or (len(args['bibcatalog_password']) == 0)): title = _("Editing bibcatalog authorization failed") mess += '<p>' + _("Empty username or password") else: title = _("Settings edited") prefs['bibcatalog_username'] = args['bibcatalog_username'] prefs['bibcatalog_password'] = args['bibcatalog_password'] webuser.set_user_preferences(uid, prefs) mess += '<p>' + _("User settings saved correctly.") if not mess: mess = _("Unable to update settings.") if not act: act = "/youraccount/edit?ln=%s" % args['ln'] if not linkname: linkname = _("Edit settings") if not title: title = _("Editing settings failed") ## Finally, output the results: return page(title=title, body=webaccount.perform_back(mess, act, linkname, args['ln']), navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""", description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') def lost(self, req, form): args = wash_urlargd(form, {}) uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(args['ln']) if uid == 
-1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../youraccount/lost", navmenuid='youraccount') return page(title=_("Lost your password?"), body=webaccount.perform_lost(args['ln']), navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""", description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') def send_email(self, req, form): # set all the declared query fields as local variables args = wash_urlargd(form, {'p_email': (str, None)}) uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(args['ln']) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../youraccount/send_email", navmenuid='youraccount') user_prefs = webuser.get_user_preferences(webuser.emailUnique(args['p_email'])) if user_prefs: if CFG_EXTERNAL_AUTHENTICATION.has_key(user_prefs['login_method']) and \ CFG_EXTERNAL_AUTHENTICATION[user_prefs['login_method']][0] is not None: eMsg = _("Cannot send password reset request since you are using external authentication system.") return page(title=_("Your Account"), body=webaccount.perform_emailMessage(eMsg, args['ln']), description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords=_("%s, personalize" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME)), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') try: reset_key = mail_cookie_create_pw_reset(args['p_email'], cookie_timeout=timedelta(days=CFG_WEBSESSION_RESET_PASSWORD_EXPIRE_IN_DAYS)) except InvenioWebAccessMailCookieError: reset_key = None if reset_key is None: eMsg = _("The entered email 
address does not exist in the database.") return page(title=_("Your Account"), body=webaccount.perform_emailMessage(eMsg, args['ln']), description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') - ip_address = req.connection.remote_host or req.connection.remote_ip + ip_address = req.remote_host or req.remote_ip if not send_email(CFG_SITE_SUPPORT_EMAIL, args['p_email'], "%s %s" % (_("Password reset request for"), CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME)), websession_templates.tmpl_account_reset_password_email_body( args['p_email'],reset_key, ip_address, args['ln'])): eMsg = _("The entered email address is incorrect, please check that it is written correctly (e.g. johndoe@example.com).") return page(title=_("Incorrect email address"), body=webaccount.perform_emailMessage(eMsg, args['ln']), description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') return page(title=_("Reset password link sent"), body=webaccount.perform_emailSent(args['p_email'], args['ln']), description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') def youradminactivities(self, req, form): args = wash_urlargd(form, {}) uid = webuser.getUid(req) user_info = webuser.collect_user_info(req) # load the right message language _ = gettext_set_language(args['ln']) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return 
webuser.page_not_authorized(req, "../youraccount/youradminactivities", navmenuid='admin') return page(title=_("Your Administrative Activities"), body=webaccount.perform_youradminactivities(user_info, args['ln']), navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""", description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='admin') def delete(self, req, form): args = wash_urlargd(form, {}) uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(args['ln']) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../youraccount/delete", navmenuid='youraccount') return page(title=_("Delete Account"), body=webaccount.perform_delete(args['ln']), navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""", description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') def logout(self, req, form): args = wash_urlargd(form, {}) uid = webuser.logoutUser(req) # load the right message language _ = gettext_set_language(args['ln']) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../youraccount/logout", navmenuid='youraccount') if CFG_EXTERNAL_AUTH_USING_SSO: return redirect_to_url(req, CFG_EXTERNAL_AUTH_LOGOUT_SSO) return page(title=_("Logout"), body=webaccount.perform_logout(req, args['ln']), navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % 
(CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""", description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') def login(self, req, form): args = wash_urlargd(form, { 'p_un': (str, None), 'p_pw': (str, None), 'login_method': (str, None), 'action': (str, ''), 'remember_me' : (str, ''), 'referer': (str, '')}) # sanity checks: args['login_method'] = wash_login_method(args['login_method']) if args['p_un']: args['p_un'] = args['p_un'].strip() args['remember_me'] = args['remember_me'] != '' locals().update(args) if CFG_ACCESS_CONTROL_LEVEL_SITE > 0: return webuser.page_not_authorized(req, "../youraccount/login?ln=%s" % args['ln'], navmenuid='youraccount') uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(args['ln']) apache_msg = "" if args['action']: cookie = args['action'] try: action, arguments = mail_cookie_check_authorize_action(cookie) apache_msg = make_apache_message(action, arguments, args['referer']) # FIXME: Temporary Hack to help CDS current migration if CFG_CERN_SITE: roles = make_list_apache_firerole(action, arguments) if len(roles) == 1: # There's only one role enabled to see this collection # Let's redirect to log to it! 
return redirect_to_url(req, '%s/%s' % (CFG_SITE_SECURE_URL, make_canonical_urlargd({'realm' : roles[0][0], 'referer' : args['referer']}, {}))) except InvenioWebAccessMailCookieError: pass if not CFG_EXTERNAL_AUTH_USING_SSO: if args['p_un'] is None or not args['login_method']: return page(title=_("Login"), body=webaccount.create_login_page_box(args['referer'], apache_msg, args['ln']), navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""", description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords="%s , personalize" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') (iden, args['p_un'], args['p_pw'], msgcode) = webuser.loginUser(req, args['p_un'], args['p_pw'], args['login_method']) else: # Fake parameters for p_un & p_pw because SSO takes them from the environment (iden, args['p_un'], args['p_pw'], msgcode) = webuser.loginUser(req, '', '', CFG_EXTERNAL_AUTH_USING_SSO) args['remember_me'] = False if len(iden)>0: uid = webuser.update_Uid(req, args['p_un'], args['remember_me']) uid2 = webuser.getUid(req) if uid2 == -1: webuser.logoutUser(req) return webuser.page_not_authorized(req, "../youraccount/login?ln=%s" % args['ln'], uid=uid, navmenuid='youraccount') # login successful! 
if args['referer']: redirect_to_url(req, args['referer']) else: return self.display(req, form) else: mess = CFG_WEBACCESS_WARNING_MSGS[msgcode] % cgi.escape(args['login_method']) if msgcode == 14: if webuser.username_exists_p(args['p_un']): mess = CFG_WEBACCESS_WARNING_MSGS[15] % cgi.escape(args['login_method']) act = '/youraccount/login%s' % make_canonical_urlargd({'ln' : args['ln'], 'referer' : args['referer']}, {}) return page(title=_("Login"), body=webaccount.perform_back(mess, act, _("login"), args['ln']), navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""", description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords="%s , personalize" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') def register(self, req, form): args = wash_urlargd(form, { 'p_nickname': (str, None), 'p_email': (str, None), 'p_pw': (str, None), 'p_pw2': (str, None), 'action': (str, "login"), 'referer': (str, "")}) if CFG_ACCESS_CONTROL_LEVEL_SITE > 0: return webuser.page_not_authorized(req, "../youraccount/register?ln=%s" % args['ln'], navmenuid='youraccount') uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(args['ln']) if args['p_nickname'] is None or args['p_email'] is None: return page(title=_("Register"), body=webaccount.create_register_page_box(args['referer'], args['ln']), navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""", description=_("%s Personalize, Main page") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords="%s , personalize" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') mess = "" act 
= "" if args['p_pw'] == args['p_pw2']: ruid = webuser.registerUser(req, args['p_email'], args['p_pw'], args['p_nickname'], ln=args['ln']) else: ruid = -2 if ruid == 0: mess = _("Your account has been successfully created.") title = _("Account created") if CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_NEW_ACCOUNT == 1: mess += " " + _("In order to confirm its validity, an email message containing an account activation key has been sent to the given email address.") mess += " " + _("Please follow instructions presented there in order to complete the account registration process.") if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 1: mess += " " + _("A second email will be sent when the account has been activated and can be used.") elif CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_NEW_ACCOUNT != 1: uid = webuser.update_Uid(req, args['p_email']) mess += " " + _("You can now access your %(x_url_open)saccount%(x_url_close)s.") %\ {'x_url_open': '<a href="' + CFG_SITE_SECURE_URL + '/youraccount/display?ln=' + args['ln'] + '">', 'x_url_close': '</a>'} elif ruid == -2: mess = _("Both passwords must match.") mess += " " + _("Please try again.") act = "/youraccount/register?ln=%s" % args['ln'] title = _("Registration failure") elif ruid == 1: mess = _("Supplied email address %s is invalid.") % cgi.escape(args['p_email']) mess += " " + _("Please try again.") act = "/youraccount/register?ln=%s" % args['ln'] title = _("Registration failure") elif ruid == 2: mess = _("Desired nickname %s is invalid.") % cgi.escape(args['p_nickname']) mess += " " + _("Please try again.") act = "/youraccount/register?ln=%s" % args['ln'] title = _("Registration failure") elif ruid == 3: mess = _("Supplied email address %s already exists in the database.") % cgi.escape(args['p_email']) mess += " " + websession_templates.tmpl_lost_your_password_teaser(args['ln']) mess += " " + _("Or please try again.") act = "/youraccount/register?ln=%s" % args['ln'] title = _("Registration failure") elif ruid == 4: mess = _("Desired nickname 
%s already exists in the database.") % cgi.escape(args['p_nickname']) mess += " " + _("Please try again.") act = "/youraccount/register?ln=%s" % args['ln'] title = _("Registration failure") elif ruid == 5: mess = _("Users cannot register themselves, only admin can register them.") act = "/youraccount/register?ln=%s" % args['ln'] title = _("Registration failure") elif ruid == 6: mess = _("The site is having troubles in sending you an email for confirming your email address.") + _("The error has been logged and will be taken in consideration as soon as possible.") act = "/youraccount/register?ln=%s" % args['ln'] title = _("Registration failure") else: # this should never happen mess = _("Internal Error") act = "/youraccount/register?ln=%s" % args['ln'] title = _("Registration failure") return page(title=title, body=webaccount.perform_back(mess,act, _("register"), args['ln']), navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""", description=_("%s Personalize, Main page") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), keywords="%s , personalize" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME), uid=uid, req=req, secure_page_p = 1, language=args['ln'], lastupdated=__lastupdated__, navmenuid='youraccount') class WebInterfaceYourTicketsPages(WebInterfaceDirectory): #support for /yourtickets url _exports = ['', 'display'] def __call__(self, req, form): #if there is no trailing slash self.index(req, form) def index(self, req, form): #take all the parameters.. 
unparsed_uri = req.unparsed_uri qstr = "" if unparsed_uri.count('?') > 0: dummy, qstr = unparsed_uri.split('?') qstr = '?'+qstr redirect_to_url(req, '/yourtickets/display'+qstr) def display(self, req, form): #show tickets for this user argd = wash_urlargd(form, {'ln': (str, ''), 'start': (int, 1) }) uid = webuser.getUid(req) ln = argd['ln'] start = argd['start'] _ = gettext_set_language(ln) body = bibcatalog_templates.tmpl_your_tickets(uid, ln, start) return page(title=_("Your tickets"), body=body, navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, argd['ln']) + _("Your Account") + """</a>""", uid=uid, req=req, language=argd['ln'], lastupdated=__lastupdated__) class WebInterfaceYourGroupsPages(WebInterfaceDirectory): _exports = ['', 'display', 'create', 'join', 'leave', 'edit', 'members'] def index(self, req, form): redirect_to_url(req, '/yourgroups/display') def display(self, req, form): """ Displays groups the user is admin of and the groups the user is member of(but not admin) @param ln: language @return: the page for all the groups """ argd = wash_urlargd(form, {}) uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(argd['ln']) if uid == -1 or webuser.isGuestUser(uid) or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../yourgroups/display", navmenuid='yourgroups') user_info = webuser.collect_user_info(req) if not user_info['precached_usegroups']: return webuser.page_not_authorized(req, "../", \ text = _("You are not authorized to use groups.")) (body, errors, warnings) = webgroup.perform_request_groups_display(uid=uid, ln=argd['ln']) return page(title = _("Your Groups"), body = body, navtrail = webgroup.get_navtrail(argd['ln']), uid = uid, req = req, language = argd['ln'], lastupdated = __lastupdated__, errors = errors, warnings = warnings, navmenuid = 'yourgroups') def create(self, req, form): """create(): interface for creating a new group @param 
group_name: : name of the new webgroup.Must be filled @param group_description: : description of the new webgroup.(optionnal) @param join_policy: : join policy of the new webgroup.Must be chosen @param *button: which button was pressed @param ln: language @return: the compose page Create group """ argd = wash_urlargd(form, {'group_name': (str, ""), 'group_description': (str, ""), 'join_policy': (str, ""), 'create_button':(str, ""), 'cancel':(str, "") }) uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(argd['ln']) if uid == -1 or webuser.isGuestUser(uid) or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../yourgroups/create", navmenuid='yourgroups') user_info = webuser.collect_user_info(req) if not user_info['precached_usegroups']: return webuser.page_not_authorized(req, "../", \ text = _("You are not authorized to use groups.")) if argd['cancel']: url = CFG_SITE_URL + '/yourgroups/display?ln=%s' url %= argd['ln'] redirect_to_url(req, url) if argd['create_button'] : (body, errors, warnings)= webgroup.perform_request_create_group(uid=uid, group_name=argd['group_name'], group_description=argd['group_description'], join_policy=argd['join_policy'], ln = argd['ln']) else: (body, errors, warnings) = webgroup.perform_request_input_create_group(group_name=argd['group_name'], group_description=argd['group_description'], join_policy=argd['join_policy'], ln=argd['ln']) title = _("Create new group") return page(title = title, body = body, navtrail = webgroup.get_navtrail(argd['ln'], title), uid = uid, req = req, language = argd['ln'], lastupdated = __lastupdated__, errors = errors, warnings = warnings, navmenuid = 'yourgroups') def join(self, req, form): """join(): interface for joining a new group @param grpID: : list of the group the user wants to become a member. The user must select only one group. 
@param group_name: : will search for groups matching group_name @param *button: which button was pressed @param ln: language @return: the compose page Join group """ argd = wash_urlargd(form, {'grpID':(list, []), 'group_name':(str, ""), 'find_button':(str, ""), 'join_button':(str, ""), 'cancel':(str, "") }) uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(argd['ln']) if uid == -1 or webuser.isGuestUser(uid) or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../yourgroups/join", navmenuid='yourgroups') user_info = webuser.collect_user_info(req) if not user_info['precached_usegroups']: return webuser.page_not_authorized(req, "../", \ text = _("You are not authorized to use groups.")) if argd['cancel']: url = CFG_SITE_URL + '/yourgroups/display?ln=%s' url %= argd['ln'] redirect_to_url(req, url) if argd['join_button']: search = 0 if argd['group_name']: search = 1 (body, errors, warnings) = webgroup.perform_request_join_group(uid, argd['grpID'], argd['group_name'], search, argd['ln']) else: search = 0 if argd['find_button']: search = 1 (body, errors, warnings) = webgroup.perform_request_input_join_group(uid, argd['group_name'], search, ln=argd['ln']) title = _("Join New Group") return page(title = title, body = body, navtrail = webgroup.get_navtrail(argd['ln'], title), uid = uid, req = req, language = argd['ln'], lastupdated = __lastupdated__, errors = errors, warnings = warnings, navmenuid = 'yourgroups') def leave(self, req, form): """leave(): interface for leaving a group @param grpID: : group the user wants to leave. 
@param group_name: : name of the group the user wants to leave @param *button: which button was pressed @param confirmed: : the user is first asked to confirm @param ln: language @return: the compose page Leave group """ argd = wash_urlargd(form, {'grpID':(str, ""), 'group_name':(str, ""), 'leave_button':(str, ""), 'cancel':(str, ""), 'confirmed': (int, 0) }) uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(argd['ln']) if uid == -1 or webuser.isGuestUser(uid) or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../yourgroups/leave", navmenuid='yourgroups') user_info = webuser.collect_user_info(req) if not user_info['precached_usegroups']: return webuser.page_not_authorized(req, "../", \ text = _("You are not authorized to use groups.")) if argd['cancel']: url = CFG_SITE_URL + '/yourgroups/display?ln=%s' url %= argd['ln'] redirect_to_url(req, url) if argd['leave_button']: (body, errors, warnings) = webgroup.perform_request_leave_group(uid, argd['grpID'], argd['confirmed'], argd['ln']) else: (body, errors, warnings) = webgroup.perform_request_input_leave_group(uid=uid, ln=argd['ln']) title = _("Leave Group") return page(title = title, body = body, navtrail = webgroup.get_navtrail(argd['ln'], title), uid = uid, req = req, language = argd['ln'], lastupdated = __lastupdated__, errors = errors, warnings = warnings, navmenuid = 'yourgroups') def edit(self, req, form): """edit(): interface for editing group @param grpID: : group ID @param group_name: : name of the new webgroup.Must be filled @param group_description: : description of the new webgroup.(optionnal) @param join_policy: : join policy of the new webgroup.Must be chosen @param update: button update group pressed @param delete: button delete group pressed @param cancel: button cancel pressed @param confirmed: : the user is first asked to confirm before deleting @param ln: language @return: the main page displaying all the groups """ argd = 
wash_urlargd(form, {'grpID': (str, ""), 'update': (str, ""), 'cancel': (str, ""), 'delete': (str, ""), 'group_name': (str, ""), 'group_description': (str, ""), 'join_policy': (str, ""), 'confirmed': (int, 0) }) uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(argd['ln']) if uid == -1 or webuser.isGuestUser(uid) or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../yourgroups/display", navmenuid='yourgroups') user_info = webuser.collect_user_info(req) if not user_info['precached_usegroups']: return webuser.page_not_authorized(req, "../", \ text = _("You are not authorized to use groups.")) if argd['cancel']: url = CFG_SITE_URL + '/yourgroups/display?ln=%s' url %= argd['ln'] redirect_to_url(req, url) elif argd['delete']: (body, errors, warnings) = webgroup.perform_request_delete_group(uid=uid, grpID=argd['grpID'], confirmed=argd['confirmed']) elif argd['update']: (body, errors, warnings) = webgroup.perform_request_update_group(uid= uid, grpID=argd['grpID'], group_name=argd['group_name'], group_description=argd['group_description'], join_policy=argd['join_policy'], ln=argd['ln']) else : (body, errors, warnings)= webgroup.perform_request_edit_group(uid=uid, grpID=argd['grpID'], ln=argd['ln']) title = _("Edit Group") return page(title = title, body = body, navtrail = webgroup.get_navtrail(argd['ln'], title), uid = uid, req = req, language = argd['ln'], lastupdated = __lastupdated__, errors = errors, warnings = warnings, navmenuid = 'yourgroups') def members(self, req, form): """member(): interface for managing members of a group @param grpID: : group ID @param add_member: button add_member pressed @param remove_member: button remove_member pressed @param reject_member: button reject__member pressed @param delete: button delete group pressed @param member_id: : ID of the existing member selected @param pending_member_id: : ID of the pending member selected @param cancel: button cancel pressed @param info: : 
info about last user action @param ln: language @return: the same page with data updated """ argd = wash_urlargd(form, {'grpID': (int, 0), 'cancel': (str, ""), 'add_member': (str, ""), 'remove_member': (str, ""), 'reject_member': (str, ""), 'member_id': (int, 0), 'pending_member_id': (int, 0) }) uid = webuser.getUid(req) # load the right message language _ = gettext_set_language(argd['ln']) if uid == -1 or webuser.isGuestUser(uid) or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return webuser.page_not_authorized(req, "../yourgroups/display", navmenuid='yourgroups') user_info = webuser.collect_user_info(req) if not user_info['precached_usegroups']: return webuser.page_not_authorized(req, "../", \ text = _("You are not authorized to use groups.")) if argd['cancel']: url = CFG_SITE_URL + '/yourgroups/display?ln=%s' url %= argd['ln'] redirect_to_url(req, url) if argd['remove_member']: (body, errors, warnings) = webgroup.perform_request_remove_member(uid=uid, grpID=argd['grpID'], member_id=argd['member_id'], ln=argd['ln']) elif argd['reject_member']: (body, errors, warnings) = webgroup.perform_request_reject_member(uid=uid, grpID=argd['grpID'], user_id=argd['pending_member_id'], ln=argd['ln']) elif argd['add_member']: (body, errors, warnings) = webgroup.perform_request_add_member(uid=uid, grpID=argd['grpID'], user_id=argd['pending_member_id'], ln=argd['ln']) else: (body, errors, warnings)= webgroup.perform_request_manage_member(uid=uid, grpID=argd['grpID'], ln=argd['ln']) title = _("Edit group members") return page(title = title, body = body, navtrail = webgroup.get_navtrail(argd['ln'], title), uid = uid, req = req, language = argd['ln'], lastupdated = __lastupdated__, errors = errors, warnings = warnings, navmenuid = 'yourgroups') def wash_login_method(login_method): """ Wash the login_method parameter that came from the web input form. @param login_method: Wanted login_method value as it came from the web input form. 
    @type login_method: string
    @return: Washed version of login_method. If the login_method value
        is valid, then return it. If it is not valid, then return
        `Local' (the default login method).
    @rtype: string
    @warning: Beware, 'Local' is hardcoded here!
    """
    if login_method in CFG_EXTERNAL_AUTHENTICATION.keys():
        return login_method
    else:
        return 'Local'
diff --git a/modules/websession/lib/webuser.py b/modules/websession/lib/webuser.py
index 505fb3c2e..342dc3b1c 100644
--- a/modules/websession/lib/webuser.py
+++ b/modules/websession/lib/webuser.py
@@ -1,1182 +1,1179 @@
# -*- coding: utf-8 -*-
##
## This file is part of CDS Invenio.
## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN.
##
## CDS Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## CDS Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDS Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.

"""
This file implements all methods necessary for working with users and
sessions in CDS Invenio. Contains methods for logging/registration
when a user log/register into the system, checking if it is a guest
user or not.

At the same time this presents all the stuff it could need with
sessions managements, working with websession.

It also contains Apache-related user authentication stuff.
""" __revision__ = "$Id$" -try: - from mod_python import apache -except ImportError: - pass +from invenio import webinterface_handler_wsgi_utils as apache import cgi import urllib from socket import gethostbyname, gaierror import os import crypt import socket import smtplib import re import random import datetime import base64 from invenio.config import \ CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS, \ CFG_ACCESS_CONTROL_LEVEL_GUESTS, \ CFG_ACCESS_CONTROL_LEVEL_SITE, \ CFG_ACCESS_CONTROL_LIMIT_REGISTRATION_TO_DOMAIN, \ CFG_ACCESS_CONTROL_NOTIFY_ADMIN_ABOUT_NEW_ACCOUNTS, \ CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_NEW_ACCOUNT, \ CFG_APACHE_GROUP_FILE, \ CFG_APACHE_PASSWORD_FILE, \ CFG_SITE_ADMIN_EMAIL, \ CFG_SITE_LANG, \ CFG_SITE_NAME, \ CFG_SITE_NAME_INTL, \ CFG_SITE_SUPPORT_EMAIL, \ CFG_SITE_SECURE_URL, \ CFG_TMPDIR, \ CFG_SITE_URL, \ CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS, \ CFG_CERN_SITE, \ CFG_WEBSEARCH_PERMITTED_RESTRICTED_COLLECTIONS_LEVEL try: from invenio.session import get_session except ImportError: pass from invenio.dbquery import run_sql, OperationalError, \ serialize_via_marshal, deserialize_via_marshal from invenio.access_control_admin import acc_get_role_id, acc_get_action_roles, acc_get_action_id, acc_is_user_in_role, acc_find_possible_activities from invenio.access_control_mailcookie import mail_cookie_create_mail_activation from invenio.access_control_firerole import acc_firerole_check_user, load_role_definition from invenio.access_control_config import SUPERADMINROLE, CFG_EXTERNAL_AUTH_USING_SSO from invenio.messages import gettext_set_language, wash_languages, wash_language from invenio.mailutils import send_email from invenio.errorlib import register_exception from invenio.webgroup_dblayer import get_groups from invenio.external_authentication import InvenioWebAccessExternalAuthError from invenio.access_control_config import CFG_EXTERNAL_AUTHENTICATION, \ CFG_WEBACCESS_MSGS, CFG_WEBACCESS_WARNING_MSGS import invenio.template tmpl = 
invenio.template.load('websession') re_invalid_nickname = re.compile(""".*[,'@]+.*""") # pylint: disable-msg=C0301 def createGuestUser(): """Create a guest user , insert into user null values in all fields createGuestUser() -> GuestUserID """ if CFG_ACCESS_CONTROL_LEVEL_GUESTS == 0: try: return run_sql("insert into user (email, note) values ('', '1')") except OperationalError: return None else: try: return run_sql("insert into user (email, note) values ('', '0')") except OperationalError: return None def page_not_authorized(req, referer='', uid='', text='', navtrail='', ln=CFG_SITE_LANG, navmenuid=""): """Show error message when user is not authorized to do something. @param referer: in case the displayed message propose a login link, this is the url to return to after logging in. If not specified it is guessed from req. @param uid: the uid of the user. If not specified it is guessed from req. @param text: the message to be displayed. If not specified it will be guessed from the context. 
""" from invenio.webpage import page _ = gettext_set_language(ln) if not referer: referer = req.unparsed_uri if not CFG_ACCESS_CONTROL_LEVEL_SITE: title = CFG_WEBACCESS_MSGS[5] if not uid: uid = getUid(req) try: res = run_sql("SELECT email FROM user WHERE id=%s AND note=1" % uid) if res and res[0][0]: if text: body = text else: body = "%s %s" % (CFG_WEBACCESS_WARNING_MSGS[9] % cgi.escape(res[0][0]), ("%s %s" % (CFG_WEBACCESS_MSGS[0] % urllib.quote(referer), CFG_WEBACCESS_MSGS[1]))) else: if text: body = text else: if CFG_ACCESS_CONTROL_LEVEL_GUESTS == 1: body = CFG_WEBACCESS_MSGS[3] else: body = CFG_WEBACCESS_WARNING_MSGS[4] + CFG_WEBACCESS_MSGS[2] except OperationalError, e: body = _("Database problem") + ': ' + str(e) elif CFG_ACCESS_CONTROL_LEVEL_SITE == 1: title = CFG_WEBACCESS_MSGS[8] body = "%s %s" % (CFG_WEBACCESS_MSGS[7], CFG_WEBACCESS_MSGS[2]) elif CFG_ACCESS_CONTROL_LEVEL_SITE == 2: title = CFG_WEBACCESS_MSGS[6] body = "%s %s" % (CFG_WEBACCESS_MSGS[4], CFG_WEBACCESS_MSGS[2]) return page(title=title, language=ln, uid=getUid(req), body=body, navtrail=navtrail, req=req, navmenuid=navmenuid) def getApacheUser(req): """Return the ApacheUser taking it from the cookie of the request.""" session = get_session(req) return session.get('apache_user') def getUid(req): """Return user ID taking it from the cookie of the request. Includes control mechanism for the guest users, inserting in the database table when need be, raising the cookie back to the client. User ID is set to 0 when client refuses cookie or we are in the read-only site operation mode. User ID is set to -1 when we are in the permission denied site operation mode. 
getUid(req) -> userId """ if hasattr(req, '_user_info'): return req._user_info['uid'] if CFG_ACCESS_CONTROL_LEVEL_SITE == 1: return 0 if CFG_ACCESS_CONTROL_LEVEL_SITE == 2: return -1 guest = 0 session = get_session(req) uid = session.get('uid', -1) if uid == -1: # first time, so create a guest user if CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: uid = session['uid'] = createGuestUser() guest = 1 else: if CFG_ACCESS_CONTROL_LEVEL_GUESTS == 0: session['uid'] = 0 return 0 else: return -1 else: if not hasattr(req, '_user_info') and 'user_info' in session: req._user_info = session['user_info'] req._user_info = collect_user_info(req, refresh=True) if guest == 0: guest = isGuestUser(uid) if guest: if CFG_ACCESS_CONTROL_LEVEL_GUESTS == 0: return uid elif CFG_ACCESS_CONTROL_LEVEL_GUESTS >= 1: return -1 else: res = run_sql("SELECT note FROM user WHERE id=%s", (uid, )) if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS == 0: return uid elif CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 1 and res and res[0][0] in [1, "1"]: return uid else: return -1 def setApacheUser(req, apache_user): """It sets the apache_user into the session, and raise the cookie to the client. """ if hasattr(req, '_user_info'): del req._user_info session = get_session(req) session['apache_user'] = apache_user user_info = collect_user_info(req, login_time=True) session['user_info'] = user_info req._user_info = user_info session.save() return apache_user def setUid(req, uid, remember_me=False): """It sets the userId into the session, and raise the cookie to the client. """ if hasattr(req, '_user_info'): del req._user_info session = get_session(req) session['uid'] = uid if remember_me: session.set_timeout(86400) session.set_remember_me() if uid > 0: user_info = collect_user_info(req, login_time=True) session['user_info'] = user_info req._user_info = user_info else: del session['user_info'] session.save() return uid def session_param_del(req, key): """ Remove a given key from the session. 
""" session = get_session(req) del session[key] session.save() def session_param_set(req, key, value): """ Associate a VALUE to the session param KEY for the current session. """ session = get_session(req) session[key] = value session.save() def session_param_get(req, key): """ Return session parameter value associated with session parameter KEY for the current session. If the key doesn't exists raise KeyError. """ session = get_session(req) return session[key] def session_param_list(req): """ List all available session parameters. """ session = get_session(req) return session.keys() def get_last_login(uid): """Return the last_login datetime for uid if any, otherwise return the Epoch.""" res = run_sql('SELECT last_login FROM user WHERE id=%s', (uid, ), 1) if res and res[0][0]: return res[0][0] else: return datetime.datetime(1970, 1, 1) def get_user_info(uid, ln=CFG_SITE_LANG): """Get infos for a given user. @param uid: user id (int) @return: tuple: (uid, nickname, display_name) """ _ = gettext_set_language(ln) query = """SELECT id, nickname FROM user WHERE id=%s""" res = run_sql(query, (uid, )) if res: if res[0]: user = list(res[0]) if user[1]: user.append(user[1]) else: user[1] = str(user[0]) user.append(_("user") + ' #' + str(user[0])) return tuple(user) return (uid, '', _("N/A")) def get_uid_from_email(email): """Return the uid corresponding to an email. 
Return -1 when the email does not exists.""" try: res = run_sql("SELECT id FROM user WHERE email=%s", (email, )) if res: return res[0][0] else: return -1 except OperationalError: register_exception() return -1 def isGuestUser(uid): """It Checks if the userId corresponds to a guestUser or not isGuestUser(uid) -> boolean """ out = 1 try: res = run_sql("SELECT email FROM user WHERE id=%s LIMIT 1", (uid,), 1) if res: if res[0][0]: out = 0 except OperationalError: register_exception() return out def isUserSubmitter(user_info): """Return True if the user is a submitter for something; False otherwise.""" u_email = get_email(user_info['uid']) res = run_sql("SELECT email FROM sbmSUBMISSIONS WHERE email=%s LIMIT 1", (u_email,), 1) return len(res) > 0 def isUserReferee(user_info): """Return True if the user is a referee for something; False otherwise.""" if CFG_CERN_SITE: return True else: for (role_id, role_name, role_description) in acc_get_action_roles(acc_get_action_id('referee')): if acc_is_user_in_role(user_info, role_id): return True return False def isUserAdmin(user_info): """Return True if the user has some admin rights; False otherwise.""" return acc_find_possible_activities(user_info) != {} def isUserSuperAdmin(user_info): """Return True if the user is superadmin; False otherwise.""" if run_sql("""SELECT r.id FROM accROLE r LEFT JOIN user_accROLE ur ON r.id = ur.id_accROLE WHERE r.name = %s AND ur.id_user = %s AND ur.expiration>=NOW() LIMIT 1""", (SUPERADMINROLE, user_info['uid']), 1): return True return acc_firerole_check_user(user_info, load_role_definition(acc_get_role_id(SUPERADMINROLE))) def nickname_valid_p(nickname): """Check whether wanted NICKNAME supplied by the user is valid. At the moment we just check whether it is not empty, does not contain blanks or @, is not equal to `guest', etc. This check relies on re_invalid_nickname regexp (see above) Return 1 if nickname is okay, return 0 if it is not. 
""" if nickname and \ not(nickname.startswith(' ') or nickname.endswith(' ')) and \ nickname.lower() != 'guest': if not re_invalid_nickname.match(nickname): return 1 return 0 def email_valid_p(email): """Check whether wanted EMAIL address supplied by the user is valid. At the moment we just check whether it contains '@' and whether it doesn't contain blanks. We also check the email domain if CFG_ACCESS_CONTROL_LIMIT_REGISTRATION_TO_DOMAIN is set. Return 1 if email is okay, return 0 if it is not. """ if (email.find("@") <= 0) or (email.find(" ") > 0): return 0 elif CFG_ACCESS_CONTROL_LIMIT_REGISTRATION_TO_DOMAIN: if not email.endswith(CFG_ACCESS_CONTROL_LIMIT_REGISTRATION_TO_DOMAIN): return 0 return 1 def confirm_email(email): """Confirm the email. It returns None when there are problems, otherwise it return the uid involved.""" if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS == 0: activated = 1 elif CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS == 1: activated = 0 elif CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 2: return -1 run_sql('UPDATE user SET note=%s where email=%s', (activated, email)) res = run_sql('SELECT id FROM user where email=%s', (email, )) if res: if CFG_ACCESS_CONTROL_NOTIFY_ADMIN_ABOUT_NEW_ACCOUNTS: send_new_admin_account_warning(email, CFG_SITE_ADMIN_EMAIL) return res[0][0] else: return None def registerUser(req, email, passw, nickname, register_without_nickname=False, login_method=None, ln=CFG_SITE_LANG): """Register user with the desired values of NICKNAME, EMAIL and PASSW. If REGISTER_WITHOUT_NICKNAME is set to True, then ignore desired NICKNAME and do not set any. This is suitable for external authentications so that people can login without having to register an internal account first. 
Return 0 if the registration is successful, 1 if email is not valid, 2 if nickname is not valid, 3 if email is already in the database, 4 if nickname is already in the database, 5 when users cannot register themselves because of the site policy, 6 when the site is having problems contacting the user. If login_method is None or is equal to the key corresponding to local authentication, then CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS is taken into account when deciding the registration behaviour. """ # is email valid? email = email.lower() if not email_valid_p(email): return 1 _ = gettext_set_language(ln) # is email already taken? res = run_sql("SELECT email FROM user WHERE email=%s", (email,)) if len(res) > 0: return 3 if register_without_nickname: # ignore desired nick and use default empty string one: nickname = "" else: # is nickname valid? if not nickname_valid_p(nickname): return 2 # is nickname already taken? res = run_sql("SELECT nickname FROM user WHERE nickname=%s", (nickname,)) if len(res) > 0: return 4 activated = 1 # By default activated if not login_method or not CFG_EXTERNAL_AUTHENTICATION[login_method][0]: # local login if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 2: return 5 elif CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_NEW_ACCOUNT: activated = 2 # Email confirmation required elif CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 1: activated = 0 # Administrator confirmation required if CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_NEW_ACCOUNT: address_activation_key = mail_cookie_create_mail_activation(email) - ip_address = req.connection.remote_host or req.connection.remote_ip + ip_address = req.remote_host or req.remote_ip try: if not send_email(CFG_SITE_SUPPORT_EMAIL, email, _("Account registration at %s") % CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME), tmpl.tmpl_account_address_activation_email_body(email, address_activation_key, ip_address, ln)): return 1 except (smtplib.SMTPException, socket.error): return 6 # okay, go on and register the user: user_preference =
get_default_user_preferences() uid = run_sql("INSERT INTO user (nickname, email, password, note, settings, last_login) " "VALUES (%s,%s,AES_ENCRYPT(email,%s),%s,%s, NOW())", (nickname, email, passw, activated, serialize_via_marshal(user_preference))) if activated == 1: # Ok we consider the user as logged in :-) setUid(req, uid) return 0 def updateDataUser(uid, email, nickname): """ Update user data. Used when a user changes his/her email, password, or nickname. """ email = email.lower() if email == 'guest': return 0 if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS < 2: run_sql("update user set email=%s where id=%s", (email, uid)) if nickname and nickname != '': run_sql("update user set nickname=%s where id=%s", (nickname, uid)) return 1 def updatePasswordUser(uid, password): """Update the password of a user.""" if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS < 3: run_sql("update user set password=AES_ENCRYPT(email,%s) where id=%s", (password, uid)) return 1 def loginUser(req, p_un, p_pw, login_method): """A first, simple version of user authentication.
It returns the id of the user, for checking afterwards if the login is correct. """ # p_un passed may be an email or a nickname: p_email = get_email_from_username(p_un) # go on with the old stuff based on p_email: if not CFG_EXTERNAL_AUTHENTICATION.has_key(login_method): return ([], p_email, p_pw, 12) if CFG_EXTERNAL_AUTHENTICATION[login_method][0]: # External Authentication try: p_email = CFG_EXTERNAL_AUTHENTICATION[login_method][0].auth_user(p_email, p_pw, req) or CFG_EXTERNAL_AUTHENTICATION[login_method][0].auth_user(p_un, p_pw, req) ## We try to log in with either the email or the nickname if p_email: p_email = p_email.lower() else: return([], p_email, p_pw, 15) except InvenioWebAccessExternalAuthError: register_exception(alert_admin=True) raise if p_email: # Authenticated externally query_result = run_sql("SELECT id from user where email=%s", (p_email,)) if not query_result: # First time user p_pw_local = int(random.random() * 1000000) p_nickname = '' if CFG_EXTERNAL_AUTHENTICATION[login_method][0].enforce_external_nicknames: try: # Let's discover the external nickname! p_nickname = CFG_EXTERNAL_AUTHENTICATION[login_method][0].fetch_user_nickname(p_email, p_pw, req) except (AttributeError, NotImplementedError): pass except: register_exception(alert_admin=True) raise res = registerUser(req, p_email, p_pw_local, p_nickname, register_without_nickname=p_nickname == '', login_method=login_method) if res == 4 or res == 2: # The nickname was already taken res = registerUser(req, p_email, p_pw_local, '', register_without_nickname=True, login_method=login_method) elif res == 0: # Everything was ok, with or without nickname.
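loginUser() reports failures through the numeric status codes used throughout this function (and partly documented in registerUser()'s docstring); a sketch of how a caller might translate them, where the code numbers come from the source but the wording is illustrative:

```python
# Illustrative mapping of loginUser()'s numeric status codes to messages;
# the numbers come from the function above, the wording is hypothetical.
LOGIN_STATUS = {
    0:  "login successful",
    10: "user not found",
    11: "user must use the preferred login method",
    12: "unknown login method",
    13: "registration of the external user failed",
    14: "wrong username/password",
    15: "external authentication rejected the credentials",
    16: "error synchronizing external groups or settings",
    17: "email address still to be confirmed by the user",
    18: "account still to be confirmed by the administrator",
    19: "error contacting the user via email",
}

def describe_login_status(code):
    """Return a human-readable message for a loginUser() status code."""
    return LOGIN_STATUS.get(code, "unknown status %d" % code)
```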
query_result = run_sql("SELECT id from user where email=%s", (p_email,)) elif res == 6: # error in contacting the user via email return([], p_email, p_pw_local, 19) else: return([], p_email, p_pw_local, 13) try: groups = CFG_EXTERNAL_AUTHENTICATION[login_method][0].fetch_user_groups_membership(p_email, p_pw, req) # groups is a dictionary {group_name : group_description,} new_groups = {} for key, value in groups.items(): new_groups[key + " [" + str(login_method) + "]"] = value groups = new_groups except (AttributeError, NotImplementedError): pass except: register_exception(alert_admin=True) return([], p_email, p_pw, 16) else: # Groups synchronization if groups != 0: userid = query_result[0][0] from invenio.webgroup import synchronize_external_groups synchronize_external_groups(userid, groups, login_method) user_prefs = get_user_preferences(query_result[0][0]) if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 4: # Let's prevent the user to switch login_method if user_prefs.has_key("login_method") and \ user_prefs["login_method"] != login_method: return([], p_email, p_pw, 11) user_prefs["login_method"] = login_method # Cleaning external settings for key in user_prefs.keys(): if key.startswith('EXTERNAL_'): del user_prefs[key] try: # Importing external settings new_prefs = CFG_EXTERNAL_AUTHENTICATION[login_method][0].fetch_user_preferences(p_email, p_pw, req) for key, value in new_prefs.items(): user_prefs['EXTERNAL_' + key] = value except (AttributeError, NotImplementedError): pass except InvenioWebAccessExternalAuthError: register_exception(alert_admin=True) return([], p_email, p_pw, 16) # Storing settings set_user_preferences(query_result[0][0], user_prefs) else: return ([], p_un, p_pw, 10) else: # Internal Authenthication if not p_pw: p_pw = '' query_result = run_sql("SELECT id,email,note from user where email=%s and password=AES_ENCRYPT(email,%s)", (p_email, p_pw,)) if query_result: #FIXME drop external groups and settings note = query_result[0][2] if note == '1': # Good 
account preferred_login_method = get_user_preferences(query_result[0][0])['login_method'] p_email = query_result[0][1].lower() if login_method != preferred_login_method: if CFG_EXTERNAL_AUTHENTICATION.has_key(preferred_login_method): return ([], p_email, p_pw, 11) elif note == '2': # Email address needs to be confirmed by the user return ([], p_email, p_pw, 17) elif note == '0': # Account needs to be confirmed by the administrator return ([], p_email, p_pw, 18) else: return ([], p_email, p_pw, 14) # Login successful! Updating the last access time run_sql("UPDATE user SET last_login=NOW() WHERE email=%s", (p_email, )) return (query_result, p_email, p_pw, 0) def drop_external_settings(userId): """Drop the external (EXTERNAL_) settings of userid.""" prefs = get_user_preferences(userId) for key in prefs.keys(): if key.startswith('EXTERNAL_'): del prefs[key] set_user_preferences(userId, prefs) def logoutUser(req): """Log the user out of the system, creating a guest user. """ session = get_session(req) if CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS: uid = createGuestUser() session['uid'] = uid session.save() else: uid = 0 session.invalidate() if hasattr(req, '_user_info'): delattr(req, '_user_info') return uid def username_exists_p(username): """Check if USERNAME exists in the system. Username may be either nickname or email. Return 1 if it does exist, 0 if it does not. """ if username == "": # return not exists if asked for guest users return 0 res = run_sql("SELECT email FROM user WHERE email=%s", (username,)) + \ run_sql("SELECT email FROM user WHERE nickname=%s", (username,)) if len(res) > 0: return 1 return 0 def emailUnique(p_email): """Check if the email address only exists once. If yes, return userid, if not, -1 """ query_result = run_sql("select id, email from user where email=%s", (p_email,)) if len(query_result) == 1: return query_result[0][0] elif len(query_result) == 0: return 0 return -1 def nicknameUnique(p_nickname): """Check if the nickname only exists once.
If yes, return userid, if not, -1 """ query_result = run_sql("select id, nickname from user where nickname=%s", (p_nickname,)) if len(query_result) == 1: return query_result[0][0] elif len(query_result) == 0: return 0 return -1 def update_Uid(req, p_email, remember_me=False): """It updates the userId of the session. It is used when a guest user is logged in successfully in the system with a given email and password. As a side effect it will discover all the restricted collections the user has rights to. """ query_ID = int(run_sql("select id from user where email=%s", (p_email,))[0][0]) setUid(req, query_ID, remember_me) return query_ID def send_new_admin_account_warning(new_account_email, send_to, ln=CFG_SITE_LANG): """Send an email to the address given by send_to about the new account new_account_email.""" _ = gettext_set_language(ln) sub = _("New account on") + " '%s'" % CFG_SITE_NAME if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS == 1: sub += " - " + _("PLEASE ACTIVATE") body = _("A new account has been created on") + " '%s'" % CFG_SITE_NAME if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS == 1: body += _(" and is awaiting activation") body += ":\n\n" body += _(" Username/Email") + ": %s\n\n" % new_account_email body += _("You can approve or reject this account request at") + ": %s/admin/webaccess/webaccessadmin.py/manageaccounts\n" % CFG_SITE_URL return send_email(CFG_SITE_SUPPORT_EMAIL, send_to, subject=sub, content=body) def get_email(uid): """Return email address of the user uid. Return string 'guest' in case the user is not found.""" out = "guest" res = run_sql("SELECT email FROM user WHERE id=%s", (uid,), 1) if res and res[0][0]: out = res[0][0].lower() return out def get_email_from_username(username): """Return email address of the user corresponding to USERNAME. The username may be either nickname or email. Return USERNAME untouched if not found in the database or if found several matching entries.
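get_email_from_username() described above performs two lookups, by email and then by nickname, and only trusts a single unambiguous hit; a sketch of that logic using an in-memory SQLite table in place of the real MySQL `user` table (schema reduced to the columns the queries touch):

```python
import sqlite3

# In-memory stand-in for the MySQL `user` table, reduced to the columns
# used by get_email_from_username().
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, nickname TEXT, email TEXT)")
conn.execute("INSERT INTO user (nickname, email) VALUES ('jdoe', 'John.Doe@example.org')")

def get_email_from_username(username):
    """Return the email for USERNAME (nickname or email);
    return USERNAME untouched if unknown or ambiguous."""
    if username == '':
        return ''
    # Union of the two lookups, mirroring the concatenated run_sql() results:
    res = conn.execute("SELECT email FROM user WHERE email=?", (username,)).fetchall() + \
          conn.execute("SELECT email FROM user WHERE nickname=?", (username,)).fetchall()
    if res and len(res) == 1:
        return res[0][0].lower()
    return username
```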
""" if username == '': return '' out = username res = run_sql("SELECT email FROM user WHERE email=%s", (username,), 1) + \ run_sql("SELECT email FROM user WHERE nickname=%s", (username,), 1) if res and len(res) == 1: out = res[0][0].lower() return out #def get_password(uid): #"""Return password of the user uid. Return None in case #the user is not found.""" #out = None #res = run_sql("SELECT password FROM user WHERE id=%s", (uid,), 1) #if res and res[0][0] != None: #out = res[0][0] #return out def get_nickname(uid): """Return nickname of the user uid. Return None in case the user is not found.""" out = None res = run_sql("SELECT nickname FROM user WHERE id=%s", (uid,), 1) if res and res[0][0]: out = res[0][0] return out def get_nickname_or_email(uid): """Return nickname (preferred) or the email address of the user uid. Return string 'guest' in case the user is not found.""" out = "guest" res = run_sql("SELECT nickname, email FROM user WHERE id=%s", (uid,), 1) if res and res[0]: if res[0][0]: out = res[0][0] elif res[0][1]: out = res[0][1].lower() return out def create_userinfobox_body(req, uid, language="en"): """Create user info box body for user UID in language LANGUAGE.""" if req: if req.subprocess_env.has_key('HTTPS') \ and req.subprocess_env['HTTPS'] == 'on': url_referer = CFG_SITE_SECURE_URL + req.unparsed_uri else: url_referer = CFG_SITE_URL + req.unparsed_uri if '/youraccount/logout' in url_referer: url_referer = '' else: url_referer = CFG_SITE_URL user_info = collect_user_info(req) try: return tmpl.tmpl_create_userinfobox(ln=language, url_referer=url_referer, guest = isGuestUser(uid), username = get_nickname_or_email(uid), submitter = user_info['precached_viewsubmissions'], referee = user_info['precached_useapprove'], admin = user_info['precached_useadmin'], usebaskets = user_info['precached_usebaskets'], usemessages = user_info['precached_usemessages'], usealerts = user_info['precached_usealerts'], usegroups = user_info['precached_usegroups'], useloans = 
user_info['precached_useloans'], usestats = user_info['precached_usestats'] ) except OperationalError: return "" def list_registered_users(): """List all registered users.""" return run_sql("SELECT id,email FROM user where email!=''") def list_users_in_role(role): """List all users of a given role (see table accROLE) @param role: role of user (string) @return: list of uids """ res = run_sql("""SELECT uacc.id_user FROM user_accROLE uacc JOIN accROLE acc ON uacc.id_accROLE=acc.id WHERE acc.name=%s""", (role,)) if res: return map(lambda x: int(x[0]), res) return [] def list_users_in_roles(role_list): """List all users of given roles (see table accROLE) @param role_list: list of roles [string] @return: list of uids """ if not(type(role_list) is list or type(role_list) is tuple): role_list = [role_list] query = """SELECT DISTINCT(uacc.id_user) FROM user_accROLE uacc JOIN accROLE acc ON uacc.id_accROLE=acc.id """ query_addons = "" query_params = () if len(role_list) > 0: query_params = role_list query_addons = " WHERE " for role in role_list[:-1]: query_addons += "acc.name=%s OR " query_addons += "acc.name=%s" res = run_sql(query + query_addons, query_params) if res: return map(lambda x: int(x[0]), res) return [] def get_uid_based_on_pref(prefname, prefvalue): """Get the UID of the user whose preference PREFNAME has value PREFVALUE.""" prefs = run_sql("SELECT id, settings FROM user WHERE settings is not NULL") the_uid = None for pref in prefs: try: settings = deserialize_via_marshal(pref[1]) if (settings.has_key(prefname)) and (settings[prefname] == prefvalue): the_uid = pref[0] except: pass return the_uid def get_user_preferences(uid): pref = run_sql("SELECT id, settings FROM user WHERE id=%s", (uid,)) if pref: try: return deserialize_via_marshal(pref[0][1]) except: pass return get_default_user_preferences() # empty dict means no preferences def set_user_preferences(uid, pref): assert(type(pref) == type({})) run_sql("UPDATE user SET settings=%s
WHERE id=%s", (serialize_via_marshal(pref), uid)) def get_default_user_preferences(): user_preference = { 'login_method': ''} for system in CFG_EXTERNAL_AUTHENTICATION.keys(): if CFG_EXTERNAL_AUTHENTICATION[system][1]: user_preference['login_method'] = system break return user_preference def get_preferred_user_language(req): def _get_language_from_req_header(accept_language_header): """Extract langs info from req.headers_in['Accept-Language'] which should be set to something similar to: 'fr,en-us;q=0.7,en;q=0.3' """ tmp_langs = {} for lang in accept_language_header.split(','): lang = lang.split(';q=') if len(lang) == 2: lang[1] = lang[1].replace('"', '') # Hack for Yeti robot try: tmp_langs[float(lang[1])] = lang[0] except ValueError: pass else: tmp_langs[1.0] = lang[0] ret = [] priorities = tmp_langs.keys() priorities.sort() priorities.reverse() for priority in priorities: ret.append(tmp_langs[priority]) return ret uid = getUid(req) guest = isGuestUser(uid) new_lang = None preferred_lang = None if not guest: user_preferences = get_user_preferences(uid) preferred_lang = new_lang = user_preferences.get('language', None) if not new_lang: try: new_lang = wash_languages(cgi.parse_qs(req.args)['ln']) except (TypeError, AttributeError, KeyError): pass if not new_lang: try: new_lang = wash_languages(_get_language_from_req_header(req.headers_in['Accept-Language'])) except (TypeError, AttributeError, KeyError): pass new_lang = wash_language(new_lang) if new_lang != preferred_lang and not guest: user_preferences['language'] = new_lang set_user_preferences(uid, user_preferences) return new_lang def collect_user_info(req, login_time=False, refresh=False): """Given the mod_python request object rec or a uid it returns a dictionary containing at least the keys uid, nickname, email, groups, plus any external keys in the user preferences (collected at login time and built by the different external authentication plugins) and if the mod_python request object is provided, also the 
remote_ip, remote_host, referer, agent fields. If the user is authenticated with Apache, it should also provide apache_user and apache_group. NOTE: if req is a mod_python request object, the user_info dictionary is saved into req._user_info (for caching purposes) setApacheUser & setUid will properly reset it. """ from invenio.search_engine import get_permitted_restricted_collections user_info = { 'remote_ip' : '', 'remote_host' : '', 'referer' : '', 'uri' : '', 'agent' : '', 'apache_user' : '', 'apache_group' : [], 'uid' : -1, 'nickname' : '', 'email' : '', 'group' : [], 'guest' : '1', 'session' : None, 'precached_permitted_restricted_collections' : [], 'precached_usebaskets' : False, 'precached_useloans' : False, 'precached_usegroups' : False, 'precached_usealerts' : False, 'precached_usemessages' : False, 'precached_viewsubmissions' : False, 'precached_useapprove' : False, 'precached_useadmin' : False, 'precached_usestats' : False, } try: is_req = False if req is None: uid = -1 elif type(req) in (type(1), type(1L)): ## req is in fact a user identification uid = req elif type(req) is dict: ## req is by mistake already a user_info try: assert(req.has_key('uid')) assert(req.has_key('email')) assert(req.has_key('nickname')) except AssertionError: ## mmh... misuse of collect_user_info. Better warn the admin! register_exception(alert_admin=True) user_info.update(req) return user_info else: is_req = True uid = getUid(req) if hasattr(req, '_user_info') and not login_time: user_info = req._user_info if not refresh: return req._user_info req._user_info = user_info try: - user_info['remote_ip'] = gethostbyname(req.connection.remote_ip) + user_info['remote_ip'] = req.remote_ip except gaierror: #FIXME: we should support IPV6 too.
(hint for FireRole) pass user_info['session'] = get_session(req).sid() - user_info['remote_host'] = req.connection.remote_host or '' + user_info['remote_host'] = req.remote_host or '' user_info['referer'] = req.headers_in.get('Referer', '') user_info['uri'] = req.unparsed_uri or () user_info['agent'] = req.headers_in.get('User-Agent', 'N/A') try: user_info['apache_user'] = getApacheUser(req) if user_info['apache_user']: user_info['apache_group'] = auth_apache_user_in_groups(user_info['apache_user']) except AttributeError: pass user_info['uid'] = uid user_info['nickname'] = get_nickname(uid) or '' user_info['email'] = get_email(uid) or '' user_info['group'] = [] user_info['guest'] = str(isGuestUser(uid)) if user_info['guest'] == '0': user_info['group'] = [group[1] for group in get_groups(uid)] prefs = get_user_preferences(uid) login_method = prefs['login_method'] login_object = CFG_EXTERNAL_AUTHENTICATION[login_method][0] if login_object and ((datetime.datetime.now() - get_last_login(uid)).seconds > 3600): ## The user uses an external authentication method and it has been ## a while since her last login if not CFG_EXTERNAL_AUTH_USING_SSO or ( is_req and req.is_https()): ## If we're using SSO we must be sure to be in HTTPS ## otherwise we can't really read anything, hence ## it is better to skip the synchronization try: groups = login_object.fetch_user_groups_membership(user_info['email'], req=req) # groups is a dictionary {group_name : group_description,} new_groups = {} for key, value in groups.items(): new_groups[key + " [" + str(login_method) + "]"] = value groups = new_groups except (AttributeError, NotImplementedError, TypeError, InvenioWebAccessExternalAuthError): pass else: # Groups synchronization from invenio.webgroup import synchronize_external_groups synchronize_external_groups(uid, groups, login_method) user_info['group'] = [group[1] for group in get_groups(uid)] try: # Importing external settings new_prefs =
login_object.fetch_user_preferences(user_info['email'], req=req) for key, value in new_prefs.items(): prefs['EXTERNAL_' + key] = value except (AttributeError, NotImplementedError, TypeError, InvenioWebAccessExternalAuthError): pass else: set_user_preferences(uid, prefs) prefs = get_user_preferences(uid) run_sql('UPDATE user SET last_login=NOW() WHERE id=%s', (uid, )) if prefs: for key, value in prefs.iteritems(): user_info[key.lower()] = value if login_time: ## Heavy computational information from invenio.access_control_engine import acc_authorize_action if CFG_WEBSEARCH_PERMITTED_RESTRICTED_COLLECTIONS_LEVEL > 0: user_info['precached_permitted_restricted_collections'] = get_permitted_restricted_collections(user_info) user_info['precached_usebaskets'] = acc_authorize_action(user_info, 'usebaskets')[0] == 0 user_info['precached_useloans'] = acc_authorize_action(user_info, 'useloans')[0] == 0 user_info['precached_usegroups'] = acc_authorize_action(user_info, 'usegroups')[0] == 0 user_info['precached_usealerts'] = acc_authorize_action(user_info, 'usealerts')[0] == 0 user_info['precached_usemessages'] = acc_authorize_action(user_info, 'usemessages')[0] == 0 user_info['precached_usestats'] = acc_authorize_action(user_info, 'runwebstatadmin')[0] == 0 user_info['precached_viewsubmissions'] = isUserSubmitter(user_info) user_info['precached_useapprove'] = isUserReferee(user_info) user_info['precached_useadmin'] = isUserAdmin(user_info) except Exception, e: register_exception() return user_info ## --- follow some functions for Apache user/group authentication def _load_apache_password_file(apache_password_file=CFG_APACHE_PASSWORD_FILE): ret = {} for row in open(os.path.join(CFG_TMPDIR, apache_password_file)): row = row.split(':') if len(row) == 2: ret[row[0].strip()] = row[1].strip() return ret _apache_passwords = _load_apache_password_file() def auth_apache_user_p(user, password): """Check whether user-supplied credentials correspond to valid Apache password data file.""" 
if user in _apache_passwords: password_apache = _apache_passwords[user] salt = password_apache[:2] return crypt.crypt(password, salt) == password_apache else: return False def _load_apache_group_file(apache_group_file=CFG_APACHE_GROUP_FILE): ret = {} for row in open(os.path.join(CFG_TMPDIR, apache_group_file)): row = row.split(':') if len(row) == 2: group = row[0].strip() users = row[1].strip().split(' ') for user in users: user = user.strip() if user not in ret: ret[user] = [] ret[user].append(group) return ret _apache_groups = _load_apache_group_file() def auth_apache_user_in_groups(user): """Return the list of Apache groups to which the Apache user belongs.""" return _apache_groups.get(user, []) def http_get_credentials(req): if req.headers_in.has_key("Authorization"): try: s = req.headers_in["Authorization"][6:] s = base64.decodestring(s) user, passwd = s.split(":", 1) except (ValueError, base64.binascii.Error, base64.binascii.Incomplete): raise apache.SERVER_RETURN, apache.HTTP_BAD_REQUEST return (user, passwd) return (None, None) def http_check_credentials(req, role): """Retrieve the Apache password and check the user credentials with the check_auth function. If this returns True, check whether the user is enabled for the given role. If so, return; otherwise pop up a new Apache login box.
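The Authorization-header decoding performed by http_get_credentials() (and repeated in http_check_credentials() below) follows the HTTP Basic scheme: "Basic " followed by base64("user:password"). A self-contained sketch of that decoding, taking a plain header string instead of the mod_python headers_in table:

```python
import base64

def parse_basic_auth(header_value):
    """Decode a 'Basic base64(user:password)' header value.
    Return (user, password), or (None, None) if the header is malformed."""
    if not header_value.startswith('Basic '):
        return (None, None)
    try:
        # Skip the 6-character "Basic " prefix, as the original code does.
        decoded = base64.b64decode(header_value[6:]).decode('utf-8')
        # Only the first ':' separates user from password (passwords may
        # themselves contain ':').
        user, passwd = decoded.split(':', 1)
    except (ValueError, base64.binascii.Error):
        return (None, None)
    return (user, passwd)
```

The original raises an Apache HTTP_BAD_REQUEST on malformed input instead of returning (None, None); the parsing itself is the same.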
""" authorized = False while True: if req.headers_in.has_key("Authorization"): try: s = req.headers_in["Authorization"][6:] s = base64.decodestring(s) user, passwd = s.split(":", 1) except (ValueError, base64.binascii.Error, base64.binascii.Incomplete): raise apache.SERVER_RETURN, apache.HTTP_BAD_REQUEST authorized = auth_apache_user_p(user, passwd) if authorized: setApacheUser(req, user) authorized = acc_firerole_check_user(collect_user_info(req), load_role_definition(acc_get_role_id(role))) setApacheUser(req, '') if not authorized: # note that Opera supposedly doesn't like spaces around "=" below s = 'Basic realm="%s"' % role - req.err_headers_out["WWW-Authenticate"] = s + req.headers_out["WWW-Authenticate"] = s raise apache.SERVER_RETURN, apache.HTTP_UNAUTHORIZED else: setApacheUser(req, user) return diff --git a/modules/webstat/lib/webstat_webinterface.py b/modules/webstat/lib/webstat_webinterface.py index fa649fbe9..99a021934 100644 --- a/modules/webstat/lib/webstat_webinterface.py +++ b/modules/webstat/lib/webstat_webinterface.py @@ -1,304 +1,304 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
__revision__ = "$Id$" __lastupdated__ = "$Date$" import os from urllib import unquote -from mod_python import apache +from invenio import webinterface_handler_wsgi_utils as apache from invenio.config import \ CFG_TMPDIR, \ CFG_SITE_URL, \ CFG_SITE_NAME, \ CFG_SITE_LANG from invenio.webinterface_handler import wash_urlargd, WebInterfaceDirectory from invenio.webpage import page from invenio import template from invenio.access_control_engine import acc_authorize_action from invenio.webuser import collect_user_info, page_not_authorized from invenio.urlutils import redirect_to_url, make_canonical_urlargd from invenio.webstat import perform_request_index from invenio.webstat import perform_display_keyevent from invenio.webstat import perform_display_customevent from invenio.webstat import perform_display_customevent_help from invenio.webstat import register_customevent def detect_suitable_graph_format(): """ Return suitable graph format default argument: gnuplot if it is present, otherwise asciiart. 
""" try: import Gnuplot suitable_graph_format = "gnuplot" except ImportError: suitable_graph_format = "asciiart" return suitable_graph_format SUITABLE_GRAPH_FORMAT = detect_suitable_graph_format() class WebInterfaceStatsPages(WebInterfaceDirectory): """Defines the set of stats pages.""" _exports = [ '', 'collection_population', 'search_frequency', 'search_type_distribution', 'download_frequency', 'customevent', 'customevent_help', 'customevent_register', 'export' ] navtrail = """<a class="navtrail" href="%s/stats/%%(ln_link)s">Statistics</a>""" % CFG_SITE_URL def __call__(self, req, form): """Index page.""" argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG)}) ln = argd['ln'] user_info = collect_user_info(req) (auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin') if auth_code: return page_not_authorized(req, navtrail=self.navtrail % {'ln_link':(ln != CFG_SITE_LANG and '?ln='+ln) or ''}, text=auth_msg, navmenuid='index', ln=ln) return page(title="Statistics", body=perform_request_index(ln=ln), description="CDS, Statistics", keywords="CDS, statistics", req=req, lastupdated=__lastupdated__, navmenuid='stats', language=ln) # KEY EVENT SECTION def collection_population(self, req, form): """Collection population statistics page.""" argd = wash_urlargd(form, {'collection': (str, CFG_SITE_NAME), 'timespan': (str, "today"), 'format': (str, SUITABLE_GRAPH_FORMAT), 'ln': (str, CFG_SITE_LANG)}) ln = argd['ln'] user_info = collect_user_info(req) (auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin') if auth_code: return page_not_authorized(req, navtrail=self.navtrail % {'ln_link':(ln != CFG_SITE_LANG and '?ln='+ln) or ''}, text=auth_msg, navmenuid='collection population', ln=ln) return page(title="Collection population", body=perform_display_keyevent('collection population', argd, req, ln=ln), navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \ (CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln='+ln) or ''), 
description="CDS, Statistics, Collection population", keywords="CDS, statistics, collection population", req=req, lastupdated=__lastupdated__, navmenuid='collection population', language=ln) def search_frequency(self, req, form): """Search frequency statistics page.""" argd = wash_urlargd(form, {'timespan': (str, "today"), 'format': (str, SUITABLE_GRAPH_FORMAT), 'ln': (str, CFG_SITE_LANG)}) ln = argd['ln'] user_info = collect_user_info(req) (auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin') if auth_code: return page_not_authorized(req, navtrail=self.navtrail % {'ln_link':(ln != CFG_SITE_LANG and '?ln='+ln) or ''}, text=auth_msg, navmenuid='search frequency', ln=ln) return page(title="Search frequency", body=perform_display_keyevent('search frequency', argd, req, ln=ln), navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \ (CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln='+ln) or ''), description="CDS, Statistics, Search frequency", keywords="CDS, statistics, search frequency", req=req, lastupdated=__lastupdated__, navmenuid='search frequency', language=ln) def search_type_distribution(self, req, form): """Search type distribution statistics page.""" user_info = collect_user_info(req) (auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin') argd = wash_urlargd(form, {'timespan': (str, "today"), 'format': (str, SUITABLE_GRAPH_FORMAT), 'ln': (str, CFG_SITE_LANG)}) ln = argd['ln'] if auth_code: return page_not_authorized(req, navtrail=self.navtrail % {'ln_link':(ln != CFG_SITE_LANG and '?ln='+ln) or ''}, text=auth_msg, navmenuid='search type distribution', ln=ln) return page(title="Search type distribution", body=perform_display_keyevent('search type distribution', argd, req, ln=ln), navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \ (CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln='+ln) or ''), description="CDS, Statistics, Search type distribution", keywords="CDS, statistics, search type 
distribution", req=req, lastupdated=__lastupdated__, navmenuid='search type distribution', language=ln) def download_frequency(self, req, form): """Download frequency statistics page.""" argd = wash_urlargd(form, {'timespan': (str, "today"), 'format': (str, SUITABLE_GRAPH_FORMAT), 'ln': (str, CFG_SITE_LANG)}) ln = argd['ln'] user_info = collect_user_info(req) (auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin') if auth_code: return page_not_authorized(req, navtrail=self.navtrail % {'ln_link':(ln != CFG_SITE_LANG and '?ln='+ln) or ''}, text=auth_msg, navmenuid='download frequency', ln=ln) return page(title="Download frequency", body=perform_display_keyevent('download frequency', argd, req, ln=ln), navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \ (CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln='+ln) or ''), description="CDS, Statistics, Download frequency", keywords="CDS, statistics, download frequency", req=req, lastupdated=__lastupdated__, navmenuid='download frequency', language=ln) # CUSTOM EVENT SECTION def customevent(self, req, form): """Custom event statistics page""" arg_format = {'ids': (list, []), 'timespan': (str, "today"), 'format': (str, SUITABLE_GRAPH_FORMAT), 'ln': (str, CFG_SITE_LANG)} for key in form.keys(): if key[:4] == 'cols': i = key[4:] arg_format['cols'+i]=(list, []) arg_format['col_value'+i]=(list, []) arg_format['bool'+i]=(list, []) argd = wash_urlargd(form, arg_format) ln = argd['ln'] user_info = collect_user_info(req) (auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin') if auth_code: return page_not_authorized(req, navtrail=self.navtrail % {'ln_link':(ln != CFG_SITE_LANG and '?ln='+ln) or ''}, text=auth_msg, navmenuid='custom event', ln=ln) body = perform_display_customevent(argd['ids'], argd, req=req, ln=ln) return page(title="Custom event", body=body, navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \ (CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln='+ln) 
or ''), description="CDS Personalize, Statistics, Custom event", keywords="CDS, statistics, custom event", req=req, lastupdated=__lastupdated__, navmenuid='custom event', language=ln) def customevent_help(self, req, form): """Custom event help page""" argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG)}) ln = argd['ln'] user_info = collect_user_info(req) (auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin') if auth_code: return page_not_authorized(req, navtrail=self.navtrail % {'ln_link':(ln != CFG_SITE_LANG and '?ln='+ln) or ''}, text=auth_msg, navmenuid='custom event help', ln=ln) return page(title="Custom event help", body=perform_display_customevent_help(ln=ln), navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \ (CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln='+ln) or ''), description="CDS Personalize, Statistics, Custom event help", keywords="CDS, statistics, custom event help", req=req, lastupdated=__lastupdated__, navmenuid='custom event help', language=ln) def customevent_register(self, req, form): """Register a custom event and redirect to the defined URL.""" argd = wash_urlargd(form, {'id': (str, ""), 'arg': (str, ""), 'url': (str, ""), 'ln': (str, CFG_SITE_LANG)}) params = argd['arg'].split(',') if "WEBSTAT_IP" in params: index = params.index("WEBSTAT_IP") - params[index] = str(req.get_remote_host(apache.REMOTE_NOLOOKUP)) + params[index] = str(req.remote_ip) register_customevent(argd['id'], params) return redirect_to_url(req, unquote(argd['url']), apache.HTTP_MOVED_PERMANENTLY) # EXPORT SECTION def export(self, req, form): """Exports data""" argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG)}) ln = argd['ln'] user_info = collect_user_info(req) (auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin') if auth_code: return page_not_authorized(req, navtrail=self.navtrail % {'ln_link':(ln != CFG_SITE_LANG and '?ln='+ln) or ''}, text=auth_msg, navmenuid='export', ln=ln) argd = wash_urlargd(form,
{"filename": (str, ""), "mime": (str, "")}) # Check that the particular file exists and that it's OK to export webstat_files = [x for x in os.listdir(CFG_TMPDIR) if x.startswith("webstat")] if argd["filename"] not in webstat_files: return "Bad file." # Set correct header type req.content_type = argd["mime"] req.send_http_header() # Rebuild path, send it to the user, and clean up. filename = CFG_TMPDIR + '/' + argd["filename"] req.sendfile(filename) os.remove(filename) index = __call__ diff --git a/modules/webstyle/lib/Makefile.am b/modules/webstyle/lib/Makefile.am index 07134599d..1bf30c9f6 100644 --- a/modules/webstyle/lib/Makefile.am +++ b/modules/webstyle/lib/Makefile.am @@ -1,33 +1,38 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
pylibdir=$(libdir)/python/invenio +wsgiwebdir=$(localstatedir)/www-wsgi/ pylib_DATA = webdoc.py \ - webdoc_tests.py \ - webdoc_webinterface.py \ - webpage.py \ - template.py \ - webstyle_templates.py \ - webinterface_handler.py \ - webinterface_tests.py \ - webinterface_layout.py \ - fckeditor_invenio_connector.py + webdoc_tests.py \ + webdoc_webinterface.py \ + webpage.py \ + template.py \ + webstyle_templates.py \ + webinterface_handler.py \ + webinterface_tests.py \ + webinterface_layout.py \ + fckeditor_invenio_connector.py \ + webinterface_handler_wsgi.py \ + webinterface_handler_wsgi_utils.py + +wsgiweb_DATA = invenio.wsgi EXTRA_DIST = $(pylib_DATA) CLEANFILES = *~ *.tmp *.pyc diff --git a/modules/webstyle/lib/Makefile.am b/modules/webstyle/lib/invenio.wsgi similarity index 65% copy from modules/webstyle/lib/Makefile.am copy to modules/webstyle/lib/invenio.wsgi index 07134599d..50dc1588f 100644 --- a/modules/webstyle/lib/Makefile.am +++ b/modules/webstyle/lib/invenio.wsgi @@ -1,33 +1,23 @@ +# -*- coding: utf-8 -*- ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. -pylibdir=$(libdir)/python/invenio +""" +mod_wsgi Invenio application loader. 
+""" -pylib_DATA = webdoc.py \ - webdoc_tests.py \ - webdoc_webinterface.py \ - webpage.py \ - template.py \ - webstyle_templates.py \ - webinterface_handler.py \ - webinterface_tests.py \ - webinterface_layout.py \ - fckeditor_invenio_connector.py - -EXTRA_DIST = $(pylib_DATA) - -CLEANFILES = *~ *.tmp *.pyc +from invenio.webinterface_handler_wsgi import application diff --git a/modules/webstyle/lib/webinterface_handler.py b/modules/webstyle/lib/webinterface_handler.py index 9addcf9a8..522b89550 100644 --- a/modules/webstyle/lib/webinterface_handler.py +++ b/modules/webstyle/lib/webinterface_handler.py @@ -1,474 +1,453 @@ # -*- coding: utf-8 -*- ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """ Apache request handler mechanism. It gives the tools to map url to functions, handles the legacy url scheme (/search.py queries), HTTP/HTTPS switching, language specification,... 
""" __revision__ = "$Id$" import urlparse import cgi import sys import re import os import gc import time -# The following mod_python imports are done separately in a particular -# order (util first) because I was getting sometimes publisher import -# error when testing weird situations, preventing util from being -# imported and leading to a traceback later. When this happened, -# importing util was okay, only publisher import caused troubles, so -# that importing in special order prevents these problems. -try: - from mod_python import util - from mod_python import apache - from mod_python import publisher -except ImportError: - pass - +from invenio import webinterface_handler_wsgi_utils as apache from invenio.config import CFG_SITE_LANG, CFG_SITE_URL, CFG_SITE_SECURE_URL, CFG_TMPDIR from invenio.access_control_config import CFG_EXTERNAL_AUTH_USING_SSO from invenio.messages import wash_language from invenio.urlutils import redirect_to_url from invenio.errorlib import register_exception from invenio.webuser import get_preferred_user_language, isGuestUser, \ getUid, loginUser, update_Uid, isUserSuperAdmin, collect_user_info has_https_support = CFG_SITE_URL != CFG_SITE_SECURE_URL DEBUG = False # List of URIs for which the 'ln' argument must not be added # automatically no_lang_recognition_uris = ['/rss', '/oai2d', '/journal'] -def _debug(msg): +def _debug(req, msg): if DEBUG: - apache.log_error(msg, apache.APLOG_WARNING) - return + req.log_error(msg) def _check_result(req, result): """ Check that a page handler actually wrote something, and properly finish the apache request.""" - if result or req.bytes_sent > 0 or req.next: + if result or req.bytes_sent > 0: if result is None: result = "" else: result = str(result) # unless content_type was manually set, we will attempt # to guess it - if not req._content_type_set: + if not req.content_type_set_p: # make an attempt to guess content-type if result[:100].strip()[:6].lower() == '<html>' \ or result.find('</') > 0: 
req.content_type = 'text/html' else: req.content_type = 'text/plain' if req.header_only: if req.status in (apache.HTTP_NOT_FOUND, ): raise apache.SERVER_RETURN, req.status else: req.write(result) return apache.OK else: - req.log_error("mod_python.publisher: %s returned nothing." % `object`) + req.log_error("publisher: %s returned nothing." % `object`) return apache.HTTP_INTERNAL_SERVER_ERROR class TraversalError(Exception): pass class WebInterfaceDirectory(object): """ A directory groups web pages, and can delegate dispatching of requests to the actual handler. This has been heavily borrowed from Quixote's dispatching mechanism, with specific adaptations.""" # Lists the valid URLs contained in this directory. _exports = [] # Set this to True in order to redirect queries over HTTPS _force_https = False def _translate(self, component): """(component : string) -> string | None Translate a path component into a Python identifier. Returning None signifies that the component does not exist. """ if component in self._exports: if component == '': return 'index' # implicit mapping else: return component else: # check for an explicit external to internal mapping for value in self._exports: if isinstance(value, tuple): if value[0] == component: return value[1] else: return None def _lookup(self, component, path): """ Override this method if you need to map dynamic URLs. It can eat up as much of the remaining path as needed, and return the remaining parts, so that the traversal can continue. 
""" return None, path def _traverse(self, req, path, do_head=False, guest_p=True): """ Locate the handler of an URI by traversing the elements of the path.""" - _debug('traversing %r' % path) + _debug(req, 'traversing %r' % path) component, path = path[0], path[1:] name = self._translate(component) if name is None: obj, path = self._lookup(component, path) else: obj = getattr(self, name) if obj is None: - _debug('could not resolve %s' % repr((component, path))) + _debug(req, 'could not resolve %s' % repr((component, path))) raise TraversalError() # We have found the next segment. If we know that from this # point our subpages are over HTTPS, do the switch. if req.is_https() and self._force_https: if not req.is_https(): # We need to isolate the part of the URI that is after # CFG_SITE_URL, and append that to our CFG_SITE_SECURE_URL. original_parts = urlparse.urlparse(req.unparsed_uri) plain_prefix_parts = urlparse.urlparse(CFG_SITE_URL) secure_prefix_parts = urlparse.urlparse(CFG_SITE_SECURE_URL) # Compute the new path plain_path = original_parts[2] plain_path = secure_prefix_parts[2] + plain_path[len(plain_prefix_parts[2]):] # ...and recompose the complete URL final_parts = list(secure_prefix_parts) final_parts[2] = plain_path final_parts[-3:] = original_parts[-3:] target = urlparse.urlunparse(final_parts) redirect_to_url(req, target) if CFG_EXTERNAL_AUTH_USING_SSO and req.is_https() and guest_p: (iden, p_un, p_pw, msgcode) = loginUser(req, '', '', CFG_EXTERNAL_AUTH_USING_SSO) if len(iden)>0: uid = update_Uid(req, p_un) guest_p = False # Continue the traversal. If there is a path, continue # resolving, otherwise call the method as it is our final # renderer. We even pass it the parsed form arguments. 
if path: return obj._traverse(req, path, do_head, guest_p) if do_head: req.content_type = "text/html; charset=UTF-8" raise apache.SERVER_RETURN, apache.DONE - form = util.FieldStorage(req, keep_blank_values=True) - try: - # The auto recognition will work only with with mod_python-3.3.1 - if not form.has_key('ln') and \ - req.uri not in no_lang_recognition_uris: - ln = get_preferred_user_language(req) - form.add_field('ln', ln) - except: - form = dict(form) - if not form.has_key('ln') and \ - req.uri not in no_lang_recognition_uris: - ln = get_preferred_user_language(req) - form['ln'] = ln + form = req.form + if not form.has_key('ln') and \ + req.uri not in no_lang_recognition_uris: + ln = get_preferred_user_language(req) + form.add_field('ln', ln) result = _check_result(req, obj(req, form)) return result def __call__(self, req, form): """ Maybe resolve the final / of a directory """ # When this method is called, we either are a directory which # has an 'index' method, and we redirect to it, or we don't # have such a method, in which case it is a traversal error. if "" in self._exports: if not form: # Fix missing trailing slash as a convenience, unless # we are processing a form (in which case it is better # to fix the form posting). redirect_to_url(req, req.uri + "/", apache.HTTP_MOVED_PERMANENTLY) - _debug('directory %r is not callable' % self) + _debug(req, 'directory %r is not callable' % self) raise TraversalError() re_slashes = re.compile('/+') re_special_uri = re.compile('^/record/\d+|^/collection/.+') def create_handler(root): """ Return a handler function that will dispatch apache requests through the URL layout passed in parameter.""" def _profiler(req): """ This handler wraps the default handler with a profiler. Profiling data is written into CFG_TMPDIR/invenio-profile-stats-datetime.raw, and is displayed at the bottom of the webpage. To use it, add profile=1 to your URL. To change the sorting algorithm you can provide profile=algorithm_name.
You can add more than one profile requirement like ?profile=time&profile=cumulative. The list of available algorithm is displayed at the end of the profile. """ args = {} if req.args: args = cgi.parse_qs(req.args) if 'profile' in args: if not isUserSuperAdmin(collect_user_info(req)): return _handler(req) if 'memory' in args['profile']: gc.set_debug(gc.DEBUG_LEAK) ret = _handler(req) req.write("\n<pre>%s</pre>" % gc.garbage) gc.collect() req.write("\n<pre>%s</pre>" % gc.garbage) gc.set_debug(0) return ret from cStringIO import StringIO try: import pstats except ImportError: ret = _handler(req) req.write("<pre>%s</pre>" % "The Python Profiler is not installed!") return ret import datetime date = datetime.datetime.now().strftime('%Y%m%d%H%M%S') filename = '%s/invenio-profile-stats-%s.raw' % (CFG_TMPDIR, date) existing_sorts = pstats.Stats.sort_arg_dict_default.keys() required_sorts = [] profile_dump = [] for sort in args['profile']: if sort not in existing_sorts: sort = 'cumulative' if sort not in required_sorts: required_sorts.append(sort) if sys.hexversion < 0x02050000: import hotshot, hotshot.stats pr = hotshot.Profile(filename) ret = pr.runcall(_handler, req) for sort_type in required_sorts: tmp_out = sys.stdout sys.stdout = StringIO() hotshot.stats.load(filename).strip_dirs().sort_stats(sort_type).print_stats() profile_dump.append(sys.stdout.getvalue()) sys.stdout = tmp_out else: import cProfile pr = cProfile.Profile() ret = pr.runcall(_handler, req) pr.dump_stats(filename) for sort_type in required_sorts: strstream = StringIO() pstats.Stats(filename, stream=strstream).strip_dirs().sort_stats(sort_type).print_stats() profile_dump.append(strstream.getvalue()) profile_dump = '\n'.join(profile_dump) profile_dump += '\nYou can use profile=%s or profile=memory' % existing_sorts req.write("\n<pre>%s</pre>" % profile_dump) return ret else: return _handler(req) def _handler(req): """ This handler is invoked by mod_python with the apache request.""" try: allowed_methods = 
("GET", "POST", "HEAD", "OPTIONS") req.allow_methods(allowed_methods, 1) if req.method not in allowed_methods: raise apache.SERVER_RETURN, apache.HTTP_METHOD_NOT_ALLOWED if req.method == 'OPTIONS': ## OPTIONS is used to now which method are allowed req.headers_out['Allow'] = ', '.join(allowed_methods) raise apache.SERVER_RETURN, apache.OK # Set user agent for fckeditor.py, which needs it here os.environ["HTTP_USER_AGENT"] = req.headers_in.get('User-Agent', '') guest_p = isGuestUser(getUid(req)) uri = req.uri if uri == '/': path = [''] else: ## Let's collapse multiple slashes into a single / uri = re_slashes.sub('/', uri) path = uri[1:].split('/') if uri.startswith('/yours') or not guest_p: ## Private/personalized request should not be cached req.headers_out['Cache-Control'] = 'private, no-cache, no-store, max-age=0, must-revalidate' req.headers_out['Pragma'] = 'no-cache' req.headers_out['Vary'] = '*' else: req.headers_out['Cache-Control'] = 'public, max-age=3600' req.headers_out['Vary'] = 'Cookie, ETag, Cache-Control' try: if req.header_only and not re_special_uri.match(req.uri): return root._traverse(req, path, True, guest_p) else: ## bibdocfile have a special treatment for HEAD return root._traverse(req, path, False, guest_p) except TraversalError: raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND except apache.SERVER_RETURN: ## This is one of mod_python way of communicating raise except IOError, exc: if 'Write failed, client closed connection' not in "%s" % exc: ## Workaround for considering as false positive exceptions ## rised by mod_python when the user close the connection ## or in some other rare and not well identified cases. register_exception(req=req, alert_admin=True) raise except Exception: register_exception(req=req, alert_admin=True) raise # Serve an error by default. raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND finally: if hasattr(req, '_session'): ## The session handler saves for caching a request_wrapper ## in req. 
## This saves req as an attribute, creating a circular ## reference. ## Since we have reached the end of the request handler ## we can safely drop the request_wrapper so as to avoid ## memory leaks. delattr(req, '_session') if hasattr(req, '_user_info'): ## For the same reason we can delete the user_info. delattr(req, '_user_info') ## as suggested in ## <http://www.python.org/doc/2.3.5/lib/module-gc.html> del gc.garbage[:] return _profiler def wash_urlargd(form, content): """ Wash the complete form based on the specification in content. Content is a dictionary containing the field names as a key, and a tuple (type, default) as value. 'type' can be list, str, int, tuple, or mod_python.util.Field (for file uploads). The specification automatically includes the 'ln' field, which is common to all queries. Arguments that are not defined in 'content' are discarded. Note that in case {list,tuple} were asked for, we assume that {list,tuple} of strings is to be returned. Therefore beware when you want to use wash_urlargd() for multiple file upload forms. @Return: argd dictionary that can be used for passing function parameters by keywords. """ result = {} content['ln'] = (str, '') for k, (dst_type, default) in content.items(): try: value = form[k] except KeyError: result[k] = default continue src_type = type(value) # First, handle the case where we want all the results. In # this case, we need to ensure all the elements are strings, # and not Field instances. if src_type in (list, tuple): if dst_type is list: result[k] = [str(x) for x in value] continue if dst_type is tuple: result[k] = tuple([str(x) for x in value]) continue # in all the other cases, we are only interested in the # first value. value = value[0] # Maybe we already have what is expected? Then don't change # anything. if src_type is dst_type: result[k] = value continue # Since we got here, 'value' is sure to be a single symbol, # not a list kind of structure anymore.
if dst_type in (str, int): try: result[k] = dst_type(value) except: result[k] = default elif dst_type is tuple: result[k] = (str(value),) elif dst_type is list: result[k] = [str(value)] else: raise ValueError('cannot cast form into type %r' % dst_type) result['ln'] = wash_language(result['ln']) return result diff --git a/modules/webstyle/lib/webinterface_handler_wsgi.py b/modules/webstyle/lib/webinterface_handler_wsgi.py new file mode 100644 index 000000000..71b02ab76 --- /dev/null +++ b/modules/webstyle/lib/webinterface_handler_wsgi.py @@ -0,0 +1,433 @@ +# -*- coding: utf-8 -*- +## This file is part of CDS Invenio. +## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. +## +## CDS Invenio is free software; you can redistribute it and/or +## modify it under the terms of the GNU General Public License as +## published by the Free Software Foundation; either version 2 of the +## License, or (at your option) any later version. +## +## CDS Invenio is distributed in the hope that it will be useful, but +## WITHOUT ANY WARRANTY; without even the implied warranty of +## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +## General Public License for more details. +## +## You should have received a copy of the GNU General Public License +## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., +## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. + +"""mod_python->WSGI Framework""" + +import sys +import os +from cgi import parse_qs + +from wsgiref.validate import validator +from wsgiref.util import setup_testing_defaults, FileWrapper, \ + guess_scheme + +if __name__ != "__main__": + # Chances are that we are inside mod_wsgi. + ## You can't write to stdout in mod_wsgi, but some of our + ## dependecies do this! (e.g. 
4Suite) + sys.stdout = sys.stderr + +from invenio.webinterface_layout import invenio_handler +from invenio.webinterface_handler_wsgi_utils import table, FieldStorage, \ + HTTP_STATUS_MAP, SERVER_RETURN, OK, DONE, \ + HTTP_NOT_FOUND +from invenio.config import CFG_WEBDIR +from invenio.errorlib import register_exception + +## Static files are usually handled directly by the webserver (e.g. Apache) +## However in case WSGI is required to handle static files too (such +## as when running wsgiref simple server), then this flag can be +## turned on (it is done automatically by wsgi_handler_test). +CFG_WSGI_SERVE_STATIC_FILES = False + +class InputProcessed(object): + """ + Auxiliary class used when reading input. + @see: <http://www.wsgi.org/wsgi/Specifications/handling_post_forms>. + """ + def read(self, *args): + raise EOFError('The wsgi.input stream has already been consumed') + readline = readlines = __iter__ = read + +class SimulatedModPythonRequest(object): + """ + mod_python like request object. + Minimum and cleaned implementation to make moving out of mod_python + easy. 
+ @see: <http://www.modpython.org/live/current/doc-html/pyapi-mprequest.html> + """ + def __init__(self, environ, start_response): + self.__environ = environ + self.__start_response = start_response + self.__response_sent_p = False + self.__buffer = '' + self.__low_level_headers = [] + self.__headers = table(self.__low_level_headers) + self.__headers.add = self.__headers.add_header + self.__status = "200 OK" + self.__filename = None + self.__disposition_type = None + self.__bytes_sent = 0 + self.__allowed_methods = [] + self.__cleanups = [] + self.headers_out = self.__headers + ## See: <http://www.python.org/dev/peps/pep-0333/#the-write-callable> + self.__write = None + self.__errors = environ['wsgi.errors'] + self.__headers_in = table([]) + for key, value in environ.iteritems(): + if key.startswith('HTTP_'): + self.__headers_in[key[len('HTTP_'):]] = value + if environ.get('CONTENT_LENGTH'): + self.__headers_in['content-length'] = environ['CONTENT_LENGTH'] + if environ.get('CONTENT_TYPE'): + self.__headers_in['content-type'] = environ['CONTENT_TYPE'] + self.get_post_form() + + def get_wsgi_environ(self): + return self.__environ + + def get_post_form(self): + post_form = self.__environ.get('wsgi.post_form') + input = self.__environ['wsgi.input'] + if (post_form is not None + and post_form[0] is input): + return post_form[2] + # This must be done to avoid a bug in cgi.FieldStorage + self.__environ.setdefault('QUERY_STRING', '') + fs = FieldStorage(self, keep_blank_values=1) + new_input = InputProcessed() + post_form = (new_input, input, fs) + self.__environ['wsgi.post_form'] = post_form + self.__environ['wsgi.input'] = new_input + return fs + + def get_response_sent_p(self): + return self.__response_sent_p + + def get_low_level_headers(self): + return self.__low_level_headers + + def get_buffer(self): + return self.__buffer + + def write(self, string, flush=1): + self.__buffer += string + if flush: + self.flush() + + def flush(self): + self.send_http_header() + if 
self.__buffer: + self.__bytes_sent += len(self.__buffer) + self.__write(self.__buffer) + self.__buffer = '' + + def set_content_type(self, content_type): + self.__headers['content-type'] = content_type + + def get_content_type(self): + return self.__headers['content-type'] + + def send_http_header(self): + if not self.__response_sent_p: + if self.__allowed_methods and self.__status.startswith('405 ') or self.__status.startswith('501 '): + self.__headers['Allow'] = ', '.join(self.__allowed_methods) + + ## See: <http://www.python.org/dev/peps/pep-0333/#the-write-callable> + #print self.__low_level_headers + self.__write = self.__start_response(self.__status, self.__low_level_headers) + self.__response_sent_p = True + #print "Response sent: %s" % self.__headers + + def get_unparsed_uri(self): + return '?'.join([self.__environ['PATH_INFO'], self.__environ['QUERY_STRING']]) + + def get_uri(self): + return self.__environ['PATH_INFO'] + + def get_headers_in(self): + return self.__headers_in + + def get_subprocess_env(self): + return self.__environ + + def add_common_vars(self): + pass + + def get_args(self): + return self.__environ['QUERY_STRING'] + + def get_remote_ip(self): + return self.__environ.get('REMOTE_ADDR') + + def get_remote_host(self): + return self.__environ.get('REMOTE_HOST') + + def get_header_only(self): + return self.__environ['REQUEST_METHOD'] == 'HEAD' + + def set_status(self, status): + self.__status = '%s %s' % (status, HTTP_STATUS_MAP.get(int(status), 'Explanation not available')) + + def get_status(self): + return int(self.__status.split(' ')[0]) + + def get_wsgi_status(self): + return self.__status + + def sendfile(self, path, offset=0, the_len=-1): + try: + self.send_http_header() + file_to_send = open(path) + file_to_send.seek(offset) + file_wrapper = FileWrapper(file_to_send) + count = 0 + if the_len < 0: + for chunk in file_wrapper: + count += len(chunk) + self.__bytes_sent += len(chunk) + self.__write(chunk) + else: + for chunk in 
file_wrapper: + if the_len >= len(chunk): + the_len -= len(chunk) + count += len(chunk) + self.__bytes_sent += len(chunk) + self.__write(chunk) + else: + count += the_len + self.__bytes_sent += the_len + self.__write(chunk[:the_len]) + break + except Exception, err: + raise IOError(str(err)) + return self.__bytes_sent + + def set_content_length(self, content_length): + if content_length is not None: + self.__headers['content-length'] = str(content_length) + else: + del self.__headers['content-length'] + + def is_https(self): + return int(guess_scheme(self.__environ) == 'https') + + def get_method(self): + return self.__environ['REQUEST_METHOD'] + + def get_hostname(self): + return self.__environ.get('HTTP_HOST', '') + + def set_filename(self, filename): + self.__filename = filename + if self.__disposition_type is None: + self.__disposition_type = 'inline' + self.__headers['content-disposition'] = '%s; filename=%s' % (self.__disposition_type, self.__filename) + + def set_encoding(self, encoding): + if encoding: + self.__headers['content-encoding'] = str(encoding) + else: + del self.__headers['content-encoding'] + + def get_bytes_sent(self): + return self.__bytes_sent + + def log_error(self, message): + self.__errors.write(message.strip() + '\n') + + def get_content_type_set_p(self): + return bool(self.__headers['content-type']) + + def allow_methods(self, methods, reset=0): + if reset: + self.__allowed_methods = [] + self.__allowed_methods += [method.upper().strip() for method in methods] + + def get_allowed_methods(self): + return self.__allowed_methods + + def readline(self, hint=None): + try: + return self.__environ['wsgi.input'].readline(hint) + except TypeError: + ## the hint param is not part of the WSGI PEP, although + ## it's great to exploit it when reading FORM + ## with large files, in order to avoid filling up the memory + ## Too bad it's not there :-( + return self.__environ['wsgi.input'].readline() + + def readlines(self, hint=None): + return
self.__environ['wsgi.input'].readlines(hint) + + def read(self, hint=None): + return self.__environ['wsgi.input'].read(hint) + + def register_cleanup(self, callback, data=None): + self.__cleanups.append((callback, data)) + + def get_cleanups(self): + return self.__cleanups + + content_type = property(get_content_type, set_content_type) + unparsed_uri = property(get_unparsed_uri) + uri = property(get_uri) + headers_in = property(get_headers_in) + subprocess_env = property(get_subprocess_env) + args = property(get_args) + header_only = property(get_header_only) + status = property(get_status, set_status) + method = property(get_method) + hostname = property(get_hostname) + filename = property(fset=set_filename) + encoding = property(fset=set_encoding) + bytes_sent = property(get_bytes_sent) + content_type_set_p = property(get_content_type_set_p) + allowed_methods = property(get_allowed_methods) + response_sent_p = property(get_response_sent_p) + form = property(get_post_form) + remote_ip = property(get_remote_ip) + remote_host = property(get_remote_host) + +def application(environ, start_response): + """ + Entry point for wsgi. 
+ """ + ## Needed for mod_wsgi, see: <http://code.google.com/p/modwsgi/wiki/ApplicationIssues> + setup_testing_defaults(environ) + req = SimulatedModPythonRequest(environ, start_response) + #print 'Starting mod_python simulation' + try: + try: + possible_module, possible_handler = is_mp_legacy_publisher_path(environ['PATH_INFO']) + if possible_module is not None: + mp_legacy_publisher(req, possible_module, possible_handler) + elif CFG_WSGI_SERVE_STATIC_FILES: + possible_static_path = is_static_path(environ['PATH_INFO']) + if possible_static_path is not None: + from invenio.bibdocfile import stream_file + stream_file(req, possible_static_path) + else: + ret = invenio_handler(req) + else: + ret = invenio_handler(req) + req.flush() + except SERVER_RETURN, status: + status = int(str(status)) + if status not in (OK, DONE): + req.status = status + if not req.response_sent_p: + start_response(req.get_wsgi_status(), req.get_low_level_headers(), sys.exc_info()) + else: + req.flush() + except Exception: + register_exception(req=req, alert_admin=True) + start_response(req.get_wsgi_status(), req.get_low_level_headers(), sys.exc_info()) + finally: + for (callback, data) in req.get_cleanups(): + callback(data) + return [] + +def is_static_path(path): + """ + Returns True if path corresponds to an exsting file under CFG_WEBDIR. + @param path: the path. + @type path: string + @return: True if path corresponds to an exsting file under CFG_WEBDIR. + @rtype: bool + """ + path = os.path.abspath(CFG_WEBDIR + path) + if path.startswith(CFG_WEBDIR) and os.path.isfile(path): + return path + return None + +def is_mp_legacy_publisher_path(path): + """ + Checks path corresponds to an exsting Python file under CFG_WEBDIR. + @param path: the path. + @type path: string + @return: the path of the module to load and the function to call there. 
+ @rtype: tuple + """ + path = path.split('/') + for index, component in enumerate(path): + if component.endswith('.py'): + possible_module = os.path.abspath(CFG_WEBDIR + os.path.sep + os.path.sep.join(path[:index + 1])) + possible_handler = '/'.join(path[index + 1:]).strip() + if not possible_handler: + possible_handler = 'index' + if os.path.exists(possible_module) and possible_module.startswith(CFG_WEBDIR): + return (possible_module, possible_handler) + else: + return None, None + +def mp_legacy_publisher(req, possible_module, possible_handler): + """ + Minimal implementation of the mod_python legacy publisher. + """ + the_module = open(possible_module).read() + module_globals = {} + exec(the_module, module_globals) + if possible_handler in module_globals: + from invenio.webinterface_handler import _check_result + ## req.form must be cast to dict because, with Python 2.4 and earlier, + ## only a real dict (and not any object exposing the mapping interface) + ## can be used with the magic ** + return _check_result(req, module_globals[possible_handler](req, **dict(req.form))) + else: + raise SERVER_RETURN, HTTP_NOT_FOUND + +def check_wsgiref_testing_feasability(): + """ + In order to use wsgiref for running Invenio, CFG_SITE_URL and + CFG_SITE_SECURE_URL must not use HTTPS because SSL is not supported. + """ + from invenio.config import CFG_SITE_URL, CFG_SITE_SECURE_URL + if CFG_SITE_URL.lower().startswith('https'): + print >> sys.stderr, """ +ERROR: SSL is not supported by the wsgiref simple server implementation. +Please set CFG_SITE_URL not to start with "https". +Currently CFG_SITE_URL is set to: "%s".""" % CFG_SITE_URL + sys.exit(1) + if CFG_SITE_SECURE_URL.lower().startswith('https'): + print >> sys.stderr, """ +ERROR: SSL is not supported by the wsgiref simple server implementation. +Please set CFG_SITE_SECURE_URL not to start with "https". 
+Currently CFG_SITE_SECURE_URL is set to: "%s".""" % CFG_SITE_SECURE_URL + sys.exit(1) + +def wsgi_handler_test(port=80): + """ + Simple WSGI testing environment based on wsgiref. + """ + from wsgiref.simple_server import make_server + global CFG_WSGI_SERVE_STATIC_FILES + CFG_WSGI_SERVE_STATIC_FILES = True + check_wsgiref_testing_feasability() + validator_app = validator(application) + httpd = make_server('', port, validator_app) + print "Serving on port %s..." % port + httpd.serve_forever() + +def main(): + from optparse import OptionParser + parser = OptionParser() + parser.add_option('-t', '--test', action='store_true', + dest='test', default=False, + help="Run a WSGI test server via wsgiref (not using Apache).") + parser.add_option('-p', '--port', type='int', dest='port', default=80, + help="The port where the WSGI test server will listen. [80]") + (options, args) = parser.parse_args() + if options.test: + wsgi_handler_test(options.port) + else: + parser.print_help() + +if __name__ == "__main__": + main() diff --git a/modules/webstyle/lib/webinterface_handler_wsgi_utils.py b/modules/webstyle/lib/webinterface_handler_wsgi_utils.py new file mode 100644 index 000000000..4eb1b4161 --- /dev/null +++ b/modules/webstyle/lib/webinterface_handler_wsgi_utils.py @@ -0,0 +1,1060 @@ +# -*- coding: utf-8 -*- +## This file is part of CDS Invenio. +## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. +## +## CDS Invenio is free software; you can redistribute it and/or +## modify it under the terms of the GNU General Public License as +## published by the Free Software Foundation; either version 2 of the +## License, or (at your option) any later version. +## +## CDS Invenio is distributed in the hope that it will be useful, but +## WITHOUT ANY WARRANTY; without even the implied warranty of +## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +## General Public License for more details. 
+## +## You should have received a copy of the GNU General Public License +## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., +## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. + +""" +mod_python->WSGI Framework utilities + +This code has been taken from the original mod_python source code and +rearranged here to ease the migration from mod_python to WSGI. + +The code taken from mod_python is under the following License. +""" + + # Copyright 2004 Apache Software Foundation + # + # Licensed under the Apache License, Version 2.0 (the "License"); you + # may not use this file except in compliance with the License. You + # may obtain a copy of the License at + # + # http://www.apache.org/licenses/LICENSE-2.0 + # + # Unless required by applicable law or agreed to in writing, software + # distributed under the License is distributed on an "AS IS" BASIS, + # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + # implied. See the License for the specific language governing + # permissions and limitations under the License. + # + # Originally developed by Gregory Trubetskoy. + # + # $Id: apache.py 468216 2006-10-27 00:54:12Z grahamd $ + +try: + import threading +except: + import dummy_threading as threading +from wsgiref.headers import Headers +import time +import re +import cgi +import cStringIO +import tempfile +from types import TypeType, ClassType, BuiltinFunctionType, MethodType, ListType + +# Cache for values of PythonPath that have been seen already. 
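The `table` class defined just below wraps `wsgiref.headers.Headers` to mimic mod_python's header table: case-insensitive lookup, repeated header values, and `''` instead of `None` for missing keys. A standalone Python 3 sketch of the same idea (the `HeaderTable` name is illustrative, not from the patch):

```python
from wsgiref.headers import Headers

class HeaderTable(Headers):
    """mod_python-style header table on top of wsgiref Headers."""
    add = Headers.add_header          # mod_python spelling of add_header

    def __getitem__(self, name):
        # Headers.__getitem__ returns None for missing keys;
        # mod_python's table returns '' instead.
        value = Headers.__getitem__(self, name)
        return '' if value is None else str(value)

t = HeaderTable([])
t.add('Set-Cookie', 'a=1')
t.add('Set-Cookie', 'b=2')
print(t['set-cookie'])            # case-insensitive; first value
print(t.get_all('Set-Cookie'))    # all values for a repeated header
print(repr(t['missing']))         # '' rather than None
```

Repeated headers matter here mainly for `Set-Cookie`, which is why the patch keeps `get_all`-style access alongside the dict-like interface.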
+_path_cache = {} +_path_cache_lock = threading.Lock() + +class table(Headers): + add = Headers.add_header + iteritems = Headers.items + def __getitem__(self, name): + ret = Headers.__getitem__(self, name) + if ret is None: + return '' + else: + return str(ret) + + +## Some functions made public +exists_config_define = lambda dummy: True + +## Some constants + +HTTP_CONTINUE = 100 +HTTP_SWITCHING_PROTOCOLS = 101 +HTTP_PROCESSING = 102 +HTTP_OK = 200 +HTTP_CREATED = 201 +HTTP_ACCEPTED = 202 +HTTP_NON_AUTHORITATIVE = 203 +HTTP_NO_CONTENT = 204 +HTTP_RESET_CONTENT = 205 +HTTP_PARTIAL_CONTENT = 206 +HTTP_MULTI_STATUS = 207 +HTTP_MULTIPLE_CHOICES = 300 +HTTP_MOVED_PERMANENTLY = 301 +HTTP_MOVED_TEMPORARILY = 302 +HTTP_SEE_OTHER = 303 +HTTP_NOT_MODIFIED = 304 +HTTP_USE_PROXY = 305 +HTTP_TEMPORARY_REDIRECT = 307 +HTTP_BAD_REQUEST = 400 +HTTP_UNAUTHORIZED = 401 +HTTP_PAYMENT_REQUIRED = 402 +HTTP_FORBIDDEN = 403 +HTTP_NOT_FOUND = 404 +HTTP_METHOD_NOT_ALLOWED = 405 +HTTP_NOT_ACCEPTABLE = 406 +HTTP_PROXY_AUTHENTICATION_REQUIRED = 407 +HTTP_REQUEST_TIME_OUT = 408 +HTTP_CONFLICT = 409 +HTTP_GONE = 410 +HTTP_LENGTH_REQUIRED = 411 +HTTP_PRECONDITION_FAILED = 412 +HTTP_REQUEST_ENTITY_TOO_LARGE = 413 +HTTP_REQUEST_URI_TOO_LARGE = 414 +HTTP_UNSUPPORTED_MEDIA_TYPE = 415 +HTTP_RANGE_NOT_SATISFIABLE = 416 +HTTP_EXPECTATION_FAILED = 417 +HTTP_UNPROCESSABLE_ENTITY = 422 +HTTP_LOCKED = 423 +HTTP_FAILED_DEPENDENCY = 424 +HTTP_UPGRADE_REQUIRED = 426 +HTTP_INTERNAL_SERVER_ERROR = 500 +HTTP_NOT_IMPLEMENTED = 501 +HTTP_BAD_GATEWAY = 502 +HTTP_SERVICE_UNAVAILABLE = 503 +HTTP_GATEWAY_TIME_OUT = 504 +HTTP_VERSION_NOT_SUPPORTED = 505 +HTTP_VARIANT_ALSO_VARIES = 506 +HTTP_INSUFFICIENT_STORAGE = 507 +HTTP_NOT_EXTENDED = 510 + +APLOG_NOERRNO = 8 + +OK = REQ_PROCEED = 0 +DONE = -2 +DECLINED = REQ_NOACTION = -1 + +_status_values = { + "postreadrequesthandler": [ DECLINED, OK ], + "transhandler": [ DECLINED ], + "maptostoragehandler": [ DECLINED ], + "inithandler": [ DECLINED, OK ], + 
"headerparserhandler": [ DECLINED, OK ], + "accesshandler": [ DECLINED, OK ], + "authenhandler": [ DECLINED ], + "authzhandler": [ DECLINED ], + "typehandler": [ DECLINED ], + "fixuphandler": [ DECLINED, OK ], + "loghandler": [ DECLINED, OK ], + "handler": [ OK ], +} + +# legacy/mod_python things +REQ_ABORTED = HTTP_INTERNAL_SERVER_ERROR +REQ_EXIT = "REQ_EXIT" +PROG_TRACEBACK = "PROG_TRACEBACK" + +# the req.finfo tuple +FINFO_MODE = 0 +FINFO_INO = 1 +FINFO_DEV = 2 +FINFO_NLINK = 3 +FINFO_UID = 4 +FINFO_GID = 5 +FINFO_SIZE = 6 +FINFO_ATIME = 7 +FINFO_MTIME = 8 +FINFO_CTIME = 9 +FINFO_FNAME = 10 +FINFO_NAME = 11 +FINFO_FILETYPE = 12 + +# the req.parsed_uri +URI_SCHEME = 0 +URI_HOSTINFO = 1 +URI_USER = 2 +URI_PASSWORD = 3 +URI_HOSTNAME = 4 +URI_PORT = 5 +URI_PATH = 6 +URI_QUERY = 7 +URI_FRAGMENT = 8 + +# for req.proxyreq +PROXYREQ_NONE = 0 # No proxy +PROXYREQ_PROXY = 1 # Standard proxy +PROXYREQ_REVERSE = 2 # Reverse proxy +PROXYREQ_RESPONSE = 3 # Origin response + +# methods for req.allow_method() +M_GET = 0 # RFC 2616: HTTP +M_PUT = 1 +M_POST = 2 +M_DELETE = 3 +M_CONNECT = 4 +M_OPTIONS = 5 +M_TRACE = 6 # RFC 2616: HTTP +M_PATCH = 7 +M_PROPFIND = 8 # RFC 2518: WebDAV +M_PROPPATCH = 9 +M_MKCOL = 10 +M_COPY = 11 +M_MOVE = 12 +M_LOCK = 13 +M_UNLOCK = 14 # RFC2518: WebDAV +M_VERSION_CONTROL = 15 # RFC3253: WebDAV Versioning +M_CHECKOUT = 16 +M_UNCHECKOUT = 17 +M_CHECKIN = 18 +M_UPDATE = 19 +M_LABEL = 20 +M_REPORT = 21 +M_MKWORKSPACE = 22 +M_MKACTIVITY = 23 +M_BASELINE_CONTROL = 24 +M_MERGE = 25 +M_INVALID = 26 # RFC3253: WebDAV Versioning + +# for req.used_path_info +AP_REQ_ACCEPT_PATH_INFO = 0 # Accept request given path_info +AP_REQ_REJECT_PATH_INFO = 1 # Send 404 error if path_info was given +AP_REQ_DEFAULT_PATH_INFO = 2 # Module's choice for handling path_info + + +# for mpm_query +AP_MPMQ_NOT_SUPPORTED = 0 # This value specifies whether + # an MPM is capable of + # threading or forking. 
+AP_MPMQ_STATIC = 1 # This value specifies whether + # an MPM is using a static # of + # threads or daemons. +AP_MPMQ_DYNAMIC = 2 # This value specifies whether + # an MPM is using a dynamic # of + # threads or daemons. + +AP_MPMQ_MAX_DAEMON_USED = 1 # Max # of daemons used so far +AP_MPMQ_IS_THREADED = 2 # MPM can do threading +AP_MPMQ_IS_FORKED = 3 # MPM can do forking +AP_MPMQ_HARD_LIMIT_DAEMONS = 4 # The compiled max # daemons +AP_MPMQ_HARD_LIMIT_THREADS = 5 # The compiled max # threads +AP_MPMQ_MAX_THREADS = 6 # # of threads/child by config +AP_MPMQ_MIN_SPARE_DAEMONS = 7 # Min # of spare daemons +AP_MPMQ_MIN_SPARE_THREADS = 8 # Min # of spare threads +AP_MPMQ_MAX_SPARE_DAEMONS = 9 # Max # of spare daemons +AP_MPMQ_MAX_SPARE_THREADS = 10 # Max # of spare threads +AP_MPMQ_MAX_REQUESTS_DAEMON = 11 # Max # of requests per daemon +AP_MPMQ_MAX_DAEMONS = 12 # Max # of daemons by config + +# magic mime types +CGI_MAGIC_TYPE = "application/x-httpd-cgi" +INCLUDES_MAGIC_TYPE = "text/x-server-parsed-html" +INCLUDES_MAGIC_TYPE3 = "text/x-server-parsed-html3" +DIR_MAGIC_TYPE = "httpd/unix-directory" + +# for req.read_body +REQUEST_NO_BODY = 0 +REQUEST_CHUNKED_ERROR = 1 +REQUEST_CHUNKED_DECHUNK = 2 + +# for apache.stat() +APR_FINFO_LINK = 0x00000001 # Stat the link not the file itself if it is a link +APR_FINFO_MTIME = 0x00000010 # Modification Time +APR_FINFO_CTIME = 0x00000020 # Creation or inode-changed time +APR_FINFO_ATIME = 0x00000040 # Access Time +APR_FINFO_SIZE = 0x00000100 # Size of the file +APR_FINFO_CSIZE = 0x00000200 # Storage size consumed by the file +APR_FINFO_DEV = 0x00001000 # Device +APR_FINFO_INODE = 0x00002000 # Inode +APR_FINFO_NLINK = 0x00004000 # Number of links +APR_FINFO_TYPE = 0x00008000 # Type +APR_FINFO_USER = 0x00010000 # User +APR_FINFO_GROUP = 0x00020000 # Group +APR_FINFO_UPROT = 0x00100000 # User protection bits +APR_FINFO_GPROT = 0x00200000 # Group protection bits +APR_FINFO_WPROT = 0x00400000 # World protection bits +APR_FINFO_ICASE = 
0x01000000 # if dev is case insensitive +APR_FINFO_NAME = 0x02000000 # ->name in proper case +APR_FINFO_MIN = 0x00008170 # type, mtime, ctime, atime, size +APR_FINFO_IDENT = 0x00003000 # dev and inode +APR_FINFO_OWNER = 0x00030000 # user and group +APR_FINFO_PROT = 0x00700000 # all protections +APR_FINFO_NORM = 0x0073b170 # an atomic unix apr_stat() +APR_FINFO_DIRENT = 0x02000000 # an atomic unix apr_dir_read() + +HTTP_STATUS_MAP = { + 100: "Continue", + 101: "Switching Protocols", + 200: "OK", + 201: "Created", + 202: "Accepted", + 203: "Non-Authoritative Information", + 204: "No Content", + 205: "Reset Content", + 206: "Partial Content", + 300: "Multiple Choices", + 301: "Moved Permanently", + 302: "Found", + 303: "See Other", + 304: "Not Modified", + 305: "Use Proxy", + 307: "Temporary Redirect", + 400: "Bad Request", + 401: "Unauthorized", + 402: "Payment Required", + 403: "Forbidden", + 404: "Not Found", + 405: "Method Not Allowed", + 406: "Not Acceptable", + 407: "Proxy Authentication Required", + 408: "Request Time-out", + 409: "Conflict", + 410: "Gone", + 411: "Length Required", + 412: "Precondition Failed", + 413: "Request Entity Too Large", + 414: "Request-URI Too Large", + 415: "Unsupported Media Type", + 416: "Requested range not satisfiable", + 417: "Expectation Failed", + 500: "Internal Server Error", + 501: "Not Implemented", + 502: "Bad Gateway", + 503: "Service Unavailable", + 504: "Gateway Time-out", + 505: "HTTP Version not supported", +} + + +class SERVER_RETURN(Exception): + pass + +class CookieError(Exception): + pass + +class metaCookie(type): + + def __new__(cls, clsname, bases, clsdict): + + _valid_attr = ( + "version", "path", "domain", "secure", + "comment", "expires", "max_age", + # RFC 2965 + "commentURL", "discard", "port", + # Microsoft Extension + "httponly" ) + + # _valid_attr + property values + # (note __slots__ is a new Python feature, it + # prevents any other attribute from being set) + __slots__ = _valid_attr + 
("name", "value", "_value", + "_expires", "__data__") + + clsdict["_valid_attr"] = _valid_attr + clsdict["__slots__"] = __slots__ + + def set_expires(self, value): + + if type(value) == type(""): + # if it's a string, it should be + # valid format as per Netscape spec + try: + t = time.strptime(value, "%a, %d-%b-%Y %H:%M:%S GMT") + except ValueError: + raise ValueError, "Invalid expires time: %s" % value + t = time.mktime(t) + else: + # otherwise assume it's a number + # representing time as from time.time() + t = value + value = time.strftime("%a, %d-%b-%Y %H:%M:%S GMT", + time.gmtime(t)) + + self._expires = "%s" % value + + def get_expires(self): + return self._expires + + clsdict["expires"] = property(fget=get_expires, fset=set_expires) + + return type.__new__(cls, clsname, bases, clsdict) + +class Cookie(object): + """ + This class implements the basic Cookie functionality. Note that + unlike the Python Standard Library Cookie class, this class represents + a single cookie (not a list of Morsels). + """ + + __metaclass__ = metaCookie + + DOWNGRADE = 0 + IGNORE = 1 + EXCEPTION = 3 + + def parse(Class, str, **kw): + """ + Parse a Cookie or Set-Cookie header value, and return + a dict of Cookies. Note: the string should NOT include the + header name, only the value. + """ + + dict = _parse_cookie(str, Class, **kw) + return dict + + parse = classmethod(parse) + + def __init__(self, name, value, **kw): + + """ + This constructor takes at least a name and value as the + arguments, as well as optionally any of allowed cookie attributes + as defined in the existing cookie standards. + """ + self.name, self.value = name, value + + for k in kw: + setattr(self, k.lower(), kw[k]) + + # subclasses can use this for internal stuff + self.__data__ = {} + + + def __str__(self): + + """ + Provides the string representation of the Cookie suitable for + sending to the browser. Note that the actual header name will + not be part of the string. 
+ + This method makes no attempt to automatically double-quote + strings that contain special characters, even though the RFCs + dictate this. This is because doing so seems to confuse most + browsers out there. + """ + + result = ["%s=%s" % (self.name, self.value)] + for name in self._valid_attr: + if hasattr(self, name): + if name in ("secure", "discard", "httponly"): + result.append(name) + else: + result.append("%s=%s" % (name, getattr(self, name))) + return "; ".join(result) + + def __repr__(self): + return '<%s: %s>' % (self.__class__.__name__, + str(self)) + +# This is a simplified and in some places corrected +# (at least I think it is) pattern from standard lib Cookie.py + +_cookiePattern = re.compile( + r"(?x)" # Verbose pattern + r"[,\ ]*" # space/comma (RFC2616 4.2) before attr-val is eaten + r"(?P<key>" # Start of group 'key' + r"[^;\ =]+" # anything but ';', ' ' or '=' + r")" # End of group 'key' + r"\ *(=\ *)?" # a space, then may be "=", more space + r"(?P<val>" # Start of group 'val' + r'"(?:[^\\"]|\\.)*"' # a doublequoted string + r"|" # or + r"[^;]*" # any word or empty string + r")" # End of group 'val' + r"\s*;?" # probably ending in a semi-colon + ) + +def _parse_cookie(str, Class, names=None): + # XXX problem is we should allow duplicate + # strings + result = {} + + matchIter = _cookiePattern.finditer(str) + + for match in matchIter: + key, val = match.group("key"), match.group("val") + + # We just ditch cookie names which start with a dollar sign since + # those are in fact RFC2965 cookies attributes. See bug [#MODPYTHON-3]. + # Note the parentheses: without them, a $-attribute would be tested + # with "key in None" when names is None. + if key[0] != '$' and (names is None or key in names): + result[key] = Class(key, val) + + return result + +def add_cookie(req, cookie, value="", **kw): + """ + Sets a cookie in outgoing headers and adds a cache + directive so that caches don't cache the cookie. + """ + + # is this a cookie? 
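The cookie machinery above boils down to running `_cookiePattern` over a `Cookie:` header value and keeping the non-`$` names. A condensed Python 3 sketch of that logic (simplified regex; `parse_cookie_header` is a hypothetical name, and the `names` test is written with explicit parentheses):

```python
import re

# Condensed version of _cookiePattern: name, optional '=', quoted or bare value.
_cookie_re = re.compile(
    r'[,\ ]*'                            # eat space/comma separators
    r'(?P<key>[^;\ =]+)'                 # cookie name
    r'\ *(?:=\ *)?'                      # optional '=' with spaces
    r'(?P<val>"(?:[^\\"]|\\.)*"|[^;]*)'  # quoted string or bare value
    r'\s*;?'
)

def parse_cookie_header(value, names=None):
    """Return {name: raw value}, skipping RFC 2965 $-attributes."""
    result = {}
    for m in _cookie_re.finditer(value):
        key, val = m.group('key'), m.group('val')
        # parenthesised so $-attributes are skipped even when names is None
        if key[0] != '$' and (names is None or key in names):
            result[key] = val
    return result

print(parse_cookie_header('$Version=1; session=abc123; lang=en'))
# {'session': 'abc123', 'lang': 'en'}
```

Passing `names=['session']` restricts the result to that one cookie, which is how `get_cookie` retrieves a single cookie by name.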
+ if not isinstance(cookie, Cookie): + + # make a cookie + cookie = Cookie(cookie, value, **kw) + + if not req.headers_out.has_key("Set-Cookie"): + req.headers_out.add("Cache-Control", 'no-cache="set-cookie"') + + req.headers_out.add("Set-Cookie", str(cookie)) + +def get_cookies(req, Class=Cookie, **kw): + """ + A shorthand for retrieving and parsing cookies given + a Cookie class. The class must be one of the classes from + this module. + """ + + if not req.headers_in.has_key("cookie"): + return {} + + cookies = req.headers_in["cookie"] + if type(cookies) == type([]): + cookies = '; '.join(cookies) + + return Class.parse(cookies, **kw) + +def get_cookie(req, name, Class=Cookie, **kw): + cookies = get_cookies(req, Class, names=[name], **kw) + if cookies.has_key(name): + return cookies[name] + + +parse_qs = cgi.parse_qs +parse_qsl = cgi.parse_qsl + +# Maximum line length for reading. (64KB) +# Fixes memory error when uploading large files such as 700+MB ISOs. +readBlockSize = 65368 + +""" The classes below are an (almost) drop-in replacement for the + standard cgi.py FieldStorage class. They should have pretty much the + same functionality. + + These classes differ in that unlike cgi.FieldStorage, they are not + recursive. The class FieldStorage contains a list of instances of + Field class. Field class is incapable of storing anything in it. + + These objects should be considerably faster than the ones in cgi.py + because they do not expect a CGI environment, and are + optimized specifically for Apache and mod_python. +""" + +class Field: + def __init__(self, name, *args, **kwargs): + self.name = name + + # Some third party packages such as Trac create + # instances of the Field object and insert it + # directly into the list of form fields. To + # maintain backward compatibility, check for + # the case where more than just a field name is supplied + # and invoke an additional initialisation step + # to process the arguments. 
Ideally, third party + # code should use the add_field() method of the + # form, but if they need to maintain backward + # compatibility with older versions of mod_python + # they will not have a choice but to use the old + # way of doing things and thus we need this code + # for the foreseeable future to cope with that. + + if args or kwargs: + self.__bc_init__(*args, **kwargs) + + def __bc_init__(self, file, ctype, type_options, + disp, disp_options, headers = {}): + self.file = file + self.type = ctype + self.type_options = type_options + self.disposition = disp + self.disposition_options = disp_options + if disp_options.has_key("filename"): + self.filename = disp_options["filename"] + else: + self.filename = None + self.headers = headers + + def __repr__(self): + """Return printable representation.""" + return "Field(%s, %s)" % (`self.name`, `self.value`) + + def __getattr__(self, name): + if name != 'value': + raise AttributeError, name + if self.file: + self.file.seek(0) + value = self.file.read() + self.file.seek(0) + else: + value = None + return value + + def __del__(self): + self.file.close() + +class StringField(str): + """ This class is basically a string with + added attributes for compatibility with std lib cgi.py. Basically, this + works the opposite of Field, as it stores its data in a string, but creates + a file on demand. Field creates a value on demand and stores data in a file. + """ + filename = None + headers = {} + ctype = "text/plain" + type_options = {} + disposition = None + disp_options = None + + # I wanted __init__(name, value) but that does not work (apparently, you + # cannot subclass str with a constructor that takes >1 argument) + def __init__(self, value): + '''Create StringField instance. 
You'll have to set name yourself.''' + str.__init__(self, value) + self.value = value + + def __getattr__(self, name): + if name != 'file': + raise AttributeError, name + self.file = cStringIO.StringIO(self.value) + return self.file + + def __repr__(self): + """Return printable representation (to pass unit tests).""" + return "Field(%s, %s)" % (`self.name`, `self.value`) + +class FieldList(list): + + def __init__(self): + self.__table = None + list.__init__(self) + + def table(self): + if self.__table is None: + self.__table = {} + for item in self: + if item.name in self.__table: + self.__table[item.name].append(item) + else: + self.__table[item.name] = [item] + return self.__table + + def __delitem__(self, *args): + self.__table = None + return list.__delitem__(self, *args) + + def __delslice__(self, *args): + self.__table = None + return list.__delslice__(self, *args) + + def __iadd__(self, *args): + self.__table = None + return list.__iadd__(self, *args) + + def __imul__(self, *args): + self.__table = None + return list.__imul__(self, *args) + + def __setitem__(self, *args): + self.__table = None + return list.__setitem__(self, *args) + + def __setslice__(self, *args): + self.__table = None + return list.__setslice__(self, *args) + + def append(self, *args): + self.__table = None + return list.append(self, *args) + + def extend(self, *args): + self.__table = None + return list.extend(self, *args) + + def insert(self, *args): + self.__table = None + return list.insert(self, *args) + + def pop(self, *args): + self.__table = None + return list.pop(self, *args) + + def remove(self, *args): + self.__table = None + return list.remove(self, *args) + + +class FieldStorage: + + def __init__(self, req, keep_blank_values=0, strict_parsing=0, file_callback=None, field_callback=None): + # + # Whenever readline is called ALWAYS use the max size EVEN when + # not expecting a long line. - this helps protect against + # malformed content from exhausting memory. 
+ # + + self.list = FieldList() + + # always process GET-style parameters + if req.args: + pairs = parse_qsl(req.args, keep_blank_values) + for pair in pairs: + self.add_field(pair[0], pair[1]) + if req.method != "POST": + return + + try: + clen = int(req.headers_in["content-length"]) + except (KeyError, ValueError): + # absent content-length is not acceptable + raise SERVER_RETURN, HTTP_LENGTH_REQUIRED + + self.clen = clen + self.count = 0 + + if not req.headers_in.has_key("content-type"): + ctype = "application/x-www-form-urlencoded" + else: + ctype = req.headers_in["content-type"] + + if ctype.startswith("application/x-www-form-urlencoded"): + pairs = parse_qsl(req.read(clen), keep_blank_values) + for pair in pairs: + self.add_field(pair[0], pair[1]) + return + + + if not ctype.startswith("multipart/"): + # we don't understand this content-type + raise SERVER_RETURN, HTTP_NOT_IMPLEMENTED + + # figure out boundary + try: + i = ctype.lower().rindex("boundary=") + boundary = ctype[i+9:] + if len(boundary) >= 2 and boundary[0] == boundary[-1] == '"': + boundary = boundary[1:-1] + boundary = re.compile("--" + re.escape(boundary) + "(--)?\r?\n") + + except ValueError: + raise SERVER_RETURN, HTTP_BAD_REQUEST + + # read until boundary + self.read_to_boundary(req, boundary, None) + + end_of_stream = False + while not end_of_stream and not self.eof(): # jjj JIM BEGIN WHILE + ## parse headers + + ctype, type_options = "text/plain", {} + disp, disp_options = None, {} + headers = table([]) + line = req.readline(readBlockSize) + self.count += len(line) + if self.eof(): + end_of_stream = True + match = boundary.match(line) + if (not line) or match: + # we stop if we reached the end of the stream or a stop + # boundary (which means '--' after the boundary) we + # continue to the next part if we reached a simple + # boundary in either case this would mean the entity is + # malformed, but we're tolerating it anyway. 
+ end_of_stream = (not line) or (match.group(1) is not None) + continue + + skip_this_part = False + while line not in ('\r','\r\n'): + nextline = req.readline(readBlockSize) + self.count += len(nextline) + if self.eof(): + end_of_stream = True + while nextline and nextline[0] in [ ' ', '\t']: + line = line + nextline + nextline = req.readline(readBlockSize) + self.count += len(nextline) + if self.eof(): + end_of_stream = True + # we read the headers until we reach an empty line + # NOTE : a single \n would mean the entity is malformed, but + # we're tolerating it anyway + h, v = line.split(":", 1) + headers.add(h, v) + h = h.lower() + if h == "content-disposition": + disp, disp_options = parse_header(v) + elif h == "content-type": + ctype, type_options = parse_header(v) + # + # NOTE: FIX up binary rubbish sent as content type + # from Microsoft IE 6.0 when sending a file which + # does not have a suffix. + # + if ctype.find('/') == -1: + ctype = 'application/octet-stream' + + line = nextline + match = boundary.match(line) + if (not line) or match: + # we stop if we reached the end of the stream or a + # stop boundary (which means '--' after the + # boundary) we continue to the next part if we + # reached a simple boundary in either case this + # would mean the entity is malformed, but we're + # tolerating it anyway. + skip_this_part = True + end_of_stream = (not line) or (match.group(1) is not None) + break + + if skip_this_part: + continue + + if disp_options.has_key("name"): + name = disp_options["name"] + else: + name = None + + # create a file object + # is this a file? 
+ if disp_options.has_key("filename"): + if file_callback and callable(file_callback): + file = file_callback(disp_options["filename"]) + else: + file = tempfile.TemporaryFile("w+b") + else: + if field_callback and callable(field_callback): + file = field_callback() + else: + file = cStringIO.StringIO() + + # read it in + self.read_to_boundary(req, boundary, file) + if self.eof(): + end_of_stream = True + file.seek(0) + + # make a Field + if disp_options.has_key("filename"): + field = Field(name) + field.filename = disp_options["filename"] + else: + field = StringField(file.read()) + field.name = name + field.file = file + field.type = ctype + field.type_options = type_options + field.disposition = disp + field.disposition_options = disp_options + field.headers = headers + self.list.append(field) + + def add_field(self, key, value): + """Insert a field as key/value pair""" + item = StringField(value) + item.name = key + self.list.append(item) + + def __setitem__(self, key, value): + table = self.list.table() + if table.has_key(key): + items = table[key] + for item in items: + self.list.remove(item) + item = StringField(value) + item.name = key + self.list.append(item) + + def read_to_boundary(self, req, boundary, file): + previous_delimiter = None + while not self.eof(): + line = req.readline(readBlockSize) + self.count += len(line) + + if not line: + # end of stream + if file is not None and previous_delimiter is not None: + file.write(previous_delimiter) + return True + + match = boundary.match(line) + if match: + # the line is the boundary, so we bail out + # if the two last chars are '--' it is the end of the entity + return match.group(1) is not None + + if line[-2:] == '\r\n': + # the line ends with a \r\n, which COULD be part + # of the next boundary. 
We write the previous line delimiter + # then we write the line without \r\n and save it for the next + # iteration if it was not part of the boundary + if file is not None: + if previous_delimiter is not None: file.write(previous_delimiter) + file.write(line[:-2]) + previous_delimiter = '\r\n' + + elif line[-1:] == '\r': + # the line ends with \r, which is only possible if + # readBlockSize bytes have been read. In that case the + # \r COULD be part of the next boundary, so we save it + # for the next iteration + assert len(line) == readBlockSize + if file is not None: + if previous_delimiter is not None: file.write(previous_delimiter) + file.write(line[:-1]) + previous_delimiter = '\r' + + elif line == '\n' and previous_delimiter == '\r': + # the line is a single \n and we were in the middle of a \r\n, + # so we complete the delimiter + previous_delimiter = '\r\n' + + else: + if file is not None: + if previous_delimiter is not None: file.write(previous_delimiter) + file.write(line) + previous_delimiter = None + + def eof(self): + return self.clen <= self.count + + def __getitem__(self, key): + """Dictionary style indexing.""" + found = self.list.table()[key] + if len(found) == 1: + return found[0] + else: + return found + + def get(self, key, default): + try: + return self.__getitem__(key) + except (TypeError, KeyError): + return default + + def keys(self): + """Dictionary style keys() method.""" + return self.list.table().keys() + + def __iter__(self): + return iter(self.keys()) + + def __repr__(self): + return repr(self.list.table()) + + def has_key(self, key): + """Dictionary style has_key() method.""" + return (key in self.list.table()) + + __contains__ = has_key + + def __len__(self): + """Dictionary style len(x) support.""" + return len(self.list.table()) + + def getfirst(self, key, default=None): + """ return the first value received """ + try: + return self.list.table()[key][0] + except KeyError: + return default + + def getlist(self, key): + """ return a 
list of received values """ + try: + return self.list.table()[key] + except KeyError: + return [] + + def items(self): + """Dictionary-style items(), except that items are returned in the same + order as they were supplied in the form.""" + return [(item.name, item) for item in self.list] + + def __delitem__(self, key): + table = self.list.table() + values = table[key] + for value in values: + self.list.remove(value) + + def clear(self): + self.list = FieldList() + + +def parse_header(line): + """Parse a Content-Type-like header. + + Return the main content-type and a dictionary of options. + + """ + + plist = map(lambda a: a.strip(), line.split(';')) + key = plist[0].lower() + del plist[0] + pdict = {} + for p in plist: + i = p.find('=') + if i >= 0: + name = p[:i].strip().lower() + value = p[i+1:].strip() + if len(value) >= 2 and value[0] == value[-1] == '"': + value = value[1:-1] + pdict[name] = value + return key, pdict + +def apply_fs_data(object, fs, **args): + """ + Apply FieldStorage data to an object - the object must be + callable. Examine the args, and match them with fs data, + then call the object, return the result. + """ + + # we need to weed out unexpected keyword arguments + # and for that we need to get a list of them. 
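The `parse_header` helper above splits a `Content-Type`-style value into its main value and an options dict; `FieldStorage` relies on it to pull `boundary` and `filename` out of multipart headers. The same logic in Python 3 form (the `map`/`lambda` replaced by a comprehension), with a usage example:

```python
def parse_header(line):
    """Parse a Content-Type-like header into (main value, options dict)."""
    parts = [p.strip() for p in line.split(';')]
    key = parts[0].lower()
    pdict = {}
    for p in parts[1:]:
        i = p.find('=')
        if i >= 0:
            name = p[:i].strip().lower()
            value = p[i + 1:].strip()
            if len(value) >= 2 and value[0] == value[-1] == '"':
                value = value[1:-1]  # strip surrounding double quotes
            pdict[name] = value
    return key, pdict

ctype, opts = parse_header('multipart/form-data; boundary="abc"; charset=UTF-8')
print(ctype)  # multipart/form-data
print(opts)   # {'boundary': 'abc', 'charset': 'UTF-8'}
```

Note that option names are lowercased but values are left as-is, which is why the boundary and filename come back exactly as the client sent them.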
There + # are a few options for callable objects here: + + fc = None + expected = [] + if hasattr(object, "func_code"): + # function + fc = object.func_code + expected = fc.co_varnames[0:fc.co_argcount] + elif hasattr(object, 'im_func'): + # method + fc = object.im_func.func_code + expected = fc.co_varnames[1:fc.co_argcount] + elif type(object) in (TypeType,ClassType): + # class + fc = object.__init__.im_func.func_code + expected = fc.co_varnames[1:fc.co_argcount] + elif type(object) is BuiltinFunctionType: + # builtin + fc = None + expected = [] + elif hasattr(object, '__call__'): + # callable object + if type(object.__call__) is MethodType: + fc = object.__call__.im_func.func_code + expected = fc.co_varnames[1:fc.co_argcount] + else: + # abuse of objects to create hierarchy + return apply_fs_data(object.__call__, fs, **args) + + # add form data to args + for field in fs.list: + if field.filename: + val = field + else: + val = field.value + args.setdefault(field.name, []).append(val) + + # replace lists with single values + for arg in args: + if ((type(args[arg]) is ListType) and + (len(args[arg]) == 1)): + args[arg] = args[arg][0] + + # remove unexpected args unless co_flags & 0x08, + # meaning function accepts **kw syntax + if fc is None: + args = {} + elif not (fc.co_flags & 0x08): + for name in args.keys(): + if name not in expected: + del args[name] + + return object(**args) diff --git a/modules/webstyle/lib/webinterface_layout.py b/modules/webstyle/lib/webinterface_layout.py index 6522a1f4d..8f704efdd 100644 --- a/modules/webstyle/lib/webinterface_layout.py +++ b/modules/webstyle/lib/webinterface_layout.py @@ -1,235 +1,235 @@ # -*- coding: utf-8 -*- ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. 
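The `apply_fs_data` routine above inspects `func_code`/`co_varnames` to discover which keyword arguments a callable accepts, then weeds out unexpected form fields unless the callable declares `**kw` (the `co_flags & 0x08` check). A hedged Python 3 sketch of the same filtering idea using the modern `inspect` module (a modernization for illustration, not the code in this patch; `call_with_filtered_args` and `handler` are invented names):

```python
import inspect

def call_with_filtered_args(func, **supplied):
    """Drop keyword arguments the callable does not accept, then call it.

    Mirrors apply_fs_data's final step: unless the callable takes **kw
    (what the legacy co_flags & 0x08 check detects), unexpected names
    are removed before the call.
    """
    params = inspect.signature(func).parameters.values()
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params):
        return func(**supplied)  # **kw present: pass everything through
    accepted = {p.name for p in params}
    return func(**{k: v for k, v in supplied.items() if k in accepted})

def handler(q, ln='en'):
    return q, ln

# 'bogus' is silently dropped, just as apply_fs_data drops unexpected fields
print(call_with_filtered_args(handler, q='ellis', ln='fr', bogus=1))  # → ('ellis', 'fr')
```

`inspect.signature` also covers methods, classes, and callable instances, which the legacy code had to handle case by case.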
## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """ Global organisation of the application's URLs. This module binds together CDS Invenio's modules and maps them to their corresponding URLs (ie, /search to the websearch modules,...) """ __revision__ = \ "$Id$" from invenio.webinterface_handler import create_handler from invenio.errorlib import register_exception from invenio.webinterface_handler import WebInterfaceDirectory +from invenio import webinterface_handler_wsgi_utils as apache class WebInterfaceDumbPages(WebInterfaceDirectory): """This class implements a dumb interface to use as a fallback in case of errors importing particular module pages.""" _exports = [''] def __call__(self, req, form): try: from invenio.webpage import page except ImportError: page = lambda *args: args[1] - from mod_python import apache req.status = apache.HTTP_INTERNAL_SERVER_ERROR msg = "<p>This functionality is facing a temporary failure.</p>" msg += "<p>The administrator has been informed about the problem.</p>" try: from invenio.config import CFG_SITE_ADMIN_EMAIL msg += """<p>You can contact <code>%s</code> in case of questions.</p>""" % \ CFG_SITE_ADMIN_EMAIL except ImportError: pass msg += """<p>We hope to restore the service soon.</p> <p>Sorry for the inconvenience.</p>""" try: return page('Service failure', msg) except: 
return msg def _lookup(self, component, path): return WebInterfaceDumbPages(), path index = __call__ try: from invenio.websearch_webinterface import WebInterfaceSearchInterfacePages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceSearchInterfacePages = WebInterfaceDumbPages try: from invenio.websearch_webinterface import WebInterfaceAuthorPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceAuthorPages = WebInterfaceDumbPages try: from invenio.websearch_webinterface import WebInterfaceRSSFeedServicePages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceRSSFeedServicePages = WebInterfaceDumbPages try: from invenio.websearch_webinterface import WebInterfaceUnAPIPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceUnAPIPages = WebInterfaceDumbPages try: from invenio.websubmit_webinterface import websubmit_legacy_getfile except: register_exception(alert_admin=True, subject='EMERGENCY') websubmit_legacy_getfile = WebInterfaceDumbPages try: from invenio.websubmit_webinterface import WebInterfaceSubmitPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceSubmitPages = WebInterfaceDumbPages try: from invenio.websession_webinterface import WebInterfaceYourAccountPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceYourAccountPages = WebInterfaceDumbPages try: from invenio.websession_webinterface import WebInterfaceYourTicketsPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceYourTicketsPages = WebInterfaceDumbPages try: from invenio.websession_webinterface import WebInterfaceYourGroupsPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceYourGroupsPages = WebInterfaceDumbPages try: from invenio.webalert_webinterface import WebInterfaceYourAlertsPages except: register_exception(alert_admin=True, subject='EMERGENCY') 
WebInterfaceYourAlertsPages = WebInterfaceDumbPages try: from invenio.webbasket_webinterface import WebInterfaceYourBasketsPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceYourBasketsPages = WebInterfaceDumbPages try: from invenio.webcomment_webinterface import WebInterfaceCommentsPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceCommentsPages = WebInterfaceDumbPages try: from invenio.webmessage_webinterface import WebInterfaceYourMessagesPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceYourMessagesPages = WebInterfaceDumbPages try: from invenio.errorlib_webinterface import WebInterfaceErrorPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceErrorPages = WebInterfaceDumbPages try: from invenio.oai_repository_webinterface import WebInterfaceOAIProviderPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceOAIProviderPages = WebInterfaceDumbPages try: from invenio.webstat_webinterface import WebInterfaceStatsPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceStatsPages = WebInterfaceDumbPages try: from invenio.bibcirculation_webinterface import WebInterfaceYourLoansPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceYourLoansPages = WebInterfaceDumbPages try: from invenio.webjournal_webinterface import WebInterfaceJournalPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceJournalPages = WebInterfaceDumbPages try: from invenio.webdoc_webinterface import WebInterfaceDocumentationPages except: register_exception(alert_admin=True, subject='EMERGENCY') WebInterfaceDocumentationPages = WebInterfaceDumbPages try: from invenio.bibexport_method_fieldexporter_webinterface import \ WebInterfaceFieldExporterPages except: register_exception(alert_admin=True, subject='EMERGENCY') - WebInterfaceDocumentationPages = 
WebInterfaceDumbPages + WebInterfaceFieldExporterPages = WebInterfaceDumbPages class WebInterfaceInvenio(WebInterfaceSearchInterfacePages): """ The global URL layout is composed of the search API plus all the other modules.""" _exports = WebInterfaceSearchInterfacePages._exports + \ WebInterfaceAuthorPages._exports + [ 'youraccount', 'youralerts', 'yourbaskets', 'yourmessages', 'yourloans', 'yourgroups', 'yourtickets', 'comments', 'error', 'oai2d', ('oai2d.py', 'oai2d'), ('getfile.py', 'getfile'), 'submit', 'rss', 'stats', 'journal', 'help', 'unapi', 'exporter' ] def __init__(self): self.getfile = websubmit_legacy_getfile author = WebInterfaceAuthorPages() submit = WebInterfaceSubmitPages() youraccount = WebInterfaceYourAccountPages() youralerts = WebInterfaceYourAlertsPages() yourbaskets = WebInterfaceYourBasketsPages() yourmessages = WebInterfaceYourMessagesPages() yourloans = WebInterfaceYourLoansPages() yourgroups = WebInterfaceYourGroupsPages() yourtickets = WebInterfaceYourTicketsPages() comments = WebInterfaceCommentsPages() error = WebInterfaceErrorPages() oai2d = WebInterfaceOAIProviderPages() rss = WebInterfaceRSSFeedServicePages() stats = WebInterfaceStatsPages() journal = WebInterfaceJournalPages() help = WebInterfaceDocumentationPages() unapi = WebInterfaceUnAPIPages() exporter = WebInterfaceFieldExporterPages() # This creates the 'handler' function, which will be invoked directly # by mod_python. -handler = create_handler(WebInterfaceInvenio()) +invenio_handler = create_handler(WebInterfaceInvenio()) diff --git a/modules/webstyle/lib/webinterface_tests.py b/modules/webstyle/lib/webinterface_tests.py index 28f18c764..e885aaf4d 100644 --- a/modules/webstyle/lib/webinterface_tests.py +++ b/modules/webstyle/lib/webinterface_tests.py @@ -1,137 +1,126 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. 
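The `webinterface_layout.py` module above wraps every module import in try/except and falls back to `WebInterfaceDumbPages`, so one broken module degrades to an error page instead of taking the whole site down. A minimal hedged sketch of that import-with-fallback pattern (the helper name and the deliberately missing module name are invented for illustration; in Invenio the `except` branch is where `register_exception()` alerts the admin):

```python
def import_with_fallback(module_name, attr, fallback):
    """Return module.attr, or the fallback object when the import breaks."""
    try:
        mod = __import__(module_name, fromlist=[attr])
        return getattr(mod, attr)
    except Exception:
        # Invenio calls register_exception(alert_admin=True, ...) here.
        return fallback

class DumbPage:
    """Stand-in for WebInterfaceDumbPages: answers every call with a notice."""
    def __call__(self, *args, **kwargs):
        return 'Service failure'

# 'no_such_module_xyz' is deliberately missing, so the fallback is returned.
Page = import_with_fallback('no_such_module_xyz', 'Page', DumbPage)
print(Page()())  # → Service failure
```

The trade-off, visible in the patch itself, is that a typo in the fallback assignment (the fixed `WebInterfaceFieldExporterPages` line) goes unnoticed until the failure path is exercised.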
## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """Unit tests for the webinterface module.""" __revision__ = "$Id$" import unittest, sys, cgi from invenio.testutils import make_test_suite, run_test_suite # SLIPPERY SLOPE AHEAD # # Trick mod_python into believing there is already an _apache module # available, which is used only for its parse_qs functions anyway. # # This must be done early, as many imports somehow end up importing # apache in turn, which makes the trick useless. 
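The "SLIPPERY SLOPE" comment in the removed test code describes planting a stub in `sys.modules` before anything imports the compiled `_apache` module. A hedged standalone sketch of that technique (the module name `some_c_extension` is made up, since `mod_python` is not assumed to be installed here):

```python
import sys
import types

# Build a stand-in module and register it BEFORE the real import happens;
# Python consults sys.modules first, so no compiled code is needed.
fake = types.ModuleType('some_c_extension')
fake.answer = lambda: 42

saved = sys.modules.get('some_c_extension')
sys.modules['some_c_extension'] = fake

import some_c_extension  # resolves to the stub

print(some_c_extension.answer())  # → 42

# Restore the previous state so later imports see the real module (if any),
# mirroring the _current_module save/restore in the removed code.
if saved is not None:
    sys.modules['some_c_extension'] = saved
else:
    del sys.modules['some_c_extension']
```

As the original comment warns, the stub must be installed early: once any import chain has pulled in the real module, replacing the `sys.modules` entry no longer affects names already bound.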
class _FakeApache(object): SERVER_RETURN = 'RETURN' def __init__(self): self.table = None self.log_error = None self.table = None self.config_tree = None self.server_root = None self.mpm_query = lambda dummy: False self.exists_config_define = None self.stat = None self.AP_CONN_UNKNOWN = None self.AP_CONN_CLOSE = None self.AP_CONN_KEEPALIVE = None self.APR_NOFILE = None self.APR_REG = None self.APR_DIR = None self.APR_CHR = None self.APR_BLK = None self.APR_PIPE = None self.APR_LNK = None self.APR_SOCK = None self.APR_UNKFILE = None def parse_qs(self, *args, **kargs): return cgi.parse_qs(*args, **kargs) def parse_qsl(self, *args, **kargs): return cgi.parse_qsl(*args, **kargs) class _FakeReq(object): def __init__(self, q): self.args = q self.method = "GET" return -_current_module = sys.modules.get('mod_python._apache') - -sys.modules['mod_python._apache'] = _FakeApache() - -from mod_python.util import FieldStorage - -if _current_module: - sys.modules['mod_python._apache'] = _current_module -else: - del sys.modules['mod_python._apache'] - - +from invenio.webinterface_handler_wsgi_utils import FieldStorage # -------------------------------------------------- from invenio import webinterface_handler from invenio.config import CFG_SITE_LANG class TestWashArgs(unittest.TestCase): """webinterface - Test for washing of URL query arguments""" def _check(self, query, default, expected): req = _FakeReq(query) form = FieldStorage(req, keep_blank_values=True) result = webinterface_handler.wash_urlargd(form, default) if not 'ln' in expected: expected['ln'] = CFG_SITE_LANG self.failUnlessEqual(result, expected) def test_single_string(self): """ webinterface - check retrieval of a single string field """ default = {'c': (str, 'default')} self._check('c=Test1', default, {'c': 'Test1'}) self._check('d=Test1', default, {'c': 'default'}) self._check('c=Test1&c=Test2', default, {'c': 'Test1'}) def test_string_list(self): """ webinterface - check retrieval of a list of values """ default 
= {'c': (list, ['default'])} self._check('c=Test1', default, {'c': ['Test1']}) self._check('c=Test1&c=Test2', default, {'c': ['Test1', 'Test2']}) self._check('d=Test1', default, {'c': ['default']}) def test_int_casting(self): """ webinterface - check casting into an int. """ default = {'jrec': (int, -1)} self._check('jrec=12', default, {'jrec': 12}) self._check('jrec=', default, {'jrec': -1}) self._check('jrec=foo', default, {'jrec': -1}) self._check('jrec=foo&jrec=1', default, {'jrec': -1}) self._check('jrec=12&jrec=foo', default, {'jrec': 12}) TEST_SUITE = make_test_suite(TestWashArgs,) if __name__ == "__main__": run_test_suite(TEST_SUITE) diff --git a/modules/webstyle/lib/webstyle_templates.py b/modules/webstyle/lib/webstyle_templates.py index 1037f99d2..2d395dafb 100644 --- a/modules/webstyle/lib/webstyle_templates.py +++ b/modules/webstyle/lib/webstyle_templates.py @@ -1,847 +1,847 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """ WebStyle templates. 
Customize the look of pages of CDS Invenio """ __revision__ = \ "$Id$" import time import cgi import traceback import urllib import sys import string from invenio.config import \ CFG_SITE_LANG, \ CFG_SITE_NAME, \ CFG_SITE_NAME_INTL, \ CFG_SITE_SUPPORT_EMAIL, \ CFG_SITE_SECURE_URL, \ CFG_SITE_URL, \ CFG_VERSION, \ CFG_WEBSTYLE_INSPECT_TEMPLATES from invenio.messages import gettext_set_language, language_list_long from invenio.urlutils import make_canonical_urlargd, create_html_link from invenio.dateutils import convert_datecvs_to_datestruct, \ convert_datestruct_to_dategui from invenio.bibformat import format_record from invenio import template websearch_templates = template.load('websearch') class Template: def tmpl_navtrailbox_body(self, ln, title, previous_links, separator, prolog, epilog): """Create navigation trail box body Parameters: - 'ln' *string* - The language to display - 'title' *string* - page title; - 'previous_links' *string* - the trail content from site title until current page (both ends exlusive) - 'prolog' *string* - HTML code to prefix the navtrail item with - 'epilog' *string* - HTML code to suffix the navtrail item with - 'separator' *string* - HTML code that separates two navtrail items Output: - text containing the navtrail """ # load the right message language _ = gettext_set_language(ln) out = "" if title != CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME): out += create_html_link(CFG_SITE_URL, {'ln': ln}, _("Home"), {'class': 'navtrail'}) if previous_links: if out: out += separator out += previous_links if title: if out: out += separator if title == CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME): # hide site name, print Home instead out += cgi.escape(_("Home")) else: out += cgi.escape(title) return cgi.escape(prolog) + out + cgi.escape(epilog) def tmpl_page(self, req=None, ln=CFG_SITE_LANG, description="", keywords="", userinfobox="", navtrailbox="", pageheaderadd="", boxlefttop="", boxlefttopadd="", boxleftbottom="", boxleftbottomadd="", 
boxrighttop="", boxrighttopadd="", boxrightbottom="", boxrightbottomadd="", titleprologue="", title="", titleepilogue="", body="", lastupdated=None, pagefooteradd="", uid=0, secure_page_p=0, navmenuid="", metaheaderadd="", rssurl=CFG_SITE_URL+"/rss", show_title_p=True): """Creates a complete page Parameters: - 'ln' *string* - The language to display - 'description' *string* - description goes to the metadata in the header of the HTML page - 'keywords' *string* - keywords goes to the metadata in the header of the HTML page - 'userinfobox' *string* - the HTML code for the user information box - 'navtrailbox' *string* - the HTML code for the navigation trail box - 'pageheaderadd' *string* - additional page header HTML code - 'boxlefttop' *string* - left-top box HTML code - 'boxlefttopadd' *string* - additional left-top box HTML code - 'boxleftbottom' *string* - left-bottom box HTML code - 'boxleftbottomadd' *string* - additional left-bottom box HTML code - 'boxrighttop' *string* - right-top box HTML code - 'boxrighttopadd' *string* - additional right-top box HTML code - 'boxrightbottom' *string* - right-bottom box HTML code - 'boxrightbottomadd' *string* - additional right-bottom box HTML code - 'title' *string* - the title of the page - 'titleprologue' *string* - what to print before page title - 'titleepilogue' *string* - what to print after page title - 'body' *string* - the body of the page - 'lastupdated' *string* - when the page was last updated - 'uid' *int* - user ID - 'pagefooteradd' *string* - additional page footer HTML code - 'secure_page_p' *int* (0 or 1) - are we to use HTTPS friendly page elements or not? - 'navmenuid' *string* - the id of the navigation item to highlight for this page - 'metaheaderadd' *string* - list of further tags to add to the <HEAD></HEAD> part of the page - 'rssurl' *string* - the url of the RSS feed for this page - 'show_title_p' *int* (0 or 1) - do we display the page title in the body of the page? 
Output: - HTML code of the page """ # load the right message language _ = gettext_set_language(ln) out = self.tmpl_pageheader(req, ln = ln, headertitle = title, description = description, keywords = keywords, metaheaderadd = metaheaderadd, userinfobox = userinfobox, navtrailbox = navtrailbox, pageheaderadd = pageheaderadd, secure_page_p = secure_page_p, navmenuid=navmenuid, rssurl=rssurl) + """ <div class="pagebody"> <div class="pagebodystripeleft"> <div class="pageboxlefttop">%(boxlefttop)s</div> <div class="pageboxlefttopadd">%(boxlefttopadd)s</div> <div class="pageboxleftbottomadd">%(boxleftbottomadd)s</div> <div class="pageboxleftbottom">%(boxleftbottom)s</div> </div> <div class="pagebodystriperight"> <div class="pageboxrighttop">%(boxrighttop)s</div> <div class="pageboxrighttopadd">%(boxrighttopadd)s</div> <div class="pageboxrightbottomadd">%(boxrightbottomadd)s</div> <div class="pageboxrightbottom">%(boxrightbottom)s</div> </div> <div class="pagebodystripemiddle"> %(titleprologue)s %(title)s %(titleepilogue)s %(body)s </div> </div> """ % { 'boxlefttop' : boxlefttop, 'boxlefttopadd' : boxlefttopadd, 'boxleftbottom' : boxleftbottom, 'boxleftbottomadd' : boxleftbottomadd, 'boxrighttop' : boxrighttop, 'boxrighttopadd' : boxrighttopadd, 'boxrightbottom' : boxrightbottom, 'boxrightbottomadd' : boxrightbottomadd, 'titleprologue' : titleprologue, 'title' : (title and show_title_p) and '<h1 class="headline">' + cgi.escape(title) + '</h1>' or '', 'titleepilogue' : titleepilogue, 'body' : body, } + self.tmpl_pagefooter(req, ln = ln, lastupdated = lastupdated, pagefooteradd = pagefooteradd) return out def tmpl_pageheader(self, req, ln=CFG_SITE_LANG, headertitle="", description="", keywords="", userinfobox="", navtrailbox="", pageheaderadd="", uid=0, secure_page_p=0, navmenuid="admin", metaheaderadd="", rssurl=CFG_SITE_URL+"/rss"): """Creates a page header Parameters: - 'ln' *string* - The language to display - 'headertitle' *string* - the second part of the page HTML 
title - 'description' *string* - description goes to the metadata in the header of the HTML page - 'keywords' *string* - keywords goes to the metadata in the header of the HTML page - 'userinfobox' *string* - the HTML code for the user information box - 'navtrailbox' *string* - the HTML code for the navigation trail box - 'pageheaderadd' *string* - additional page header HTML code - 'uid' *int* - user ID - 'secure_page_p' *int* (0 or 1) - are we to use HTTPS friendly page elements or not? - 'navmenuid' *string* - the id of the navigation item to highlight for this page - 'metaheaderadd' *string* - list of further tags to add to the <HEAD></HEAD> part of the page - 'rssurl' *string* - the url of the RSS feed for this page Output: - HTML code of the page headers """ # load the right message language _ = gettext_set_language(ln) if CFG_WEBSTYLE_INSPECT_TEMPLATES: inspect_templates_message = """ <table width="100%%" cellspacing=0 cellpadding=2 border=0> <tr bgcolor="#aa0000"> <td width="100%%"> <font color="#ffffff"> <strong> <small> CFG_WEBSTYLE_INSPECT_TEMPLATES debugging mode is enabled. Please hover your mouse pointer over any region on the page to see which template function generated it. 
</small> </strong> </font> </td> </tr> </table> """ else: inspect_templates_message = "" out = """\ <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>%(headertitle)s - %(sitename)s</title> <link rev="made" href="mailto:%(sitesupportemail)s" /> <link rel="stylesheet" href="%(cssurl)s/img/cds.css" type="text/css" /> <link rel="alternate" type="application/rss+xml" title="%(sitename)s RSS" href="%(rssurl)s" /> <link rel="unapi-server" type="application/xml" title="unAPI" href="%(unAPIurl)s" /> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <meta http-equiv="Content-Language" content="%(ln)s" /> <meta name="description" content="%(description)s" /> <meta name="keywords" content="%(keywords)s" /> %(metaheaderadd)s </head> <body> <div class="pageheader"> %(inspect_templates_message)s <!-- replaced page header --> <div style="background-image: url(%(cssurl)s/img/header_background.gif);"> <table class="headerbox"> <tr> <td class="headerboxbodylogo"> %(sitename)s </td> <td align="right" valign="top" class="userinfoboxbody"> %(userinfobox)s </td> </tr> <tr> <td class="headerboxbody" valign="bottom" align="left"> <table class="headermodulebox" width="100%%"><tr><td class="headermoduleboxbodyblanklast"> </td></tr></table> </td> <td class="headerboxbody" valign="bottom" align="left"> <table class="headermodulebox"> <tr> <td class="headermoduleboxbodyblank"> </td> <td class="headermoduleboxbodyblank"> </td> <td class="headermoduleboxbody%(search_selected)s"> <a class="header%(search_selected)s" href="%(siteurl)s/?ln=%(ln)s">%(msg_search)s</a> </td> <td class="headermoduleboxbodyblank"> </td> <td class="headermoduleboxbody%(submit_selected)s"> <a class="header%(submit_selected)s" href="%(siteurl)s/submit?ln=%(ln)s">%(msg_submit)s</a> </td> <td class="headermoduleboxbodyblank"> </td> <td 
class="headermoduleboxbody%(personalize_selected)s"> <a class="header%(personalize_selected)s" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(msg_personalize)s</a> </td> <td class="headermoduleboxbodyblank"> </td> <td class="headermoduleboxbody%(help_selected)s"> <a class="header%(help_selected)s" href="%(siteurl)s/help/%(langlink)s">%(msg_help)s</a> </td> <td class="headermoduleboxbodyblanklast"> </td> </tr> </table> </td> </tr> </table> </div> <table class="navtrailbox"> <tr> <td class="navtrailboxbody"> %(navtrailbox)s </td> </tr> </table> <!-- end replaced page header --> %(pageheaderadd)s </div> """ % { 'siteurl' : CFG_SITE_URL, 'sitesecureurl' : CFG_SITE_SECURE_URL, 'cssurl' : secure_page_p and CFG_SITE_SECURE_URL or CFG_SITE_URL, 'rssurl': rssurl, 'ln' : ln, 'langlink': '?ln=' + ln, 'sitename' : CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME), 'headertitle' : cgi.escape(headertitle), 'sitesupportemail' : CFG_SITE_SUPPORT_EMAIL, 'description' : cgi.escape(description), 'keywords' : cgi.escape(keywords), 'metaheaderadd' : metaheaderadd, 'userinfobox' : userinfobox, 'navtrailbox' : navtrailbox, 'pageheaderadd' : pageheaderadd, 'search_selected': navmenuid == 'search' and "selected" or "", 'submit_selected': navmenuid == 'submit' and "selected" or "", 'personalize_selected': navmenuid.startswith('your') and "selected" or "", 'help_selected': navmenuid == 'help' and "selected" or "", 'msg_search' : _("Search"), 'msg_submit' : _("Submit"), 'msg_personalize' : _("Personalize"), 'msg_help' : _("Help"), 'languagebox' : self.tmpl_language_selection_box(req, ln), 'unAPIurl' : cgi.escape('%s/unapi' % CFG_SITE_URL), 'inspect_templates_message' : inspect_templates_message } return out def tmpl_pagefooter(self, req=None, ln=CFG_SITE_LANG, lastupdated=None, pagefooteradd=""): """Creates a page footer Parameters: - 'ln' *string* - The language to display - 'lastupdated' *string* - when the page was last updated - 'pagefooteradd' *string* - additional page footer HTML 
code Output: - HTML code of the page headers """ # load the right message language _ = gettext_set_language(ln) if lastupdated: if lastupdated.startswith("$Date: ") or \ lastupdated.startswith("$Id: "): lastupdated = convert_datestruct_to_dategui(\ convert_datecvs_to_datestruct(lastupdated), ln=ln) msg_lastupdated = _("Last updated") + ": " + lastupdated else: msg_lastupdated = "" out = """ <div class="pagefooter"> %(pagefooteradd)s <!-- replaced page footer --> <div class="pagefooterstripeleft"> %(sitename)s :: <a class="footer" href="%(siteurl)s/?ln=%(ln)s">%(msg_search)s</a> :: <a class="footer" href="%(siteurl)s/submit?ln=%(ln)s">%(msg_submit)s</a> :: <a class="footer" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(msg_personalize)s</a> :: <a class="footer" href="%(siteurl)s/help/%(langlink)s">%(msg_help)s</a> <br /> %(msg_poweredby)s <a class="footer" href="http://cdsware.cern.ch/">CDS Invenio</a> v%(version)s <br /> %(msg_maintainedby)s <a class="footer" href="mailto:%(sitesupportemail)s">%(sitesupportemail)s</a> <br /> %(msg_lastupdated)s </div> <div class="pagefooterstriperight"> %(languagebox)s </div> <!-- replaced page footer --> </div> </body> </html> """ % { 'siteurl' : CFG_SITE_URL, 'sitesecureurl' : CFG_SITE_SECURE_URL, 'ln' : ln, 'langlink': '?ln=' + ln, 'sitename' : CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME), 'sitesupportemail' : CFG_SITE_SUPPORT_EMAIL, 'msg_search' : _("Search"), 'msg_submit' : _("Submit"), 'msg_personalize' : _("Personalize"), 'msg_help' : _("Help"), 'msg_poweredby' : _("Powered by"), 'msg_maintainedby' : _("Maintained by"), 'msg_lastupdated' : msg_lastupdated, 'languagebox' : self.tmpl_language_selection_box(req, ln), 'version' : CFG_VERSION, 'pagefooteradd' : pagefooteradd, } return out def tmpl_language_selection_box(self, req, language=CFG_SITE_LANG): """Take URLARGS and LANGUAGE and return textual language selection box for the given page. 
Parameters: - 'req' - The mod_python request object - 'language' *string* - The selected language """ # load the right message language _ = gettext_set_language(language) # Work on a copy in order not to bork the arguments of the caller argd = {} if req and req.args: argd.update(cgi.parse_qs(req.args)) parts = [] for (lang, lang_namelong) in language_list_long(): if lang == language: parts.append('<span class="langinfo">%s</span>' % lang_namelong) else: # Update the 'ln' argument in the initial request argd['ln'] = lang if req and req.uri: args = urllib.quote(req.uri, '/:?') + make_canonical_urlargd(argd, {}) else: args = "" parts.append(create_html_link(args, {}, lang_namelong, {'class': "langinfo"})) if len(parts) > 1: return _("This site is also available in the following languages:") + \ "<br />" + ' '.join(parts) else: ## There is only one (or zero?) language configured, ## so there is no need to display language alternatives. return "" def tmpl_error_box(self, ln, title, verbose, req, errors): """Produces an error box. 
Parameters: - 'title' *string* - The title of the error box - 'ln' *string* - The selected language - 'verbose' *bool* - If lots of information should be displayed - 'req' *object* - the request object - 'errors' list of tuples (error_code, error_message) """ # load the right message language _ = gettext_set_language(ln) info_not_available = _("N/A") if title is None: if errors: title = _("Error") + ': %s' % errors[0][1] else: title = _("Internal Error") browser_s = _("Browser") if req: try: if req.headers_in.has_key('User-Agent'): browser_s += ': ' + req.headers_in['User-Agent'] else: browser_s += ': ' + info_not_available host_s = req.hostname page_s = req.unparsed_uri - client_s = req.connection.remote_ip + client_s = req.remote_ip except: # FIXME: bad except browser_s += ': ' + info_not_available host_s = page_s = client_s = info_not_available else: browser_s += ': ' + info_not_available host_s = page_s = client_s = info_not_available error_s = '' sys_error_s = '' traceback_s = '' if verbose >= 1: if sys.exc_info()[0]: sys_error_s = '\n' + _("System Error") + ': %s %s\n' % \ (sys.exc_info()[0], sys.exc_info()[1]) if errors: errs = '' for error_tuple in errors: try: errs += "%s%s : %s\n " % (' '*6, error_tuple[0], error_tuple[1]) except: errs += "%s%s\n" % (' '*6, error_tuple) errs = errs[6:-2] # get rid of trailing ',' error_s = _("Error") + ': %s")' % errs + "\n" else: error_s = _("Error") + ': ' + info_not_available if verbose >= 9: traceback_s = '\n' + _("Traceback") + ': \n%s' % \ string.join(traceback.format_tb(sys.exc_info()[2]), "\n") out = """ <table class="errorbox"> <thead> <tr> <th class="errorboxheader"> <p> %(title)s %(sys1)s %(sys2)s</p> </th> </tr> </thead> <tbody> <tr> <td class="errorboxbody"> <p>%(contact)s</p> <blockquote><pre> URI: http://%(host)s%(page)s %(time_label)s: %(time)s %(browser)s %(client_label)s: %(client)s %(error)s%(sys_error)s%(traceback)s </pre></blockquote> </td> </tr> <tr> <td> <form action="%(siteurl)s/error/send" 
method="post"> %(send_error_label)s <input class="adminbutton" type="submit" value="%(send_label)s" /> <input type="hidden" name="header" value="%(title)s %(sys1)s %(sys2)s" /> <input type="hidden" name="url" value="URI: http://%(host)s%(page)s" /> <input type="hidden" name="time" value="Time: %(time)s" /> <input type="hidden" name="browser" value="%(browser)s" /> <input type="hidden" name="client" value="Client: %(client)s" /> <input type="hidden" name="error" value="%(error)s" /> <input type="hidden" name="sys_error" value="%(sys_error)s" /> <input type="hidden" name="traceback" value="%(traceback)s" /> <input type="hidden" name="referer" value="%(referer)s" /> </form> </td> </tr> </tbody> </table> """ % { 'title' : cgi.escape(title).replace('"', '&quot;'), 'time_label': _("Time"), 'client_label': _("Client"), 'send_error_label': \ _("Please send an error report to the administrator."), 'send_label': _("Send error report"), 'sys1' : cgi.escape(str((sys.exc_info()[0] or ''))).replace('"', '&quot;'), 'sys2' : cgi.escape(str((sys.exc_info()[1] or ''))).replace('"', '&quot;'), 'contact' : \ _("Please contact %s quoting the following information:") % \ ('<a href="mailto:' + urllib.quote(CFG_SITE_SUPPORT_EMAIL) +'">' + \ CFG_SITE_SUPPORT_EMAIL + '</a>'), 'host' : cgi.escape(host_s), 'page' : cgi.escape(page_s), 'time' : time.strftime("%d/%b/%Y:%H:%M:%S %z"), 'browser' : cgi.escape(browser_s).replace('"', '&quot;'), 'client' : cgi.escape(client_s).replace('"', '&quot;'), 'error' : cgi.escape(error_s).replace('"', '&quot;'), 'traceback' : cgi.escape(traceback_s).replace('"', '&quot;'), 'sys_error' : cgi.escape(sys_error_s).replace('"', '&quot;'), 'siteurl' : CFG_SITE_URL, 'referer' : page_s!=info_not_available and \ ("http://" + host_s + page_s) or \ info_not_available } return out def detailed_record_container_top(self, recid, tabs, ln=CFG_SITE_LANG, show_similar_rec_p=True, creationdate=None, modificationdate=None, show_short_rec_p=True): """Prints the box displayed in detailed records pages, with tabs at 
the top. Returns content as it is if the number of tabs for this record is smaller than 2 Parameters: - recid *int* - the id of the displayed record - tabs ** - the tabs displayed at the top of the box. - ln *string* - the language of the page in which the box is displayed - show_similar_rec_p *bool* print 'similar records' link in the box - creationdate *string* - the creation date of the displayed record - modificationdate *string* - the last modification date of the displayed record - show_short_rec_p *boolean* - prints a very short version of the record as reminder. """ # If no tabs, returns nothing if len(tabs) <= 1: return '' # load the right message language _ = gettext_set_language(ln) # Build the tabs at the top of the page out_tabs = '' if len(tabs) > 1: first_tab = True for (label, url, selected, enabled) in tabs: css_class = [] if selected: css_class.append('on') if first_tab: css_class.append('first') first_tab = False if not enabled: css_class.append('disabled') css_class = ' class="%s"' % ' '.join(css_class) if not enabled: out_tabs += '<li%(class)s><a>%(label)s</a></li>' % \ {'class':css_class, 'label':label} else: out_tabs += '<li%(class)s><a href="%(url)s">%(label)s</a></li>' % \ {'class':css_class, 'url':url, 'label':label} if out_tabs != '': out_tabs = ''' <div class="detailedrecordtabs"> <div> <ul class="detailedrecordtabs">%s</ul> <div id="tabsSpacer" style="clear:both;height:0px"> </div></div> </div>''' % out_tabs # Add the clip icon and the brief record reminder if necessary record_brief = '' if show_short_rec_p: record_brief = format_record(recID=recid, of='hs', ln=ln) record_brief = '''<div id="detailedrecordshortreminder"> <div id="clip"> </div> <div id="HB"> %(record_brief)s </div> </div> <div style="clear:both;height:1px"> </div> ''' % {'record_brief': record_brief} # Print the content out = """ <div class="detailedrecordbox"> %(tabs)s <div class="detailedrecordboxcontent"> <div class="top-left-folded"></div> <div 
class="top-right-folded"></div> <div class="inside"> <!--<div style="height:0.1em;"> </div> <p class="notopgap"> </p>--> %(record_brief)s """ % {'tabs':out_tabs, 'record_brief':record_brief} return out def detailed_record_container_bottom(self, recid, tabs, ln=CFG_SITE_LANG, show_similar_rec_p=True, creationdate=None, modificationdate=None, show_short_rec_p=True): """Prints the box displayed in detailed records pages, with tabs at the top. Returns content as it is if the number of tabs for this record is smaller than 2 Parameters: - recid *int* - the id of the displayed record - tabs ** - the tabs displayed at the top of the box. - ln *string* - the language of the page in which the box is displayed - show_similar_rec_p *bool* print 'similar records' link in the box - creationdate *string* - the creation date of the displayed record - modificationdate *string* - the last modification date of the displayed record - show_short_rec_p *boolean* - prints a very short version of the record as reminder. 
""" # If no tabs, returns nothing if len(tabs) <= 1: return '' # load the right message language _ = gettext_set_language(ln) out = """ <p class="nobottomgap" > </p> </div> <div class="bottom-left-folded">%(dates)s</div> <div class="bottom-right-folded" style="text-align:right"><span class="moreinfo" style="margin-right:25px">%(similar)s</span></div> </div> </div> <br/> """ % {'similar':create_html_link( websearch_templates.build_search_url(p='recid:%d' % \ recid, rm='wrd', ln=ln), {}, _("Similar records"), {'class': "moreinfo"}), 'dates':creationdate and '<div class="recordlastmodifiedbox" style="float:left;position:relative;margin-left:1px"> %(dates)s</div>' % { 'dates': _("Record created %(x_date_creation)s, last modified %(x_date_modification)s") % \ {'x_date_creation': creationdate, 'x_date_modification': modificationdate}, } or '' } return out def detailed_record_mini_panel(self, recid, ln=CFG_SITE_LANG, format='hd', files='', reviews='', actions=''): """Displays the actions dock at the bottom of the detailed record pages. 
Parameters: - recid *int* - the id of the displayed record - ln *string* - interface language code - format *string* - the format used to display the record - files *string* - the small panel representing the fulltext - reviews *string* - the small panel representing the reviews - actions *string* - the small panel representing the possible user's action """ # load the right message language _ = gettext_set_language(ln) out = """ <br /> <div class="detailedrecordminipanel"> <div class="top-left"></div><div class="top-right"></div> <div class="inside"> <div id="detailedrecordminipanelfile" style="width:33%%;float:left;text-align:center;margin-top:0"> %(files)s </div> <div id="detailedrecordminipanelreview" style="width:30%%;float:left;text-align:center"> %(reviews)s </div> <div id="detailedrecordminipanelactions" style="width:36%%;float:right;text-align:right;"> %(actions)s </div> <div style="clear:both;margin-bottom: 0;"></div> </div> <div class="bottom-left"></div><div class="bottom-right"></div> </div> """ % { 'siteurl': CFG_SITE_URL, 'ln':ln, 'recid':recid, 'files': files, 'reviews':reviews, 'actions': actions, } return out diff --git a/modules/websubmit/lib/bibdocfile.py b/modules/websubmit/lib/bibdocfile.py index 7b1a56f4b..238995c72 100644 --- a/modules/websubmit/lib/bibdocfile.py +++ b/modules/websubmit/lib/bibdocfile.py @@ -1,2528 +1,2525 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. 
## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. __revision__ = "$Id$" import os import re import shutil import filecmp import time import random import socket import urllib2 import urllib import tempfile import cPickle import base64 import binascii import cgi import sys if sys.hexversion < 0x2060000: from md5 import md5 else: from hashlib import md5 try: import magic CFG_HAS_MAGIC = True except ImportError: CFG_HAS_MAGIC = False from datetime import datetime from mimetypes import MimeTypes from thread import get_ident -try: - from mod_python import apache -except ImportError: - pass +from invenio import webinterface_handler_wsgi_utils as apache ## Let's set a reasonable timeout for URL request (e.g. FFT) socket.setdefaulttimeout(40) if sys.hexversion < 0x2040000: # pylint: disable-msg=W0622 from sets import Set as set # pylint: enable-msg=W0622 from invenio.shellutils import escape_shell_arg from invenio.dbquery import run_sql, DatabaseError, blob_to_string from invenio.errorlib import register_exception from invenio.bibrecord import record_get_field_instances, \ field_get_subfield_values, field_get_subfield_instances, \ encode_for_xml from invenio.access_control_engine import acc_authorize_action from invenio.config import CFG_SITE_LANG, CFG_SITE_URL, \ CFG_WEBDIR, CFG_WEBSUBMIT_FILEDIR,\ CFG_WEBSUBMIT_ADDITIONAL_KNOWN_FILE_EXTENSIONS, \ CFG_WEBSUBMIT_FILESYSTEM_BIBDOC_GROUP_LIMIT, CFG_SITE_SECURE_URL, \ CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS, \ CFG_TMPDIR, CFG_PATH_MD5SUM, \ CFG_WEBSUBMIT_STORAGEDIR from invenio.bibformat import format_record import invenio.template websubmit_templates = invenio.template.load('websubmit') websearch_templates = invenio.template.load('websearch') CFG_BIBDOCFILE_MD5_THRESHOLD = 256 * 1024 CFG_BIBDOCFILE_MD5_BUFFER = 1024 * 1024 CFG_BIBDOCFILE_STRONG_FORMAT_NORMALIZATION = 
False KEEP_OLD_VALUE = 'KEEP-OLD-VALUE' _mimes = MimeTypes(strict=False) _mimes.suffix_map.update({'.tbz2' : '.tar.bz2'}) _mimes.encodings_map.update({'.bz2' : 'bzip2'}) _magic_cookies = {} def get_magic_cookies(): """Return a tuple of magic object. ... not real magic. Just see: man file(1)""" thread_id = get_ident() if thread_id not in _magic_cookies: _magic_cookies[thread_id] = { magic.MAGIC_NONE : magic.open(magic.MAGIC_NONE), magic.MAGIC_COMPRESS : magic.open(magic.MAGIC_COMPRESS), magic.MAGIC_MIME : magic.open(magic.MAGIC_MIME), magic.MAGIC_COMPRESS + magic.MAGIC_MIME : magic.open(magic.MAGIC_COMPRESS + magic.MAGIC_MIME) } for key in _magic_cookies[thread_id].keys(): _magic_cookies[thread_id][key].load() return _magic_cookies[thread_id] def _generate_extensions(): _tmp_extensions = _mimes.encodings_map.keys() + \ _mimes.suffix_map.keys() + \ _mimes.types_map[1].keys() + \ CFG_WEBSUBMIT_ADDITIONAL_KNOWN_FILE_EXTENSIONS extensions = [] for ext in _tmp_extensions: if ext.startswith('.'): extensions.append(ext) else: extensions.append('.' + ext) extensions.sort() extensions.reverse() extensions = set([ext.lower() for ext in extensions]) extensions = '\\' + '$|\\'.join(extensions) + '$' extensions = extensions.replace('+', '\\+') return re.compile(extensions, re.I) _extensions = _generate_extensions() class InvenioWebSubmitFileError(Exception): pass def file_strip_ext(afile, skip_version=False): """Strip in the best way the extension from a filename""" if skip_version: afile = afile.split(';')[0] nextfile = _extensions.sub('', afile) if nextfile == afile: nextfile = os.path.splitext(afile)[0] while nextfile != afile: afile = nextfile nextfile = _extensions.sub('', afile) return nextfile def normalize_format(format): """Normalize the format.""" if format and format[0] != '.': format = '.' 
+ format if CFG_BIBDOCFILE_STRONG_FORMAT_NORMALIZATION: if format not in ('.Z', '.H', '.C', '.CC'): format = format.lower() format = { '.jpg' : '.jpeg', '.htm' : '.html', '.tif' : '.tiff' }.get(format, format) return format _docname_re = re.compile(r'[^-\w.]*') def normalize_docname(docname): """Normalize the docname (only digit and alphabetic letters and underscore are allowed)""" #return _docname_re.sub('', docname) return docname def normalize_version(version): """Normalize the version.""" try: int(version) except ValueError: if version.lower().strip() == 'all': return 'all' else: return '' return str(version) def decompose_file(afile, skip_version=False): """Decompose a file into dirname, basename and extension. Note that if provided with a URL, the scheme in front will be part of the dirname.""" if skip_version: version = afile.split(';')[-1] try: int(version) afile = afile[:-len(version)-1] except ValueError: pass basename = os.path.basename(afile) dirname = afile[:-len(basename)-1] base = file_strip_ext(basename) extension = basename[len(base) + 1:] if extension: extension = '.' + extension return (dirname, base, extension) def decompose_file_with_version(afile): """Decompose a file into dirname, basename, extension and version. In case version does not exist it will raise ValueError. Note that if provided with a URL, the scheme in front will be part of the dirname.""" version_str = afile.split(';')[-1] version = int(version_str) afile = afile[:-len(version_str)-1] basename = os.path.basename(afile) dirname = afile[:-len(basename)-1] base = file_strip_ext(basename) extension = basename[len(base) + 1:] if extension: extension = '.' 
+ extension
    return (dirname, base, extension, version)

def propose_next_docname(docname):
    """Propose the next docname by bumping a trailing numeric suffix."""
    if '_' in docname:
        split_docname = docname.split('_')
        try:
            split_docname[-1] = str(int(split_docname[-1]) + 1)
            docname = '_'.join(split_docname)
        except ValueError:
            docname += '_1'
    else:
        docname += '_1'
    return docname

class BibRecDocs:
    """This class represents all the files attached to one record."""
    def __init__(self, recid, deleted_too=False, human_readable=False):
        self.id = recid
        self.human_readable = human_readable
        self.deleted_too = deleted_too
        self.bibdocs = []
        self.build_bibdoc_list()

    def __repr__(self):
        if self.deleted_too:
            return 'BibRecDocs(%s, True)' % self.id
        else:
            return 'BibRecDocs(%s)' % self.id

    def __str__(self):
        out = '%i::::total bibdocs attached=%i\n' % (self.id, len(self.bibdocs))
        out += '%i::::total size latest version=%s\n' % (self.id, nice_size(self.get_total_size_latest_version()))
        out += '%i::::total size all files=%s\n' % (self.id, nice_size(self.get_total_size()))
        for bibdoc in self.bibdocs:
            out += str(bibdoc)
        return out

    def empty_p(self):
        """Return True if the bibrec is empty, i.e.
it has no bibdocs connected.""" return len(self.bibdocs) == 0 def deleted_p(self): """Return True if the bibrec has been deleted.""" from invenio.search_engine import record_exists return record_exists(self.id) == -1 def get_xml_8564(self): """Return a snippet of XML representing the 8564 corresponding to the current state""" from invenio.search_engine import get_record out = '' record = get_record(self.id) fields = record_get_field_instances(record, '856', '4', ' ') for field in fields: url = field_get_subfield_values(field, 'u') if not bibdocfile_url_p(url): out += '\t<datafield tag="856" ind1="4" ind2=" ">\n' for subfield, value in field_get_subfield_instances(field): out += '\t\t<subfield code="%s">%s</subfield>\n' % (subfield, encode_for_xml(value)) out += '\t</datafield>\n' for afile in self.list_latest_files(): out += '\t<datafield tag="856" ind1="4" ind2=" ">\n' url = afile.get_url() description = afile.get_description() comment = afile.get_comment() if url: out += '\t\t<subfield code="u">%s</subfield>\n' % encode_for_xml(url) if description: out += '\t\t<subfield code="y">%s</subfield>\n' % encode_for_xml(description) if comment: out += '\t\t<subfield code="z">%s</subfield>\n' % encode_for_xml(comment) out += '\t</datafield>\n' for bibdoc in self.bibdocs: icon = bibdoc.get_icon() if icon: icon = icon.list_all_files() if icon: out += '\t<datafield tag="856" ind1="4" ind2=" ">\n' out += '\t\t<subfield code="q">%s</subfield>\n' % encode_for_xml(icon[0].get_url()) out += '\t\t<subfield code="x">icon</subfield>\n' out += '\t</datafield>\n' return out def get_total_size_latest_version(self): """Return the total size used on disk of all the files belonging to this record and corresponding to the latest version.""" size = 0 for bibdoc in self.bibdocs: size += bibdoc.get_total_size_latest_version() return size def get_total_size(self): """Return the total size used on disk of all the files belonging to this record of any version.""" size = 0 for bibdoc in 
self.bibdocs:
            size += bibdoc.get_total_size()
        return size

    def build_bibdoc_list(self):
        """This function must be called every time a bibdoc connected to
        this recid is added, removed or modified.
        """
        self.bibdocs = []
        if self.deleted_too:
            res = run_sql("""SELECT id_bibdoc, type FROM bibrec_bibdoc JOIN bibdoc ON id=id_bibdoc WHERE id_bibrec=%s ORDER BY docname ASC""", (self.id,))
        else:
            res = run_sql("""SELECT id_bibdoc, type FROM bibrec_bibdoc JOIN bibdoc ON id=id_bibdoc WHERE id_bibrec=%s AND status<>'DELETED' ORDER BY docname ASC""", (self.id,))
        for row in res:
            cur_doc = BibDoc(docid=row[0], recid=self.id, doctype=row[1], human_readable=self.human_readable)
            self.bibdocs.append(cur_doc)

    def list_bibdocs(self, doctype=''):
        """Returns the list of all bibdoc objects belonging to a recid.
        If doctype is set, it returns just the bibdocs of that doctype.
        """
        if not doctype:
            return self.bibdocs
        else:
            return [bibdoc for bibdoc in self.bibdocs if doctype == bibdoc.doctype]

    def get_bibdoc_names(self, doctype=''):
        """Returns the names of the files associated with the bibdocs of a
        particular doctype."""
        return [bibdoc.docname for bibdoc in self.list_bibdocs(doctype)]

    def check_file_exists(self, path):
        """Returns 1 if the recid has a file identical to the one stored
        in path."""
        size = os.path.getsize(path)
        # Consider all the latest files...
        files = self.list_latest_files()
        # ...restricted to those with the same size...
        potential = [afile for afile in files if afile.get_size() == size]
        if potential:
            checksum = calculate_md5(path)
            # ...then to those with the same size and the same checksum.
            potential = [afile for afile in potential if afile.get_checksum() == checksum]
            if potential:
                potential = [afile for afile in potential if filecmp.cmp(afile.get_full_path(), path)]
                if potential:
                    return True
                else:
                    # Gosh! How unlucky: same size, same checksum, but not
                    # the same content!
                    pass
        return False

    def propose_unique_docname(self, docname):
        """Propose a unique docname."""
        docname = normalize_docname(docname)
        goodname = docname
        i = 1
        while goodname in self.get_bibdoc_names():
            i += 1
            goodname = "%s_%s" % (docname, i)
        return goodname

    def merge_bibdocs(self, docname1, docname2):
        """Merge docname2 into docname1: all the formats of the latest
        version of docname2 are added as new formats to docname1, and
        docname2 is then marked as deleted. This method fails if at least
        one format of docname2 already exists in docname1 (in that case
        both bibdocs are preserved). Comments and descriptions are also
        copied; if docname2 has an icon and docname1 has none, the icon is
        imported, and likewise for a restriction (status)."""
        bibdoc1 = self.get_bibdoc(docname1)
        bibdoc2 = self.get_bibdoc(docname2)
        ## Check that the merge is possible.
        for bibdocfile in bibdoc2.list_latest_files():
            format = bibdocfile.get_format()
            if bibdoc1.format_already_exists_p(format):
                raise InvenioWebSubmitFileError('Format %s already exists in bibdoc %s of record %s. It\'s impossible to merge bibdoc %s into it.' % (format, docname1, self.id, docname2))
        ## Import the icon if needed.
        icon1 = bibdoc1.get_icon()
        icon2 = bibdoc2.get_icon()
        if icon2 is not None and icon1 is None:
            icon = icon2.list_latest_files()[0]
            bibdoc1.add_icon(icon.get_full_path(), format=icon.get_format())
        ## Import the restriction if needed.
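As an aside, the cheap-to-expensive cascade used by check_file_exists above (size, then MD5 checksum, then a byte-by-byte comparison as a final tiebreaker) can be sketched standalone. same_file_p and md5_of below are hypothetical illustrative helpers, not part of bibdocfile.py:

```python
import filecmp
import hashlib
import os

def md5_of(path):
    # Hash the whole file; bibdocfile uses a buffered variant for large files.
    with open(path, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()

def same_file_p(path, candidates):
    """Cheap-to-expensive duplicate check: size, then MD5, then raw bytes."""
    size = os.path.getsize(path)
    potential = [c for c in candidates if os.path.getsize(c) == size]
    if potential:
        checksum = md5_of(path)
        potential = [c for c in potential if md5_of(c) == checksum]
    # Equal size and checksum can still (very rarely) collide, so finish
    # with a byte-by-byte comparison, as check_file_exists does.
    return any(filecmp.cmp(c, path, shallow=False) for c in potential)
```

The size filter discards almost all candidates for free, so the expensive comparisons run on at most a handful of files.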
restriction1 = bibdoc1.get_status() restriction2 = bibdoc2.get_status() if restriction2 and not restriction1: bibdoc1.set_status(restriction2) ## Importing formats for bibdocfile in bibdoc2.list_latest_files(): format = bibdocfile.get_format() comment = bibdocfile.get_comment() description = bibdocfile.get_description() bibdoc1.add_file_new_format(bibdocfile.get_full_path(), description=description, comment=comment, format=format) ## Finally deleting old bibdoc2 bibdoc2.delete() self.build_bibdoc_list() def get_docid(self, docname): """Returns the docid corresponding to the given docname, if the docname is valid. """ for bibdoc in self.bibdocs: if bibdoc.docname == docname: return bibdoc.id raise InvenioWebSubmitFileError, "Recid '%s' is not connected with a " \ "docname '%s'" % (self.id, docname) def get_docname(self, docid): """Returns the docname corresponding to the given docid, if the docid is valid. """ for bibdoc in self.bibdocs: if bibdoc.id == docid: return bibdoc.docname raise InvenioWebSubmitFileError, "Recid '%s' is not connected with a " \ "docid '%s'" % (self.id, docid) def has_docname_p(self, docname): """Return True if a bibdoc with a particular docname belong to this record.""" for bibdoc in self.bibdocs: if bibdoc.docname == docname: return True return False def get_bibdoc(self, docname): """Returns the bibdoc with a particular docname associated with this recid""" for bibdoc in self.bibdocs: if bibdoc.docname == docname: return bibdoc raise InvenioWebSubmitFileError, "Recid '%s' is not connected with " \ " docname '%s'" % (self.id, docname) def delete_bibdoc(self, docname): """Deletes a docname associated with the recid.""" for bibdoc in self.bibdocs: if bibdoc.docname == docname: bibdoc.delete() self.build_bibdoc_list() def add_bibdoc(self, doctype="Main", docname='file', never_fail=False): """Creates a new bibdoc associated with the recid, with a file called docname and a particular doctype. It returns the bibdoc object which was just created. 
        If never_fail is True then the system will always be able to create
        a bibdoc.
        """
        try:
            docname = normalize_docname(docname)
            if never_fail:
                docname = self.propose_unique_docname(docname)
            if docname in self.get_bibdoc_names():
                raise InvenioWebSubmitFileError, "%s has already a bibdoc with docname %s" % (self.id, docname)
            else:
                bibdoc = BibDoc(recid=self.id, doctype=doctype, docname=docname, human_readable=self.human_readable)
                self.build_bibdoc_list()
                return bibdoc
        except Exception, e:
            register_exception()
            raise InvenioWebSubmitFileError(str(e))

    def add_new_file(self, fullpath, doctype="Main", docname=None, never_fail=False, description=None, comment=None, format=None):
        """Adds a new file with the following policy: if docname is not set,
        it is derived from the file name. If a bibdoc with the given docname
        doesn't exist, it is created and the file is added to it. If it
        exists but doesn't yet contain the format being added, the new
        format is added. If the format already exists and never_fail is
        True, a new bibdoc is created with a similar name, suffixed with a
        progressive number, and the file is added to it. The resulting
        bibdoc is returned.
        """
        if docname is None:
            docname = decompose_file(fullpath)[1]
        if format is None:
            format = decompose_file(fullpath)[2]
        docname = normalize_docname(docname)
        try:
            bibdoc = self.get_bibdoc(docname)
        except InvenioWebSubmitFileError:
            # The bibdoc doesn't exist yet: create it.
            bibdoc = self.add_bibdoc(doctype, docname, False)
            bibdoc.add_file_new_version(fullpath, description=description, comment=comment, format=format)
            self.build_bibdoc_list()
        else:
            try:
                bibdoc.add_file_new_format(fullpath, description=description, comment=comment, format=format)
                self.build_bibdoc_list()
            except InvenioWebSubmitFileError, e:
                # The format already exists!
if never_fail: bibdoc = self.add_bibdoc(doctype, docname, True) bibdoc.add_file_new_version(fullpath, description=description, comment=comment, format=format) self.build_bibdoc_list() else: raise e return bibdoc def add_new_version(self, fullpath, docname=None, description=None, comment=None, format=None, hide_previous_versions=False): """Adds a new fullpath file to an already existent docid making the previous files associated with the same bibdocids obsolete. It returns the bibdoc object. """ if docname is None: docname = decompose_file(fullpath)[1] if format is None: format = decompose_file(fullpath)[2] bibdoc = self.get_bibdoc(docname=docname) bibdoc.add_file_new_version(fullpath, description=description, comment=comment, format=format, hide_previous_versions=hide_previous_versions) self.build_bibdoc_list() return bibdoc def add_new_format(self, fullpath, docname=None, description=None, comment=None, format=None): """Adds a new format for a fullpath file to an already existent docid along side already there files. It returns the bibdoc object. """ if docname is None: docname = decompose_file(fullpath)[1] if format is None: format = decompose_file(fullpath)[2] bibdoc = self.get_bibdoc(docname=docname) bibdoc.add_file_new_format(fullpath, description=description, comment=comment, format=format) self.build_bibdoc_list() return bibdoc def list_latest_files(self, doctype=''): """Returns a list which is made up by all the latest docfile of every bibdoc (of a particular doctype). 
""" docfiles = [] for bibdoc in self.list_bibdocs(doctype): docfiles += bibdoc.list_latest_files() return docfiles def display(self, docname="", version="", doctype="", ln=CFG_SITE_LANG, verbose=0, display_hidden=True): """Returns a formatted panel with information and links about a given docid of a particular version (or any), of a particular doctype (or any) """ t = "" if docname: try: bibdocs = [self.get_bibdoc(docname)] except InvenioWebSubmitFileError: bibdocs = self.list_bibdocs(doctype) else: bibdocs = self.list_bibdocs(doctype) if bibdocs: types = list_types_from_array(bibdocs) fulltypes = [] for mytype in types: fulltype = { 'name' : mytype, 'content' : [], } for bibdoc in bibdocs: if mytype == bibdoc.get_type(): fulltype['content'].append(bibdoc.display(version, ln=ln, display_hidden=display_hidden)) fulltypes.append(fulltype) if verbose >= 9: verbose_files = str(self) else: verbose_files = '' t = websubmit_templates.tmpl_bibrecdoc_filelist( ln=ln, types = fulltypes, verbose_files=verbose_files ) return t def fix(self, docname): """Algorithm that transform an a broken/old bibdoc into a coherent one: i.e. the corresponding folder will have files named after the bibdoc name. Proper .recid, .type, .md5 files will be created/updated. In case of more than one file with the same format revision a new bibdoc will be created in order to put does files. Returns the list of newly created bibdocs if any. """ bibdoc = self.get_bibdoc(docname) versions = {} res = [] new_bibdocs = [] # List of files with the same version/format of # existing file which need new bibdoc. counter = 0 zero_version_bug = False if os.path.exists(bibdoc.basedir): for filename in os.listdir(bibdoc.basedir): if filename[0] != '.' and ';' in filename: name, version = filename.split(';') try: version = int(version) except ValueError: # Strange name register_exception() raise InvenioWebSubmitFileError, "A file called %s exists under %s. This is not a valid name. 
After the ';' there must be an integer representing the file revision. Please, manually fix this file either by renaming or by deleting it." % (filename, bibdoc.basedir) if version == 0: zero_version_bug = True format = name[len(file_strip_ext(name)):] format = normalize_format(format) if not versions.has_key(version): versions[version] = {} new_name = 'FIXING-%s-%s' % (str(counter), name) try: shutil.move('%s/%s' % (bibdoc.basedir, filename), '%s/%s' % (bibdoc.basedir, new_name)) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Error in renaming '%s' to '%s': '%s'" % ('%s/%s' % (bibdoc.basedir, filename), '%s/%s' % (bibdoc.basedir, new_name), e) if versions[version].has_key(format): new_bibdocs.append((new_name, version)) else: versions[version][format] = new_name counter += 1 elif filename[0] != '.': # Strange name register_exception() raise InvenioWebSubmitFileError, "A file called %s exists under %s. This is not a valid name. There should be a ';' followed by an integer representing the file revision. Please, manually fix this file either by renaming or by deleting it." 
% (filename, bibdoc.basedir) else: # we create the corresponding storage directory old_umask = os.umask(022) os.makedirs(bibdoc.basedir) # and save the father record id if it exists try: if self.id != "": recid_fd = open("%s/.recid" % bibdoc.basedir, "w") recid_fd.write(str(self.id)) recid_fd.close() if bibdoc.doctype != "": type_fd = open("%s/.type" % bibdoc.basedir, "w") type_fd.write(str(bibdoc.doctype)) type_fd.close() except Exception, e: register_exception() raise InvenioWebSubmitFileError, e os.umask(old_umask) if not versions: bibdoc.delete() else: for version, formats in versions.iteritems(): if zero_version_bug: version += 1 for format, filename in formats.iteritems(): destination = '%s%s;%i' % (docname, format, version) try: shutil.move('%s/%s' % (bibdoc.basedir, filename), '%s/%s' % (bibdoc.basedir, destination)) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Error in renaming '%s' to '%s': '%s'" % ('%s/%s' % (bibdoc.basedir, filename), '%s/%s' % (bibdoc.basedir, destination), e) try: recid_fd = open("%s/.recid" % bibdoc.basedir, "w") recid_fd.write(str(self.id)) recid_fd.close() type_fd = open("%s/.type" % bibdoc.basedir, "w") type_fd.write(str(bibdoc.doctype)) type_fd.close() except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Error in creating .recid and .type file for '%s' folder: '%s'" % (bibdoc.basedir, e) self.build_bibdoc_list() res = [] for (filename, version) in new_bibdocs: if zero_version_bug: version += 1 new_bibdoc = self.add_bibdoc(doctype=bibdoc.doctype, docname=docname, never_fail=True) new_bibdoc.add_file_new_format('%s/%s' % (bibdoc.basedir, filename), version) res.append(new_bibdoc) try: os.remove('%s/%s' % (bibdoc.basedir, filename)) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Error in removing '%s': '%s'" % ('%s/%s' % (bibdoc.basedir, filename), e) Md5Folder(bibdoc.basedir).update(only_new=False) bibdoc._build_file_list() self.build_bibdoc_list() 
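# The fix() algorithm above insists that every stored file be named
# "name;revision", with an integer after the ';'. That convention can be
# checked in isolation; split_versioned_filename is an illustrative helper
# for this sketch, not a function defined in bibdocfile.py:

```python
def split_versioned_filename(filename):
    """Split 'name.ext;revision' into (name, revision).

    Mirrors the convention enforced by fix(): anything after the last ';'
    must be an integer file revision. This is an illustrative helper, not
    part of bibdocfile.py.
    """
    if ';' not in filename:
        raise ValueError("missing ';revision' suffix in %r" % filename)
    name, revision = filename.rsplit(';', 1)
    # int() raises ValueError when the revision is not an integer,
    # matching the "strange name" error path in fix().
    return name, int(revision)
```

Files failing either check are exactly the ones fix() refuses to touch and asks the administrator to rename or delete by hand.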
        for bibdoc in self.bibdocs:
            if not run_sql('SELECT more_info FROM bibdoc WHERE id=%s', (bibdoc.id,)):
                ## Import from MARC only if the bibdoc has never had
                ## its more_info initialized.
                try:
                    bibdoc.import_descriptions_and_comments_from_marc()
                except Exception, e:
                    register_exception()
                    raise InvenioWebSubmitFileError, "Error in importing description and comment from %s for record %s: %s" % (repr(bibdoc), self.id, e)
        return res

    def check_format(self, docname):
        """In case CFG_WEBSUBMIT_ADDITIONAL_KNOWN_FILE_EXTENSIONS is altered
        or the Python version changes, it can happen that a docname contains
        files which are no longer of the form docname + .format ; version,
        simply because the .format is now recognized (and was not before, so
        it was part of the docname). This method verifies whether a fix is
        necessary. Return True if the format is correct, False if a fix is
        needed."""
        bibdoc = self.get_bibdoc(docname)
        correct_docname = decompose_file(docname)[1]
        if docname != correct_docname:
            return False
        for filename in os.listdir(bibdoc.basedir):
            if not filename.startswith('.'):
                try:
                    dummy, dummy, format, version = decompose_file_with_version(filename)
                except:
                    raise InvenioWebSubmitFileError('Incorrect filename "%s" for docname %s for recid %i' % (filename, docname, self.id))
                if '%s%s;%i' % (correct_docname, format, version) != filename:
                    return False
        return True

    def check_duplicate_docnames(self):
        """Check whether the record is connected to at least two bibdocs
        with the same docname. Return True if everything is fine.
""" docnames = set() for docname in self.get_bibdoc_names(): if docname in docnames: return False else: docnames.add(docname) return True def uniformize_bibdoc(self, docname): """This algorithm correct wrong file name belonging to a bibdoc.""" bibdoc = self.get_bibdoc(docname) for filename in os.listdir(bibdoc.basedir): if not filename.startswith('.'): try: dummy, dummy, format, version = decompose_file_with_version(filename) except ValueError: register_exception(alert_admin=True, prefix= "Strange file '%s' is stored in %s" % (filename, bibdoc.basedir)) else: os.rename(os.path.join(bibdoc.basedir, filename), os.path.join(bibdoc.basedir, '%s%s;%i' % (docname, format, version))) Md5Folder(bibdoc.basedir).update() bibdoc.touch() bibdoc._build_file_list('rename') def fix_format(self, docname, skip_check=False): """ Fixing this situation require different steps, because docname might already exists. This algorithm try to fix this situation. In case a merging is needed the algorithm return False if the merging is not possible. """ if not skip_check: if self.check_format(docname): return True bibdoc = self.get_bibdoc(docname) correct_docname = decompose_file(docname)[1] need_merge = False if correct_docname != docname: need_merge = self.has_docname_p(correct_docname) if need_merge: proposed_docname = self.propose_unique_docname(correct_docname) run_sql('UPDATE bibdoc SET docname=%s WHERE id=%s', (proposed_docname, bibdoc.id)) self.build_bibdoc_list() self.uniformize_bibdoc(proposed_docname) try: self.merge_bibdocs(docname, proposed_docname) except InvenioWebSubmitFileError: return False else: run_sql('UPDATE bibdoc SET docname=%s WHERE id=%s', (correct_docname, bibdoc.id)) self.build_bibdoc_list() self.uniformize_bibdoc(correct_docname) else: self.uniformize_bibdoc(docname) return True def fix_duplicate_docnames(self, skip_check=False): """Algotirthm to fix duplicate docnames. 
If a record is connected with at least two bibdoc having the same docname, the algorithm will try to merge them. """ if not skip_check: if self.check_duplicate_docnames(): return docnames = set() for bibdoc in self.list_bibdocs(): docname = bibdoc.docname if docname in docnames: new_docname = self.propose_unique_docname(bibdoc.docname) bibdoc.change_name(new_docname) self.merge_bibdocs(docname, new_docname) docnames.add(docname) class BibDoc: """this class represents one file attached to a record there is a one to one mapping between an instance of this class and an entry in the bibdoc db table""" def __init__ (self, docid="", recid="", docname="file", doctype="Main", human_readable=False): """Constructor of a bibdoc. At least the docid or the recid/docname pair is needed.""" # docid is known, the document already exists docname = normalize_docname(docname) self.docfiles = [] self.md5s = None self.related_files = [] self.human_readable = human_readable if docid != "": if recid == "": recid = None self.doctype = "" res = run_sql("select id_bibrec,type from bibrec_bibdoc " "where id_bibdoc=%s", (docid,)) if len(res) > 0: recid = res[0][0] self.doctype = res[0][1] else: res = run_sql("select id_bibdoc1 from bibdoc_bibdoc " "where id_bibdoc2=%s", (docid,)) if len(res) > 0 : main_bibdoc = res[0][0] res = run_sql("select id_bibrec,type from bibrec_bibdoc " "where id_bibdoc=%s", (main_bibdoc,)) if len(res) > 0: recid = res[0][0] self.doctype = res[0][1] else: res = run_sql("select type from bibrec_bibdoc " "where id_bibrec=%s and id_bibdoc=%s", (recid, docid,)) if len(res) > 0: self.doctype = res[0][0] else: #this bibdoc isn't associated with the corresponding bibrec. 
raise InvenioWebSubmitFileError, "No docid associated with the recid %s" % recid # gather the other information res = run_sql("select id,status,docname,creation_date," "modification_date,more_info from bibdoc where id=%s", (docid,)) if len(res) > 0: self.cd = res[0][3] self.md = res[0][4] self.recid = recid self.docname = res[0][2] self.id = docid self.status = res[0][1] self.more_info = BibDocMoreInfo(docid, blob_to_string(res[0][5])) self.basedir = _make_base_dir(self.id) else: # this bibdoc doesn't exist raise InvenioWebSubmitFileError, "The docid %s does not exist." % docid # else it is a new document else: if docname == "" or doctype == "": raise InvenioWebSubmitFileError, "Argument missing for creating a new bibdoc" else: self.recid = recid self.doctype = doctype self.docname = docname self.status = '' if recid: res = run_sql("SELECT b.id FROM bibrec_bibdoc bb JOIN bibdoc b on bb.id_bibdoc=b.id WHERE bb.id_bibrec=%s AND b.docname=%s", (recid, docname)) if res: raise InvenioWebSubmitFileError, "A bibdoc called %s already exists for recid %s" % (docname, recid) self.id = run_sql("INSERT INTO bibdoc (status,docname,creation_date,modification_date) " "values(%s,%s,NOW(),NOW())", (self.status, docname)) if self.id is not None: # we link the document to the record if a recid was # specified self.more_info = BibDocMoreInfo(self.id) res = run_sql("SELECT creation_date, modification_date FROM bibdoc WHERE id=%s", (self.id,)) self.cd = res[0][0] self.md = res[0][0] else: raise InvenioWebSubmitFileError, "New docid cannot be created" try: self.basedir = _make_base_dir(self.id) # we create the corresponding storage directory if not os.path.exists(self.basedir): old_umask = os.umask(022) os.makedirs(self.basedir) # and save the father record id if it exists try: if self.recid != "": recid_fd = open("%s/.recid" % self.basedir, "w") recid_fd.write(str(self.recid)) recid_fd.close() if self.doctype != "": type_fd = open("%s/.type" % self.basedir, "w") 
type_fd.write(str(self.doctype)) type_fd.close() except Exception, e: register_exception() raise InvenioWebSubmitFileError, e os.umask(old_umask) if self.recid != "": run_sql("INSERT INTO bibrec_bibdoc (id_bibrec, id_bibdoc, type) VALUES (%s,%s,%s)", (recid, self.id, self.doctype,)) except Exception, e: run_sql('DELETE FROM bibdoc WHERE id=%s', (self.id, )) run_sql('DELETE FROM bibrec_bibdoc WHERE id_bibdoc=%s', (self.id, )) register_exception() raise InvenioWebSubmitFileError, e # build list of attached files self._build_file_list('init') # link with related_files self._build_related_file_list() def __repr__(self): return 'BibDoc(%s, %s, %s, %s)' % (repr(self.id), repr(self.recid), repr(self.docname), repr(self.doctype)) def __str__(self): out = '%s:%i:::docname=%s\n' % (self.recid or '', self.id, self.docname) out += '%s:%i:::doctype=%s\n' % (self.recid or '', self.id, self.doctype) out += '%s:%i:::status=%s\n' % (self.recid or '', self.id, self.status) out += '%s:%i:::basedir=%s\n' % (self.recid or '', self.id, self.basedir) out += '%s:%i:::creation date=%s\n' % (self.recid or '', self.id, self.cd) out += '%s:%i:::modification date=%s\n' % (self.recid or '', self.id, self.md) out += '%s:%i:::total file attached=%s\n' % (self.recid or '', self.id, len(self.docfiles)) if self.human_readable: out += '%s:%i:::total size latest version=%s\n' % (self.recid or '', self.id, nice_size(self.get_total_size_latest_version())) out += '%s:%i:::total size all files=%s\n' % (self.recid or '', self.id, nice_size(self.get_total_size())) else: out += '%s:%i:::total size latest version=%s\n' % (self.recid or '', self.id, self.get_total_size_latest_version()) out += '%s:%i:::total size all files=%s\n' % (self.recid or '', self.id, self.get_total_size()) for docfile in self.docfiles: out += str(docfile) icon = self.get_icon() if icon: out += str(self.get_icon()) return out def format_already_exists_p(self, format): """Return True if the given format already exists among the latest 
files.""" format = normalize_format(format) for afile in self.list_latest_files(): if format == afile.get_format(): return True return False def get_status(self): """Retrieve the status.""" return self.status def touch(self): """Update the modification time of the bibdoc.""" run_sql('UPDATE bibdoc SET modification_date=NOW() WHERE id=%s', (self.id, )) #if self.recid: #run_sql('UPDATE bibrec SET modification_date=NOW() WHERE id=%s', (self.recid, )) def set_status(self, new_status): """Set a new status.""" if new_status != KEEP_OLD_VALUE: if new_status == 'DELETED': raise InvenioWebSubmitFileError('DELETED is a reserved word and can not be used for setting the status') run_sql('UPDATE bibdoc SET status=%s WHERE id=%s', (new_status, self.id)) self.status = new_status self.touch() self._build_file_list() self._build_related_file_list() def add_file_new_version(self, filename, description=None, comment=None, format=None, hide_previous_versions=False): """Add a new version of a file.""" try: latestVersion = self.get_latest_version() if latestVersion == 0: myversion = 1 else: myversion = latestVersion + 1 if os.path.exists(filename): if not os.path.getsize(filename) > 0: raise InvenioWebSubmitFileError, "%s seems to be empty" % filename if format is None: format = decompose_file(filename)[2] destination = "%s/%s%s;%i" % (self.basedir, self.docname, format, myversion) try: shutil.copyfile(filename, destination) os.chmod(destination, 0644) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while copying '%s' to '%s': '%s'" % (filename, destination, e) self.more_info.set_description(description, format, myversion) self.more_info.set_comment(comment, format, myversion) for afile in self.list_all_files(): format = afile.get_format() version = afile.get_version() if version < myversion: self.more_info.set_hidden(hide_previous_versions, format, myversion) else: raise InvenioWebSubmitFileError, "'%s' does not exists!" 
% filename finally: self.touch() Md5Folder(self.basedir).update() self._build_file_list() def purge(self): """Phisically Remove all the previous version of the given bibdoc""" version = self.get_latest_version() if version > 1: for afile in self.docfiles: if afile.get_version() < version: self.more_info.unset_comment(afile.get_format(), afile.get_version()) self.more_info.unset_description(afile.get_format(), afile.get_version()) self.more_info.unset_hidden(afile.get_format(), afile.get_version()) try: os.remove(afile.get_full_path()) except Exception, e: register_exception() Md5Folder(self.basedir).update() self.touch() self._build_file_list() def expunge(self): """Phisically remove all the traces of a given bibdoc note that you should not use any more this object or unpredictable things will happen.""" del self.md5s del self.more_info os.system('rm -rf %s' % escape_shell_arg(self.basedir)) run_sql('DELETE FROM bibrec_bibdoc WHERE id_bibdoc=%s', (self.id, )) run_sql('DELETE FROM bibdoc_bibdoc WHERE id_bibdoc1=%s OR id_bibdoc2=%s', (self.id, self.id)) run_sql('DELETE FROM bibdoc WHERE id=%s', (self.id, )) run_sql('INSERT DELAYED INTO hstDOCUMENT(action, id_bibdoc, docname, doctimestamp) VALUES("EXPUNGE", %s, %s, NOW())', (self.id, self.docname)) del self.docfiles del self.id del self.cd del self.md del self.basedir del self.recid del self.doctype del self.docname def revert(self, version): """Revert to a given version by copying its differnt formats to a new version.""" try: version = int(version) new_version = self.get_latest_version() + 1 for docfile in self.list_version_files(version): destination = "%s/%s%s;%i" % (self.basedir, self.docname, docfile.get_format(), new_version) if os.path.exists(destination): raise InvenioWebSubmitFileError, "A file for docname '%s' for the recid '%s' already exists for the format '%s'" % (self.docname, self.recid, docfile.get_format()) try: shutil.copyfile(docfile.get_full_path(), destination) os.chmod(destination, 0644) 
self.more_info.set_comment(self.more_info.get_comment(docfile.get_format(), version), docfile.get_format(), new_version) self.more_info.set_description(self.more_info.get_description(docfile.get_format(), version), docfile.get_format(), new_version) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while copying '%s' to '%s': '%s'" % (docfile.get_full_path(), destination, e) finally: Md5Folder(self.basedir).update() self.touch() self._build_file_list() def import_descriptions_and_comments_from_marc(self, record=None): """Import description & comment from the corresponding marc. if record is passed it is directly used, otherwise it is calculated after the xm stored in the database.""" ## Let's get the record from invenio.search_engine import get_record if record is None: record = get_record(self.id) fields = record_get_field_instances(record, '856', '4', ' ') global_comment = None global_description = None local_comment = {} local_description = {} for field in fields: url = field_get_subfield_values(field, 'u') if url: ## Given a url url = url[0] if url == '%s/record/%s/files/' % (CFG_SITE_URL, self.recid): ## If it is a traditional /record/1/files/ one ## We have global description/comment for all the formats description = field_get_subfield_values(field, 'y') if description: global_description = description[0] comment = field_get_subfield_values(field, 'z') if comment: global_comment = comment[0] elif bibdocfile_url_p(url): ## Otherwise we have description/comment per format dummy, docname, format = decompose_bibdocfile_url(url) if docname == self.docname: description = field_get_subfield_values(field, 'y') if description: local_description[format] = description[0] comment = field_get_subfield_values(field, 'z') if comment: local_comment[format] = comment[0] ## Let's update the tables version = self.get_latest_version() for docfile in self.list_latest_files(): format = docfile.get_format() if format in 
local_comment: self.set_comment(local_comment[format], format, version) else: self.set_comment(global_comment, format, version) if format in local_description: self.set_description(local_description[format], format, version) else: self.set_description(global_description, format, version) self._build_file_list('init') def add_file_new_format(self, filename, version=None, description=None, comment=None, format=None): """add a new format of a file to an archive""" try: if version is None: version = self.get_latest_version() if version == 0: version = 1 if os.path.exists(filename): if not os.path.getsize(filename) > 0: raise InvenioWebSubmitFileError, "%s seems to be empty" % filename if format is None: format = decompose_file(filename)[2] destination = "%s/%s%s;%i" % (self.basedir, self.docname, format, version) if os.path.exists(destination): raise InvenioWebSubmitFileError, "A file for docname '%s' for the recid '%s' already exists for the format '%s'" % (self.docname, self.recid, format) try: shutil.copyfile(filename, destination) os.chmod(destination, 0644) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while copying '%s' to '%s': '%s'" % (filename, destination, e) self.more_info.set_comment(comment, format, version) self.more_info.set_description(description, format, version) else: raise InvenioWebSubmitFileError, "'%s' does not exists!" % filename finally: Md5Folder(self.basedir).update() self.touch() self._build_file_list() def get_icon(self): """Returns the bibdoc corresponding to an icon of the given bibdoc.""" if self.related_files.has_key('Icon'): return self.related_files['Icon'][0] else: return None def add_icon(self, filename, basename=None, format=None): """Links an icon with the bibdoc object. 
Return the icon bibdoc""" #first check if an icon already exists existing_icon = self.get_icon() if existing_icon is not None: existing_icon.delete() #then add the new one if basename is None: basename = 'icon-%s' % self.docname if format is None: format = decompose_file(filename)[2] newicon = BibDoc(doctype='Icon', docname=basename, human_readable=self.human_readable) newicon.add_file_new_version(filename, format=format) try: try: old_umask = os.umask(022) recid_fd = open("%s/.docid" % newicon.get_base_dir(), "w") recid_fd.write(str(self.id)) recid_fd.close() type_fd = open("%s/.type" % newicon.get_base_dir(), "w") type_fd.write(str(self.doctype)) type_fd.close() os.umask(old_umask) run_sql("INSERT INTO bibdoc_bibdoc (id_bibdoc1, id_bibdoc2, type) VALUES (%s,%s,'Icon')", (self.id, newicon.get_id(),)) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while writing .docid and .doctype for folder '%s': '%s'" % (newicon.get_base_dir(), e) finally: Md5Folder(newicon.basedir).update() self.touch() self._build_related_file_list() return newicon def delete_icon(self): """Removes the current icon if it exists.""" existing_icon = self.get_icon() if existing_icon is not None: existing_icon.delete() self.touch() self._build_related_file_list() def display(self, version="", ln=CFG_SITE_LANG, display_hidden=True): """Returns a formatted representation of the files linked with the bibdoc. 
""" t = "" if version == "all": docfiles = self.list_all_files(list_hidden=display_hidden) elif version != "": version = int(version) docfiles = self.list_version_files(version, list_hidden=display_hidden) else: docfiles = self.list_latest_files() existing_icon = self.get_icon() if existing_icon is not None: existing_icon = existing_icon.list_all_files()[0] imageurl = "%s/record/%s/files/%s" % \ (CFG_SITE_URL, self.recid, urllib.quote(existing_icon.get_full_name())) else: imageurl = "%s/img/smallfiles.gif" % CFG_SITE_URL versions = [] for version in list_versions_from_array(docfiles): currversion = { 'version' : version, 'previous' : 0, 'content' : [] } if version == self.get_latest_version() and version != 1: currversion['previous'] = 1 for docfile in docfiles: if docfile.get_version() == version: currversion['content'].append(docfile.display(ln = ln)) versions.append(currversion) t = websubmit_templates.tmpl_bibdoc_filelist( ln = ln, versions = versions, imageurl = imageurl, docname = self.docname, recid = self.recid ) return t def change_name(self, newname): """Rename the bibdoc name. 
New name must not be already used by the linked bibrecs.""" try: newname = normalize_docname(newname) res = run_sql("SELECT b.id FROM bibrec_bibdoc bb JOIN bibdoc b on bb.id_bibdoc=b.id WHERE bb.id_bibrec=%s AND b.docname=%s", (self.recid, newname)) if res: raise InvenioWebSubmitFileError, "A bibdoc called %s already exists for recid %s" % (newname, self.recid) try: for f in os.listdir(self.basedir): if not f.startswith('.'): try: (dummy, base, extension, version) = decompose_file_with_version(f) except ValueError: register_exception(alert_admin=True, prefix="Strange file '%s' is stored in %s" % (f, self.basedir)) else: shutil.move(os.path.join(self.basedir, f), os.path.join(self.basedir, '%s%s;%i' % (newname, extension, version))) except Exception, e: register_exception() raise InvenioWebSubmitFileError("Error in renaming the bibdoc %s to %s for recid %s: %s" % (self.docname, newname, self.recid, e)) run_sql("update bibdoc set docname=%s where id=%s", (newname, self.id,)) self.docname = newname finally: Md5Folder(self.basedir).update() self.touch() self._build_file_list('rename') self._build_related_file_list() def set_comment(self, comment, format, version=None): """Update the comment of a format/version.""" if version is None: version = self.get_latest_version() self.more_info.set_comment(comment, format, version) self.touch() self._build_file_list('init') def set_description(self, description, format, version=None): """Update the description of a format/version.""" if version is None: version = self.get_latest_version() self.more_info.set_description(description, format, version) self.touch() self._build_file_list('init') def set_hidden(self, hidden, format, version=None): """Update the hidden flag for format/version.""" if version is None: version = self.get_latest_version() self.more_info.set_hidden(hidden, format, version) self.touch() self._build_file_list('init') def get_comment(self, format, version=None): """Get a comment for a given format/version.""" 
if version is None: version = self.get_latest_version() return self.more_info.get_comment(format, version) def get_description(self, format, version=None): """Get a description for a given format/version.""" if version is None: version = self.get_latest_version() return self.more_info.get_description(format, version) def hidden_p(self, format, version=None): """Is the format/version hidden?""" if version is None: version = self.get_latest_version() return self.more_info.hidden_p(format, version) def icon_p(self): """Return True if this bibdoc correspond to an icon which is linked to another bibdoc.""" return run_sql("SELECT count(id_bibdoc2) FROM bibdoc_bibdoc WHERE id_bibdoc2=%s AND type='Icon'", (self.id, ))[0][0] > 0 def get_docname(self): """retrieve bibdoc name""" return self.docname def get_base_dir(self): """retrieve bibdoc base directory, e.g. /soft/cdsweb/var/data/files/123""" return self.basedir def get_type(self): """retrieve bibdoc doctype""" return self.doctype def get_recid(self): """retrieve bibdoc recid""" return self.recid def get_id(self): """retrieve bibdoc id""" return self.id def get_file(self, format, version=""): """Return a DocFile with docname name, with format (the extension), and with the given version. 
""" if version == "": docfiles = self.list_latest_files() else: version = int(version) docfiles = self.list_version_files(version) format = normalize_format(format) for docfile in docfiles: if (docfile.get_format()==format or not format): return docfile raise InvenioWebSubmitFileError, "No file called '%s' of format '%s', version '%s'" % (self.docname, format, version) def list_versions(self): """Returns the list of existing version numbers for a given bibdoc.""" versions = [] for docfile in self.docfiles: if not docfile.get_version() in versions: versions.append(docfile.get_version()) return versions def delete(self): """delete the current bibdoc instance.""" try: today = datetime.today() self.change_name('DELETED-%s%s-%s' % (today.strftime('%Y%m%d%H%M%S'), today.microsecond, self.docname)) run_sql("UPDATE bibdoc SET status='DELETED' WHERE id=%s", (self.id,)) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "It's impossible to delete bibdoc %s: %s" % (self.id, e) def deleted_p(self): """Return True if the bibdoc has been deleted.""" return self.status == 'DELETED' def empty_p(self): """Return True if the bibdoc is empty, i.e. it has no bibdocfile connected.""" return len(self.docfiles) == 0 def undelete(self, previous_status=''): """undelete a deleted file (only if it was actually deleted). The previous status, i.e. the restriction key can be provided. 
        Otherwise the bibdoc will be public."""
        bibrecdocs = BibRecDocs(self.recid)
        try:
            run_sql("UPDATE bibdoc SET status=%s WHERE id=%s AND status='DELETED'", (previous_status, self.id))
        except Exception, e:
            raise InvenioWebSubmitFileError, "It's impossible to undelete bibdoc %s: %s" % (self.id, e)
        if self.docname.startswith('DELETED-'):
            try:
                # Let's remove DELETED-20080214144322- in front of the docname
                original_name = '-'.join(self.docname.split('-')[2:])
                original_name = bibrecdocs.propose_unique_docname(original_name)
                self.change_name(original_name)
            except Exception, e:
                raise InvenioWebSubmitFileError, "It's impossible to restore the previous docname %s. %s kept as docname because: %s" % (original_name, self.docname, e)
        else:
            raise InvenioWebSubmitFileError, "Strange: the just undeleted docname is not called DELETED-somedate-docname but %s" % self.docname

    def delete_file(self, format, version):
        """Delete on the filesystem the particular format version.
        Note, this operation is not reversible!"""
        try:
            afile = self.get_file(format, version)
        except InvenioWebSubmitFileError:
            return
        try:
            os.remove(afile.get_full_path())
        except OSError:
            pass
        self.touch()
        self._build_file_list()

    def get_history(self):
        """Return a list of strings, one per row in the history for the
        given docid."""
        ret = []
        hst = run_sql("""SELECT action, docname, docformat, docversion,
                docsize, docchecksum, doctimestamp FROM hstDOCUMENT
                WHERE id_bibdoc=%s ORDER BY doctimestamp ASC""", (self.id, ))
        for row in hst:
            ret.append("%s %s '%s', format: '%s', version: %i, size: %s, checksum: '%s'" % (row[6].strftime('%Y-%m-%d %H:%M:%S'), row[0], row[1], row[2], row[3], nice_size(row[4]), row[5]))
        return ret

    def _build_file_list(self, context=''):
        """Lists all files attached to the bibdoc. This function should be
        called every time the bibdoc is modified.
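As `delete()` and `undelete()` show, deletion renames the document to `DELETED-<YYYYmmddHHMMSS><microseconds>-<docname>`, and undeletion recovers the original name by dropping the first two dash-separated components. A standalone sketch of that naming round trip (illustrative helper names, not part of the module):

```python
from datetime import datetime

def deleted_name(docname, now=None):
    """Build the name used by delete(): DELETED-<timestamp><usec>-<docname>."""
    today = now or datetime.today()
    return 'DELETED-%s%s-%s' % (today.strftime('%Y%m%d%H%M%S'), today.microsecond, docname)

def original_name(deleted_docname):
    """Recover the original docname, as undelete() does with
    '-'.join(docname.split('-')[2:])."""
    return '-'.join(deleted_docname.split('-')[2:])

# The join (rather than split()[-1]) matters for docnames containing dashes:
name = deleted_name('my-thesis', datetime(2008, 2, 14, 14, 43, 22, 123))
# name == 'DELETED-20080214144322123-my-thesis'
# original_name(name) == 'my-thesis'
```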
        As a side effect it logs everything that has happened to the
        bibdocfiles in the log facility, according to the context:
        "init": means that the function has been called for the first time
            by a constructor, hence no logging is performed;
        "": by default means to log every deleted file as deleted and
            every added file as added;
        "rename": means that every apparently deleted file is logged as
            RENAMEDFROM and every new file as RENAMEDTO.
        """
        def log_action(action, docid, docname, format, version, size, checksum, timestamp=''):
            """Log an action into the hstDOCUMENT table."""
            try:
                if timestamp:
                    run_sql('INSERT DELAYED INTO hstDOCUMENT(action, id_bibdoc, docname, docformat, docversion, docsize, docchecksum, doctimestamp) VALUES(%s, %s, %s, %s, %s, %s, %s, %s)', (action, docid, docname, format, version, size, checksum, timestamp))
                else:
                    run_sql('INSERT DELAYED INTO hstDOCUMENT(action, id_bibdoc, docname, docformat, docversion, docsize, docchecksum, doctimestamp) VALUES(%s, %s, %s, %s, %s, %s, %s, NOW())', (action, docid, docname, format, version, size, checksum))
            except DatabaseError:
                register_exception()

        def make_removed_added_bibdocfiles(previous_file_list):
            """Internal function to build the log of changed files."""
            # Let's rebuild the previous situation
            old_files = {}
            for bibdocfile in previous_file_list:
                old_files[(bibdocfile.name, bibdocfile.format, bibdocfile.version)] = (bibdocfile.size, bibdocfile.checksum, bibdocfile.md)
            # Let's rebuild the new situation
            new_files = {}
            for bibdocfile in self.docfiles:
                new_files[(bibdocfile.name, bibdocfile.format, bibdocfile.version)] = (bibdocfile.size, bibdocfile.checksum, bibdocfile.md)
            # Let's subtract from the added files all the files that are
            # present in the old list, and let's add to the deleted files
            # those that are not present in the new list.
added_files = dict(new_files) deleted_files = {} for key, value in old_files.iteritems(): if added_files.has_key(key): del added_files[key] else: deleted_files[key] = value return (added_files, deleted_files) if context != 'init': previous_file_list = list(self.docfiles) self.docfiles = [] if os.path.exists(self.basedir): self.md5s = Md5Folder(self.basedir) files = os.listdir(self.basedir) files.sort() for afile in files: if not afile.startswith('.'): try: filepath = os.path.join(self.basedir, afile) fileversion = int(re.sub(".*;", "", afile)) fullname = afile.replace(";%s" % fileversion, "") checksum = self.md5s.get_checksum(afile) (dirname, basename, format) = decompose_file(fullname) comment = self.more_info.get_comment(format, fileversion) description = self.more_info.get_description(format, fileversion) hidden = self.more_info.hidden_p(format, fileversion) # we can append file: self.docfiles.append(BibDocFile(filepath, self.doctype, fileversion, basename, format, self.recid, self.id, self.status, checksum, description, comment, hidden, human_readable=self.human_readable)) except Exception, e: register_exception() if context == 'init': return else: added_files, deleted_files = make_removed_added_bibdocfiles(previous_file_list) deletedstr = "DELETED" addedstr = "ADDED" if context == 'rename': deletedstr = "RENAMEDFROM" addedstr = "RENAMEDTO" for (docname, format, version), (size, checksum, md) in added_files.iteritems(): if context == 'rename': md = '' # No modification time log_action(addedstr, self.id, docname, format, version, size, checksum, md) for (docname, format, version), (size, checksum, md) in deleted_files.iteritems(): if context == 'rename': md = '' # No modification time log_action(deletedstr, self.id, docname, format, version, size, checksum, md) def _build_related_file_list(self): """Lists all files attached to the bibdoc. This function should be called everytime the bibdoc is modified within e.g. its icon. 
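`make_removed_added_bibdocfiles()` above diffs two mappings keyed by `(name, format, version)` to decide what to log as added and deleted. The same bookkeeping as a standalone sketch (illustrative only, written in dict-comprehension style rather than the module's own idiom):

```python
def diff_file_maps(old_files, new_files):
    """Return (added, deleted): entries only in new_files, and entries
    only in old_files. Keys are (name, format, version) tuples; values
    are (size, checksum, mtime) tuples, as in the code above."""
    added = {k: v for k, v in new_files.items() if k not in old_files}
    deleted = {k: v for k, v in old_files.items() if k not in new_files}
    return added, deleted

# A new version of thesis.pdf appears between two snapshots:
old = {('thesis', '.pdf', 1): (1000, 'aaa', None)}
new = {('thesis', '.pdf', 1): (1000, 'aaa', None),
       ('thesis', '.pdf', 2): (2000, 'bbb', None)}
added, deleted = diff_file_maps(old, new)
# added == {('thesis', '.pdf', 2): (2000, 'bbb', None)}; deleted == {}
```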
""" self.related_files = {} res = run_sql("SELECT ln.id_bibdoc2,ln.type,bibdoc.status FROM " "bibdoc_bibdoc AS ln,bibdoc WHERE id=ln.id_bibdoc2 AND " "ln.id_bibdoc1=%s", (self.id,)) for row in res: docid = row[0] doctype = row[1] if row[2] != 'DELETED': if not self.related_files.has_key(doctype): self.related_files[doctype] = [] cur_doc = BibDoc(docid=docid, human_readable=self.human_readable) self.related_files[doctype].append(cur_doc) def get_total_size_latest_version(self): """Return the total size used on disk of all the files belonging to this bibdoc and corresponding to the latest version.""" ret = 0 for bibdocfile in self.list_latest_files(): ret += bibdocfile.get_size() return ret def get_total_size(self): """Return the total size used on disk of all the files belonging to this bibdoc.""" ret = 0 for bibdocfile in self.list_all_files(): ret += bibdocfile.get_size() return ret def list_all_files(self, list_hidden=True): """Returns all the docfiles linked with the given bibdoc.""" if list_hidden: return self.docfiles else: return [afile for afile in self.docfiles if not afile.hidden_p()] def list_latest_files(self): """Returns all the docfiles within the last version.""" return self.list_version_files(self.get_latest_version()) def list_version_files(self, version, list_hidden=True): """Return all the docfiles of a particular version.""" version = int(version) return [docfile for docfile in self.docfiles if docfile.get_version() == version and (list_hidden or not docfile.hidden_p)] def get_latest_version(self): """ Returns the latest existing version number for the given bibdoc. If no file is associated to this bibdoc, returns '0'. 
""" version = 0 for bibdocfile in self.docfiles: if bibdocfile.get_version() > version: version = bibdocfile.get_version() return version def get_file_number(self): """Return the total number of files.""" return len(self.docfiles) def register_download(self, ip_address, version, format, userid=0): """Register the information about a download of a particular file.""" format = normalize_format(format) if format[:1] == '.': format = format[1:] format = format.upper() return run_sql("INSERT INTO rnkDOWNLOADS " "(id_bibrec,id_bibdoc,file_version,file_format," "id_user,client_host,download_time) VALUES " "(%s,%s,%s,%s,%s,INET_ATON(%s),NOW())", (self.recid, self.id, version, format, userid, ip_address,)) class BibDocFile: """This class represents a physical file in the CDS Invenio filesystem. It should never be instantiated directly""" def __init__(self, fullpath, doctype, version, name, format, recid, docid, status, checksum, description=None, comment=None, hidden=False, human_readable=False): self.fullpath = fullpath self.doctype = doctype self.docid = docid self.recid = recid self.version = version self.status = status self.checksum = checksum self.description = description self.comment = comment self.hidden = hidden self.human_readable = human_readable self.size = os.path.getsize(fullpath) self.md = datetime.fromtimestamp(os.path.getmtime(fullpath)) try: self.cd = datetime.fromtimestamp(os.path.getctime(fullpath)) except OSError: self.cd = self.md self.name = name self.format = normalize_format(format) self.dir = os.path.dirname(fullpath) self.url = '%s/record/%s/files/%s%s' % (CFG_SITE_URL, self.recid, urllib.quote(self.name), urllib.quote(self.format)) self.fullurl = '%s?version=%s' % (self.url, self.version) self.etag = '"%i%s%i"' % (self.docid, self.format, self.version) if format == "": self.mime = "application/octet-stream" self.encoding = "" self.fullname = name else: self.fullname = "%s%s" % (name, self.format) (self.mime, self.encoding) = 
_mimes.guess_type(self.fullname) if self.mime is None: self.mime = "application/octet-stream" self.magic = None def __repr__(self): return 'BibDocFile(%s, %s, %i, %s, %s, %i, %i, %s, %s, %s, %s, %s, %s)' % (repr(self.fullpath), repr(self.doctype), self.version, repr(self.name), repr(self.format), self.recid, self.docid, repr(self.status), repr(self.checksum), repr(self.description), repr(self.comment), repr(self.hidden), repr(self.human_readable)) def __str__(self): out = '%s:%s:%s:%s:fullpath=%s\n' % (self.recid, self.docid, self.version, self.format, self.fullpath) out += '%s:%s:%s:%s:fullname=%s\n' % (self.recid, self.docid, self.version, self.format, self.fullname) out += '%s:%s:%s:%s:name=%s\n' % (self.recid, self.docid, self.version, self.format, self.name) out += '%s:%s:%s:%s:status=%s\n' % (self.recid, self.docid, self.version, self.format, self.status) out += '%s:%s:%s:%s:checksum=%s\n' % (self.recid, self.docid, self.version, self.format, self.checksum) if self.human_readable: out += '%s:%s:%s:%s:size=%s\n' % (self.recid, self.docid, self.version, self.format, nice_size(self.size)) else: out += '%s:%s:%s:%s:size=%s\n' % (self.recid, self.docid, self.version, self.format, self.size) out += '%s:%s:%s:%s:creation time=%s\n' % (self.recid, self.docid, self.version, self.format, self.cd) out += '%s:%s:%s:%s:modification time=%s\n' % (self.recid, self.docid, self.version, self.format, self.md) out += '%s:%s:%s:%s:magic=%s\n' % (self.recid, self.docid, self.version, self.format, self.get_magic()) out += '%s:%s:%s:%s:mime=%s\n' % (self.recid, self.docid, self.version, self.format, self.mime) out += '%s:%s:%s:%s:encoding=%s\n' % (self.recid, self.docid, self.version, self.format, self.encoding) out += '%s:%s:%s:%s:url=%s\n' % (self.recid, self.docid, self.version, self.format, self.url) out += '%s:%s:%s:%s:fullurl=%s\n' % (self.recid, self.docid, self.version, self.format, self.fullurl) out += '%s:%s:%s:%s:description=%s\n' % (self.recid, self.docid, self.version, 
                                              self.format, self.description)
        out += '%s:%s:%s:%s:comment=%s\n' % (self.recid, self.docid, self.version, self.format, self.comment)
        out += '%s:%s:%s:%s:hidden=%s\n' % (self.recid, self.docid, self.version, self.format, self.hidden)
        out += '%s:%s:%s:%s:etag=%s\n' % (self.recid, self.docid, self.version, self.format, self.etag)
        return out

    def display(self, ln = CFG_SITE_LANG):
        """Returns a formatted representation of this docfile."""
        return websubmit_templates.tmpl_bibdocfile_filelist(
                 ln = ln,
                 recid = self.recid,
                 version = self.version,
                 name = self.name,
                 format = self.format,
                 size = self.size,
                 description = self.description or ''
               )

    def is_restricted(self, req):
        """Returns restriction state. (see acc_authorize_action return values)"""
        if self.status not in ('', 'DELETED'):
            return acc_authorize_action(req, 'viewrestrdoc', status=self.status)
        elif self.status == 'DELETED':
            return (1, 'File has been deleted')
        else:
            return (0, '')

    def hidden_p(self):
        return self.hidden

    def get_url(self):
        return self.url

    def get_type(self):
        return self.doctype

    def get_path(self):
        return self.fullpath

    def get_bibdocid(self):
        return self.docid

    def get_name(self):
        return self.name

    def get_full_name(self):
        return self.fullname

    def get_full_path(self):
        return self.fullpath

    def get_format(self):
        return self.format

    def get_size(self):
        return self.size

    def get_version(self):
        return self.version

    def get_checksum(self):
        return self.checksum

    def get_description(self):
        return self.description

    def get_comment(self):
        return self.comment

    def get_content(self):
        """Returns the binary content of the file."""
        content_fd = open(self.fullpath, 'rb')
        content = content_fd.read()
        content_fd.close()
        return content

    def get_recid(self):
        """Returns the recid connected with the bibdoc of this file."""
        return self.recid

    def get_status(self):
        """Returns the status of the file, i.e.
either '', 'DELETED' or a restriction keyword.""" return self.status def get_magic(self): """Return all the possible guesses from the magic library about the content of the file.""" if self.magic is None and CFG_HAS_MAGIC: magic_cookies = get_magic_cookies() magic_result = [] for key in magic_cookies.keys(): magic_result.append(magic_cookies[key].file(self.fullpath)) self.magic = tuple(magic_result) return self.magic def check(self): """Return True if the checksum corresponds to the file.""" return calculate_md5(self.fullpath) == self.checksum def stream(self, req): """Stream the file.""" if self.status: (auth_code, auth_message) = acc_authorize_action(req, 'viewrestrdoc', status=self.status) else: auth_code = 0 if auth_code == 0: if os.path.exists(self.fullpath): if random.random() < 0.25 and calculate_md5(self.fullpath) != self.checksum: raise InvenioWebSubmitFileError, "File %s, version %i, for record %s is corrupted!" % (self.fullname, self.version, self.recid) stream_file(req, self.fullpath, self.fullname, self.mime, self.encoding, self.etag, self.checksum, self.fullurl) raise apache.SERVER_RETURN, apache.DONE else: req.status = apache.HTTP_NOT_FOUND raise InvenioWebSubmitFileError, "%s does not exist!" % self.fullpath else: raise InvenioWebSubmitFileError, "You are not authorized to download %s: %s" % (self.fullname, auth_message) def stream_file(req, fullpath, fullname=None, mime=None, encoding=None, etag=None, md5=None, location=None): """This is a generic function to stream a file to the user. If fullname, mime, encoding, and location are not provided they will be guessed based on req and fullpath. md5 should be passed as a hexadecimal string.
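The stream() method above spot-checks file integrity: on roughly a quarter of downloads it recomputes the MD5 and compares it to the stored checksum before serving the file. A minimal standalone sketch of that idea (the helper names below are illustrative, not Invenio's API; only `hashlib` and `random` from the standard library are used):

```python
import hashlib
import random

def calculate_md5(path):
    """Compute the hex MD5 digest of a file, reading it in 256 KB chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(262144), b''):
            digest.update(chunk)
    return digest.hexdigest()

def spot_check(path, stored_checksum, probability=0.25):
    """Return False when the random spot-check fires and the file is corrupted.

    Checking only a fraction of downloads keeps the common path cheap while
    still catching silent on-disk corruption eventually.
    """
    if random.random() < probability and calculate_md5(path) != stored_checksum:
        return False
    return True
```

A caller would raise its "file is corrupted" error whenever `spot_check` returns False, exactly as stream() raises InvenioWebSubmitFileError.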
""" def normal_streaming(size): req.set_content_length(size) req.send_http_header() if not req.header_only: req.sendfile(fullpath) return "" def single_range(size, the_range): req.set_content_length(the_range[1]) req.headers_out['Content-Range'] = 'bytes %d-%d/%d' % (the_range[0], the_range[0] + the_range[1] - 1, size) req.status = apache.HTTP_PARTIAL_CONTENT req.send_http_header() if not req.header_only: req.sendfile(fullpath, the_range[0], the_range[1]) return "" def multiple_ranges(size, ranges, mime): req.status = apache.HTTP_PARTIAL_CONTENT boundary = '%s%04d' % (time.strftime('THIS_STRING_SEPARATES_%Y%m%d%H%M%S'), random.randint(0, 9999)) req.content_type = 'multipart/byteranges; boundary=%s' % boundary content_length = 0 for arange in ranges: content_length += len('--%s\r\n' % boundary) content_length += len('Content-Type: %s\r\n' % mime) content_length += len('Content-Range: bytes %d-%d/%d\r\n' % (arange[0], arange[0] + arange[1] - 1, size)) content_length += len('\r\n') content_length += arange[1] content_length += len('\r\n') content_length += len('--%s--\r\n' % boundary) req.set_content_length(content_length) req.send_http_header() if not req.header_only: for arange in ranges: req.write('--%s\r\n' % boundary, 0) req.write('Content-Type: %s\r\n' % mime, 0) req.write('Content-Range: bytes %d-%d/%d\r\n' % (arange[0], arange[0] + arange[1] - 1, size), 0) req.write('\r\n', 0) req.sendfile(fullpath, arange[0], arange[1]) req.write('\r\n', 0) req.write('--%s--\r\n' % boundary) req.flush() return "" def parse_date(date): """According to <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.3> a date can come in three formats (in order of preference): Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123 Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036 Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format Moreover IE is adding some trailing information after a ';'. Wrong dates should be simpled ignored. 
This function returns the time in seconds since the epoch GMT or None in case of errors.""" if not date: return None try: date = date.split(';')[0].strip() # Because of IE ## Sun, 06 Nov 1994 08:49:37 GMT return time.mktime(time.strptime(date, '%a, %d %b %Y %X %Z')) except: try: ## Sunday, 06-Nov-94 08:49:37 GMT return time.mktime(time.strptime(date, '%A, %d-%b-%y %H:%M:%S %Z')) except: try: ## Sun Nov  6 08:49:37 1994 return time.mktime(time.strptime(date)) except: return None def parse_ranges(ranges): """According to <http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35> a (multiple) range request comes in the form: bytes=20-30,40-60,70-,-80 with the meaning: from byte 20 to 30 inclusive (11 bytes) from byte 40 to 60 inclusive (21 bytes) from byte 70 to (size - 1) inclusive (size - 70 bytes) from byte size - 80 to (size - 1) inclusive (80 bytes) This function will return the list of ranges in the form: [[first_byte, last_byte], ...] If first_byte or last_byte aren't specified they'll be set to None If the list is not well formatted it will return None """ try: if ranges.startswith('bytes') and '=' in ranges: ranges = ranges.split('=')[1].strip() else: return None ret = [] for arange in ranges.split(','): arange = arange.strip() if arange.startswith('-'): ret.append([None, int(arange[1:])]) elif arange.endswith('-'): ret.append([int(arange[:-1]), None]) else: ret.append(map(int, arange.split('-'))) return ret except: return None def parse_tags(tags): """Return a list of tags starting from a comma separated list.""" return [tag.strip() for tag in tags.split(',')] def fix_ranges(ranges, size): """Complementary to parse_ranges, it will transform all the ranges into (first_byte, length), adjusting all the values based on the actual size provided.
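The parse_ranges/fix_ranges pair described above — parse the Range header into `[first, last]` pairs, then clamp them against the real file size into `(offset, length)` tuples — can be sketched in Python 3 as follows (illustrative names, same semantics as the docstrings, assuming RFC 2616 syntax only):

```python
def parse_byte_ranges(header):
    """Parse 'bytes=20-30,70-,-80' into [first, last] pairs (None = open end)."""
    if not header.startswith('bytes') or '=' not in header:
        return None
    ranges = []
    for part in header.split('=', 1)[1].split(','):
        part = part.strip()
        try:
            if part.startswith('-'):
                ranges.append([None, int(part[1:])])   # suffix: last N bytes
            elif part.endswith('-'):
                ranges.append([int(part[:-1]), None])  # from offset to EOF
            else:
                first, last = part.split('-')
                ranges.append([int(first), int(last)])
        except ValueError:
            return None
    return ranges

def to_offset_length(ranges, size):
    """Turn [first, last] pairs into (offset, length) tuples clamped to size.

    Unsatisfiable ranges (start past EOF, zero length) are silently dropped,
    mirroring what fix_ranges does.
    """
    out = []
    for first, last in ranges:
        if first is None:                  # suffix range: the last `last` bytes
            first, length = max(0, size - last), min(last, size)
        elif last is None:                 # open-ended range
            length = size - first
        else:
            length = last - first + 1
        if 0 <= first < size and length > 0:
            out.append((first, min(length, size - first)))
    return out
```

For example, against a 100-byte file `bytes=20-30` yields `(20, 11)` and `bytes=-80` yields `(20, 80)`.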
""" ret = [] for arange in ranges: if (arange[0] is None and arange[1] > 0) or arange[0] < size: if arange[0] is None: arange[0] = size - arange[1] elif arange[1] is None: arange[1] = size - arange[0] else: arange[1] = arange[1] - arange[0] + 1 arange[0] = max(0, arange[0]) arange[1] = min(size - arange[0], arange[1]) if arange[1] > 0: ret.append(arange) return ret def get_normalized_headers(headers): """Strip and lowerize all the keys of the headers dictionary plus strip, lowerize and transform known headers value into their value.""" ret = { 'if-match' : None, 'unless-modified-since' : None, 'if-modified-since' : None, 'range' : None, 'if-range' : None, 'if-none-match' : None, } for key, value in req.headers_in.iteritems(): key = key.strip().lower() value = value.strip() if key in ('unless-modified-since', 'if-modified-since'): value = parse_date(value) elif key == 'range': value = parse_ranges(value) elif key == 'if-range': value = parse_date(value) or parse_tags(value) elif key in ('if-match', 'if-none-match'): value = parse_tags(value) if value: ret[key] = value return ret headers = get_normalized_headers(req.headers_in) if headers['if-match']: if etag is not None and etag not in headers['if-match']: raise apache.SERVER_RETURN, apache.HTTP_PRECONDITION_FAILED if os.path.exists(fullpath): mtime = os.path.getmtime(fullpath) if fullname is None: fullname = os.path.basename(fullpath) if mime is None: format = decompose_file(fullpath)[2] (mime, encoding) = _mimes.guess_type(fullpath) if mime is None: mime = "application/octet-stream" if location is None: location = req.uri req.content_type = mime req.encoding = encoding req.filename = fullname req.headers_out["Last-Modified"] = time.strftime('%a, %d %b %Y %X GMT', time.gmtime(mtime)) req.headers_out["Accept-Ranges"] = "bytes" req.headers_out["Content-Location"] = location if etag is not None: req.headers_out["ETag"] = etag if md5 is not None: req.headers_out["Content-MD5"] = 
base64.encodestring(binascii.unhexlify(md5.upper()))[:-1] req.headers_out["Content-Disposition"] = 'inline; filename="%s"' % fullname.replace('"', '\\"') size = os.path.getsize(fullpath) if not size: try: raise Exception, '%s exists but is empty' % fullpath except Exception: register_exception(req=req, alert_admin=True) raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND if headers['if-modified-since'] and headers['if-modified-since'] >= mtime: raise apache.SERVER_RETURN, apache.HTTP_NOT_MODIFIED if headers['if-none-match']: if etag is not None and etag in headers['if-none-match']: raise apache.SERVER_RETURN, apache.HTTP_NOT_MODIFIED if headers['unless-modified-since'] and headers['unless-modified-since'] < mtime: return normal_streaming(size) if headers['range']: try: if headers['if-range']: if etag is None or etag not in headers['if-range']: return normal_streaming(size) ranges = fix_ranges(headers['range'], size) except: return normal_streaming(size) if len(ranges) > 1: return multiple_ranges(size, ranges, mime) elif ranges: return single_range(size, ranges[0]) else: raise apache.SERVER_RETURN, apache.HTTP_RANGE_NOT_SATISFIABLE else: return normal_streaming(size) else: raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND def stream_restricted_icon(req): """Return the content of the "Restricted Icon" file.""" stream_file(req, '%s/img/restricted.gif' % CFG_WEBDIR) raise apache.SERVER_RETURN, apache.DONE def list_types_from_array(bibdocs): """Retrieves the list of types from the given bibdoc list.""" types = [] for bibdoc in bibdocs: if not bibdoc.get_type() in types: types.append(bibdoc.get_type()) return types def list_versions_from_array(docfiles): """Retrieve the list of existing versions from the given docfiles list.""" versions = [] for docfile in docfiles: if not docfile.get_version() in versions: versions.append(docfile.get_version()) return versions def order_files_with_version(docfile1, docfile2): """order docfile objects according to their version""" version1 
= docfile1.get_version() version2 = docfile2.get_version() return cmp(version2, version1) def _make_base_dir(docid): """Given a docid it returns the complete path that should host its files.""" group = "g" + str(int(int(docid) / CFG_WEBSUBMIT_FILESYSTEM_BIBDOC_GROUP_LIMIT)) return os.path.join(CFG_WEBSUBMIT_FILEDIR, group, str(docid)) class Md5Folder: """Manage all the MD5 checksums of a folder""" def __init__(self, folder): """Initialize the class from the md5 checksum of a given path""" self.folder = folder try: self.load() except InvenioWebSubmitFileError: self.md5s = {} self.update() def update(self, only_new = True): """Update the .md5 file with the current files. If only_new is specified then only files without an already calculated checksum are hashed.""" if not only_new: self.md5s = {} if os.path.exists(self.folder): for filename in os.listdir(self.folder): if filename not in self.md5s and not filename.startswith('.'): self.md5s[filename] = calculate_md5(os.path.join(self.folder, filename)) self.store() def store(self): """Store the current md5 dictionary into .md5""" try: old_umask = os.umask(022) md5file = open(os.path.join(self.folder, ".md5"), "w") for key, value in self.md5s.items(): md5file.write('%s *%s\n' % (value, key)) md5file.close() os.umask(old_umask) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while storing .md5 for folder '%s': '%s'" % (self.folder, e) def load(self): """Load .md5 into the md5 dictionary""" self.md5s = {} try: md5file = open(os.path.join(self.folder, ".md5"), "r") for row in md5file: md5hash = row[:32] filename = row[34:].strip() self.md5s[filename] = md5hash md5file.close() except IOError: self.update() except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while loading .md5 for folder '%s': '%s'" % (self.folder, e) def check(self, filename = ''): """Check the specified file, or all the files for which a hash exists, for being
coherent with the stored hash.""" if filename and filename in self.md5s.keys(): try: return self.md5s[filename] == calculate_md5(os.path.join(self.folder, filename)) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while loading '%s': '%s'" % (os.path.join(self.folder, filename), e) else: for filename, md5hash in self.md5s.items(): try: if calculate_md5(os.path.join(self.folder, filename)) != md5hash: return False except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while loading '%s': '%s'" % (os.path.join(self.folder, filename), e) return True def get_checksum(self, filename): """Return the checksum of a physical file.""" md5hash = self.md5s.get(filename, None) if md5hash is None: self.update() # Now it should not fail! md5hash = self.md5s[filename] return md5hash def calculate_md5_external(filename): """Calculate the md5 of a physical file through the md5sum Command Line Tool. This is suitable for files larger than 256Kb.""" try: md5_result = os.popen(CFG_PATH_MD5SUM + ' -b %s' % escape_shell_arg(filename)) ret = md5_result.read()[:32] md5_result.close() if len(ret) != 32: # Error in running md5sum. Let's fall back to internal # algorithm. return calculate_md5(filename, force_internal=True) else: return ret except Exception, e: raise InvenioWebSubmitFileError, "Encountered an exception while calculating md5 for file '%s': '%s'" % (filename, e) def calculate_md5(filename, force_internal=False): """Calculate the md5 of a physical file.
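Md5Folder persists its checksums in a `.md5` file using md5sum's binary-mode convention: 32 hex characters, a space, a `*`, then the filename, so load() can slice the hash out of `row[:32]` and the name out of `row[34:]`. A small round-trip sketch of that manifest format (standalone helpers, not Invenio's methods):

```python
def dump_md5_manifest(md5s):
    """Serialize {filename: hexdigest} in md5sum's binary-mode line format."""
    return ''.join('%s *%s\n' % (digest, name)
                   for name, digest in sorted(md5s.items()))

def load_md5_manifest(text):
    """Parse the manifest back: 32 hex chars, space, '*', then the filename."""
    md5s = {}
    for row in text.splitlines():
        if len(row) > 34:          # hash (32) + ' *' (2) + at least one char
            md5s[row[34:].strip()] = row[:32]
    return md5s
```

Using the fixed-width slice keeps the parser compatible with `md5sum -c`-style files while tolerating filenames that themselves contain spaces.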
This is suitable for files smaller than 256Kb.""" if not CFG_PATH_MD5SUM or force_internal or os.path.getsize(filename) < CFG_BIBDOCFILE_MD5_THRESHOLD: try: to_be_read = open(filename, "rb") computed_md5 = md5() while True: buf = to_be_read.read(CFG_BIBDOCFILE_MD5_BUFFER) if buf: computed_md5.update(buf) else: break to_be_read.close() return computed_md5.hexdigest() except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while calculating md5 for file '%s': '%s'" % (filename, e) else: return calculate_md5_external(filename) def bibdocfile_url_to_bibrecdocs(url): """Given an URL in the form CFG_SITE_[SECURE_]URL/record/xxx/files/... it returns a BibRecDocs object for the corresponding recid.""" recid = decompose_bibdocfile_url(url)[0] return BibRecDocs(recid) def bibdocfile_url_to_bibdoc(url): """Given an URL in the form CFG_SITE_[SECURE_]URL/record/xxx/files/... it returns a BibDoc object for the corresponding recid/docname.""" docname = decompose_bibdocfile_url(url)[1] return bibdocfile_url_to_bibrecdocs(url).get_bibdoc(docname) def bibdocfile_url_to_bibdocfile(url): """Given an URL in the form CFG_SITE_[SECURE_]URL/record/xxx/files/... it returns a BibDocFile object for the corresponding recid/docname/format.""" dummy, dummy, format = decompose_bibdocfile_url(url) return bibdocfile_url_to_bibdoc(url).get_file(format) def bibdocfile_url_to_fullpath(url): """Given an URL in the form CFG_SITE_[SECURE_]URL/record/xxx/files/... 
it returns the fullpath for the corresponding recid/docname/format.""" return bibdocfile_url_to_bibdocfile(url).get_full_path() def bibdocfile_url_p(url): """Return True when the url is a potential valid url pointing to a fulltext owned by a system.""" if url.startswith('%s/getfile.py' % CFG_SITE_URL) or url.startswith('%s/getfile.py' % CFG_SITE_SECURE_URL): return True if not (url.startswith('%s/record/' % CFG_SITE_URL) or url.startswith('%s/record/' % CFG_SITE_SECURE_URL)): return False splitted_url = url.split('/files/') return len(splitted_url) == 2 and splitted_url[0] != '' and splitted_url[1] != '' def decompose_bibdocfile_url(url): """Given a bibdocfile_url return a triple (recid, docname, format).""" if url.startswith('%s/getfile.py' % CFG_SITE_URL) or url.startswith('%s/getfile.py' % CFG_SITE_SECURE_URL): return decompose_bibdocfile_very_old_url(url) if url.startswith('%s/record/' % CFG_SITE_URL): recid_file = url[len('%s/record/' % CFG_SITE_URL):] elif url.startswith('%s/record/' % CFG_SITE_SECURE_URL): recid_file = url[len('%s/record/' % CFG_SITE_SECURE_URL):] else: raise InvenioWebSubmitFileError, "Url %s doesn't correspond to a valid record inside the system." % url recid_file = recid_file.replace('/files/', '/') recid, docname, format = decompose_file(urllib.unquote(recid_file)) if not recid and docname.isdigit(): ## If the URL was something similar to CFG_SITE_URL/record/123 return (int(docname), '', '') return (int(recid), docname, format) re_bibdocfile_old_url = re.compile(r'/record/(\d*)/files/') def decompose_bibdocfile_old_url(url): """Given a bibdocfile old url (e.g. CFG_SITE_URL/record/123/files) it returns the recid.""" g = re_bibdocfile_old_url.search(url) if g: return int(g.group(1)) raise InvenioWebSubmitFileError('%s is not a valid old bibdocfile url' % url) def decompose_bibdocfile_very_old_url(url): """Decompose an old /getfile.py? 
URL""" if url.startswith('%s/getfile.py' % CFG_SITE_URL) or url.startswith('%s/getfile.py' % CFG_SITE_SECURE_URL): params = urllib.splitquery(url)[1] if params: try: params = cgi.parse_qs(params) if 'docid' in params: docid = int(params['docid'][0]) bibdoc = BibDoc(docid) recid = bibdoc.get_recid() docname = bibdoc.get_docname() elif 'recid' in params: recid = int(params['recid'][0]) if 'name' in params: docname = params['name'][0] else: docname = '' else: raise InvenioWebSubmitFileError('%s has not enough params to correspond to a bibdocfile.' % url) format = normalize_format(params.get('format', [''])[0]) return (recid, docname, format) except Exception, e: raise InvenioWebSubmitFileError('Problem with %s: %s' % (url, e)) else: raise InvenioWebSubmitFileError('%s has no params to correspond to a bibdocfile.' % url) else: raise InvenioWebSubmitFileError('%s is not a valid very old bibdocfile url' % url) def nice_size(size): """Return a nicely printed size in kilo.""" unit = 'B' if size > 1024: size /= 1024.0 unit = 'KB' if size > 1024: size /= 1024.0 unit = 'MB' if size > 1024: size /= 1024.0 unit = 'GB' return '%s %s' % (websearch_templates.tmpl_nice_number(size, max_ndigits_after_dot=2), unit) def get_docname_from_url(url): """Return a potential docname given a url""" path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2] filename = os.path.split(path)[-1] return file_strip_ext(filename) def get_format_from_url(url): """Return a potential format given a url""" path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2] filename = os.path.split(path)[-1] return filename[len(file_strip_ext(filename)):] def clean_url(url): """Given a local url e.g. 
a local path, it returns its realpath.""" protocol = urllib2.urlparse.urlsplit(url)[0] if protocol in ('', 'file'): path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2] return os.path.abspath(path) else: return url def check_valid_url(url): """Check for validity of a url or a file.""" try: protocol = urllib2.urlparse.urlsplit(url)[0] if protocol in ('', 'file'): path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2] if os.path.abspath(path) != path: raise StandardError, "%s is not a normalized path (would be %s)." % (path, os.path.normpath(path)) for allowed_path in CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS + [CFG_TMPDIR, CFG_WEBSUBMIT_STORAGEDIR]: if path.startswith(allowed_path): dummy_fd = open(path) dummy_fd.close() return raise StandardError, "%s is not in one of the allowed paths." % path else: urllib2.urlopen(url) except Exception, e: raise StandardError, "%s is not a correct url: %s" % (url, e) def safe_mkstemp(suffix): """Create a temporary filename that doesn't have any '.' inside, apart from the suffix.""" tmpfd, tmppath = tempfile.mkstemp(suffix=suffix, dir=CFG_TMPDIR) if '.' not in suffix: # Just in case format is empty return tmpfd, tmppath while '.'
in os.path.basename(tmppath)[:-len(suffix)]: os.close(tmpfd) os.remove(tmppath) tmpfd, tmppath = tempfile.mkstemp(suffix=suffix, dir=CFG_TMPDIR) return (tmpfd, tmppath) def download_url(url, format, user=None, password=None, sleep=2): """Download a url (if it corresponds to a remote file) and return a local url to it.""" class my_fancy_url_opener(urllib.FancyURLopener): def __init__(self, user, password): urllib.FancyURLopener.__init__(self) self.fancy_user = user self.fancy_password = password def prompt_user_passwd(self, host, realm): return (self.fancy_user, self.fancy_password) format = normalize_format(format) protocol = urllib2.urlparse.urlsplit(url)[0] tmpfd, tmppath = safe_mkstemp(format) try: try: if protocol in ('', 'file'): path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2] if os.path.abspath(path) != path: raise StandardError, "%s is not a normalized path (would be %s)." % (path, os.path.normpath(path)) for allowed_path in CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS + [CFG_TMPDIR, CFG_WEBSUBMIT_STORAGEDIR]: if path.startswith(allowed_path): shutil.copy(path, tmppath) if os.path.getsize(tmppath) > 0: return tmppath else: raise StandardError, "%s seems to be empty" % url raise StandardError, "%s is not in one of the allowed paths." % path else: if user is not None: urlopener = my_fancy_url_opener(user, password) urlopener.retrieve(url, tmppath) else: urllib.urlretrieve(url, tmppath) #cmd_exit_code, cmd_out, cmd_err = run_shell_command(CFG_PATH_WGET + ' %s -O %s -t 2 -T 40', #(url, tmppath)) #if cmd_exit_code: #raise StandardError, "It's impossible to download %s: %s" % (url, cmd_err) if os.path.getsize(tmppath) > 0: return tmppath else: raise StandardError, "%s seems to be empty" % url except: os.remove(tmppath) raise finally: os.close(tmpfd) class BibDocMoreInfo: """Class to wrap the serialized bibdoc more_info. 
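safe_mkstemp above retries mkstemp until the temporary file's basename contains no dot other than the one in the requested suffix, because later code derives the format from everything after the first dot. A self-contained Python 3 sketch of the same loop (the `directory` parameter stands in for CFG_TMPDIR):

```python
import os
import tempfile

def safe_mkstemp(suffix, directory=None):
    """Create a temp file whose basename has no '.' apart from the suffix.

    tempfile normally generates dot-free random names, so the retry loop
    below almost never runs; it is a guard, not the common path.
    """
    fd, path = tempfile.mkstemp(suffix=suffix, dir=directory)
    if '.' not in suffix:  # just in case the format is empty
        return fd, path
    while '.' in os.path.basename(path)[:-len(suffix)]:
        os.close(fd)
        os.remove(path)
        fd, path = tempfile.mkstemp(suffix=suffix, dir=directory)
    return fd, path
```

Note the early return when the suffix itself has no dot: without it, `path[:-0]` would slice the whole name away and the loop test would be wrong.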
At the moment it stores descriptions and comments for each format/version.""" def __init__(self, docid, more_info=None): try: assert(type(docid) in (long, int) and docid > 0) self.docid = docid try: if more_info is None: res = run_sql('SELECT more_info FROM bibdoc WHERE id=%s', (docid, )) if res and res[0][0]: self.more_info = cPickle.loads(blob_to_string(res[0][0])) else: self.more_info = {} else: self.more_info = cPickle.loads(more_info) except: self.more_info = {} if 'descriptions' not in self.more_info: self.more_info['descriptions'] = {} if 'comments' not in self.more_info: self.more_info['comments'] = {} if 'hidden' not in self.more_info: self.more_info['hidden'] = {} except: register_exception() raise def flush(self): """If __dirty is True, reserialize to the DB.""" run_sql('UPDATE bibdoc SET more_info=%s WHERE id=%s', (cPickle.dumps(self.more_info), self.docid)) def get_comment(self, format, version): """Return the comment corresponding to the given docid/format/version.""" try: assert(type(version) is int) format = normalize_format(format) return self.more_info['comments'].get(version, {}).get(format) except: register_exception() raise def get_description(self, format, version): """Return the description corresponding to the given docid/format/version.""" try: assert(type(version) is int) format = normalize_format(format) return self.more_info['descriptions'].get(version, {}).get(format) except: register_exception() raise def hidden_p(self, format, version): """Is the format/version hidden?""" try: assert(type(version) is int) format = normalize_format(format) return self.more_info['hidden'].get(version, {}).get(format, False) except: register_exception() raise def set_comment(self, comment, format, version): """Store a comment corresponding to the given docid/format/version.""" try: assert(type(version) is int and version > 0) format = normalize_format(format) if comment == KEEP_OLD_VALUE: comment = self.get_comment(format, version) or self.get_comment(format, version - 1)
if not comment: self.unset_comment(format, version) self.flush() return if not version in self.more_info['comments']: self.more_info['comments'][version] = {} self.more_info['comments'][version][format] = comment self.flush() except: register_exception() raise def set_description(self, description, format, version): """Store a description corresponding to the given docid/format/version.""" try: assert(type(version) is int and version > 0) format = normalize_format(format) if description == KEEP_OLD_VALUE: description = self.get_description(format, version) or self.get_description(format, version - 1) if not description: self.unset_description(format, version) self.flush() return if not version in self.more_info['descriptions']: self.more_info['descriptions'][version] = {} self.more_info['descriptions'][version][format] = description self.flush() except: register_exception() raise def set_hidden(self, hidden, format, version): """Store whether the docid/format/version is hidden.""" try: assert(type(version) is int and version > 0) format = normalize_format(format) if not hidden: self.unset_hidden(format, version) self.flush() return if not version in self.more_info['hidden']: self.more_info['hidden'][version] = {} self.more_info['hidden'][version][format] = hidden self.flush() except: register_exception() raise def unset_comment(self, format, version): """Remove a comment.""" try: assert(type(version) is int and version > 0) del self.more_info['comments'][version][format] self.flush() except KeyError: pass except: register_exception() raise def unset_description(self, format, version): """Remove a description.""" try: assert(type(version) is int and version > 0) del self.more_info['descriptions'][version][format] self.flush() except KeyError: pass except: register_exception() raise def unset_hidden(self, format, version): """Remove hidden flag.""" try: assert(type(version) is int and version > 0) del self.more_info['hidden'][version][format] self.flush() except
KeyError: pass except: register_exception() raise def serialize(self): """Return the serialized version of the more_info.""" return cPickle.dumps(self.more_info) def readfile(filename): """Try to read a file. Return '' in case of any error. This function is useful for quick implementation of websubmit functions. """ try: fd = open(filename) content = fd.read() fd.close() return content except: return '' diff --git a/modules/websubmit/lib/websubmit_engine.py b/modules/websubmit/lib/websubmit_engine.py index 83470e8a1..48af4271e 100644 --- a/modules/websubmit/lib/websubmit_engine.py +++ b/modules/websubmit/lib/websubmit_engine.py @@ -1,1641 +1,1636 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """WebSubmit: the mechanism for the submission of new records into CDS Invenio via a Web interface. 
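The readfile helper above deliberately swallows every error and returns '', which lets WebSubmit functions read optional working-directory files without guarding each call. The same contract in a Python 3 sketch (catching only I/O errors rather than a bare except):

```python
def readfile(filename):
    """Read a whole file; return '' on any error (missing file, bad perms, ...)."""
    try:
        with open(filename) as fd:
            return fd.read()
    except (IOError, OSError):
        return ''
```

Callers can then write `value = readfile(os.path.join(curdir, field))` and treat a missing field file and an empty one identically.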
""" __revision__ = "$Id$" ## import interesting modules: import string import os import sys import time import types import re from urllib import quote_plus from cgi import escape -try: - from mod_python import apache -except ImportError: - pass - from invenio.config import \ CFG_BINDIR, \ CFG_SITE_LANG, \ CFG_SITE_NAME, \ CFG_SITE_URL, \ CFG_PYLIBDIR, \ CFG_WEBSUBMIT_STORAGEDIR, \ CFG_VERSION from invenio.dbquery import run_sql, Error from invenio.access_control_engine import acc_authorize_action from invenio.access_control_admin import acc_is_role from invenio.webpage import page, create_error_box from invenio.webuser import getUid, get_email, collect_user_info from invenio.websubmit_config import * from invenio.messages import gettext_set_language, wash_language from invenio.errorlib import register_exception from websubmit_dblayer import \ get_storage_directory_of_action, \ get_longname_of_doctype, \ get_longname_of_action, \ get_num_pages_of_submission, \ get_parameter_value_for_doctype, \ submission_exists_in_log, \ log_new_pending_submission, \ log_new_completed_submission, \ update_submission_modified_date_in_log, \ update_submission_reference_in_log, \ update_submission_reference_and_status_in_log, \ get_form_fields_on_submission_page, \ get_element_description, \ get_element_check_description, \ get_form_fields_not_on_submission_page, \ function_step_is_last, \ get_collection_children_of_submission_collection, \ get_submission_collection_name, \ get_doctype_children_of_submission_collection, \ get_categories_of_doctype, \ get_doctype_details, \ get_actions_on_submission_page_for_doctype, \ get_action_details, \ get_parameters_of_function, \ get_details_of_submission, \ get_functions_for_submission_step, \ get_submissions_at_level_X_with_score_above_N, \ submission_is_finished import invenio.template websubmit_templates = invenio.template.load('websubmit') def interface(req, c=CFG_SITE_NAME, ln=CFG_SITE_LANG, doctype="", act="", startPg=1, access="", 
mainmenu="", fromdir="", nextPg="", nbPg="", curpage=1): """This function is called after a user has visited a document type's "homepage" and selected the type of "action" to perform. Having clicked an action-button (e.g. "Submit a New Record"), this function will be called . It performs the task of initialising a new submission session (retrieving information about the submission, creating a working submission-directory, etc), and "drawing" a submission page containing the WebSubmit form that the user uses to input the metadata to be submitted. When a user moves between pages in the submission interface, this function is recalled so that it can save the metadata entered into the previous page by the user, and draw the current submission-page. Note: During a submission, for each page refresh, this function will be called while the variable "step" (a form variable, seen by websubmit_webinterface, which calls this function) is 0 (ZERO). In other words, this function handles the FRONT-END phase of a submission, BEFORE the WebSubmit functions are called. @param req: (apache request object) *** NOTE: Added into this object, is a variable called "form" (req.form). This is added into the object in the index function of websubmit_webinterface. It contains a "mod_python.util.FieldStorage" instance, that contains the form-fields found on the previous submission page. @param c: (string), defaulted to CFG_SITE_NAME. The name of the CDS Invenio installation. @param ln: (string), defaulted to CFG_SITE_LANG. The language in which to display the pages. @param doctype: (string) - the doctype ID of the doctype for which the submission is being made. @param act: (string) - The ID of the action being performed (e.g. submission of bibliographic information; modification of bibliographic information, etc). @param startPg: (integer) - Starting page for the submission? Defaults to 1. 
@param indir: (string) - the directory used to store all submissions of the given "type" of this submission. For example, if the submission is of the type "modify bibliographic information", this variable would contain "modify". @param access: (string) - the "access" number for the submission (e.g. 1174062451_7010). This number is also used as the name for the current working submission directory. @param mainmenu: (string) - contains the URL (minus the CDS Invenio home stem) for the submission's home-page. (E.g. If this submission is "PICT", the "mainmenu" file would contain "/submit?doctype=PICT". @param fromdir: (integer) @param nextPg: (string) @param nbPg: (string) @param curpage: (integer) - the current submission page number. Defaults to 1. """ ln = wash_language(ln) # load the right message language _ = gettext_set_language(ln) sys.stdout = req # get user ID: uid = getUid(req) uid_email = get_email(uid) # variable initialisation t = "" field = [] fieldhtml = [] level = [] fullDesc = [] text = [] check = [] select = [] radio = [] upload = [] txt = [] noPage = [] # Preliminary tasks # check that the user is logged in if not uid_email or uid_email == "guest": return warningMsg(websubmit_templates.tmpl_warning_message( ln = ln, msg = _("Sorry, you must log in to perform this action.") ), req, ln) # warningMsg("""<center><font color="red"></font></center>""",req, ln) # check we have minimum fields if not doctype or not act or not access: ## We don't have all the necessary information to go ahead ## with this submission: return warningMsg(_("Not enough information to go ahead with the submission."), req, c, ln) try: assert(not access or re.match('\d+_\d+', access)) except AssertionError: register_exception(req=req, prefix='doctype="%s", access="%s"' % (doctype, access)) return warningMsg(_("Invalid parameters"), req, c, ln) if doctype and act: ## Let's clean the input details = get_details_of_submission(doctype, act) if not details: return warningMsg(_("Invalid 
doctype and act parameters"), req, c, ln) doctype = details[0] act = details[1] ## Before continuing to display the submission form interface, ## verify that this submission has not already been completed: if submission_is_finished(doctype, act, access, uid_email): ## This submission has already been completed. ## This situation can arise when, having completed a submission, ## the user uses the browser's back-button to go back to the form ## stage of the submission and then tries to submit once more. ## This is unsafe and should not be allowed. Instead of re-displaying ## the submission forms, display an error message to the user: wrnmsg = """<b>This submission has been completed. Please go to the""" \ """ <a href="/submit?doctype=%(doctype)s&ln=%(ln)s">""" \ """main menu</a> to start a new submission.</b>""" \ % { 'doctype' : quote_plus(doctype), 'ln' : ln } return warningMsg(wrnmsg, req) ## retrieve the action and doctype data: ## Concatenate action ID and doctype ID to make the submission ID: subname = "%s%s" % (act, doctype) ## Get the submission storage directory from the DB: submission_dir = get_storage_directory_of_action(act) if submission_dir: indir = submission_dir else: ## Unable to determine the submission-directory: return warningMsg(_("Unable to find the submission directory for the action: %s") % escape(str(act)), req, c, ln) ## get the document type's long-name: doctype_lname = get_longname_of_doctype(doctype) if doctype_lname is not None: ## Got the doctype long-name: replace spaces with HTML chars: docname = doctype_lname.replace(" ", "&nbsp;") else: ## Unknown document type: return warningMsg(_("Unknown document type"), req, c, ln) ## get the action's long-name: actname = get_longname_of_action(act) if actname is None: ## Unknown action: return warningMsg(_("Unknown action"), req, c, ln) ## Get the number of pages for this submission: num_submission_pages = get_num_pages_of_submission(subname) if num_submission_pages is not None: nbpages =
num_submission_pages else: ## Unable to determine the number of pages for this submission: return warningMsg(_("Unable to determine the number of submission pages."), req, c, ln) ## If unknown, get the current page of submission: if startPg != "" and curpage in ("", 0): curpage = startPg ## retrieve the name of the file in which the reference of ## the submitted document will be stored rn_filename = get_parameter_value_for_doctype(doctype, "edsrn") if rn_filename is not None: edsrn = rn_filename else: ## Unknown value for edsrn - set it to an empty string: edsrn = "" ## This defines the path to the directory containing the action data curdir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR, indir, doctype, access) try: assert(curdir == os.path.abspath(curdir)) except AssertionError: register_exception(req=req, prefix='indir="%s", doctype="%s", access="%s"' % (indir, doctype, access)) return warningMsg(_("Invalid parameters"), req, c, ln) ## if this submission comes from another one (fromdir is then set) ## We retrieve the previous submission directory and put it in the proper one if fromdir != "": olddir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR, fromdir, doctype, access) try: assert(olddir == os.path.abspath(olddir)) except AssertionError: register_exception(req=req, prefix='fromdir="%s", doctype="%s", access="%s"' % (fromdir, doctype, access)) return warningMsg(_("Invalid parameters"), req, c, ln) if os.path.exists(olddir): os.rename(olddir, curdir) ## If the submission directory still does not exist, we create it if not os.path.exists(curdir): try: os.makedirs(curdir) except Exception, e: register_exception(req=req, alert_admin=True) return warningMsg(_("Unable to create a directory for this submission. 
The administrator has been alerted."), req, c, ln) # retrieve the original main menu url and save it in the "mainmenu" file if mainmenu != "": fp = open(os.path.join(curdir, "mainmenu"), "w") fp.write(mainmenu) fp.close() # and if the file containing the URL to the main menu exists # we retrieve it and store it in the $mainmenu variable if os.path.exists(os.path.join(curdir, "mainmenu")): fp = open(os.path.join(curdir, "mainmenu"), "r"); mainmenu = fp.read() fp.close() else: mainmenu = "%s/submit" % (CFG_SITE_URL,) # various authentication related tasks... if uid_email != "guest" and uid_email != "": #First save the username (email address) in the SuE file. This way bibconvert will be able to use it if needed fp = open(os.path.join(curdir, "SuE"), "w") fp.write(uid_email) fp.close() # is user authorized to perform this action? (auth_code, auth_message) = acc_authorize_action(req, "submit", verbose=0, doctype=doctype, act=act) if acc_is_role("submit", doctype=doctype, act=act) and auth_code != 0: return warningMsg("""<center><font color="red">%s</font></center>""" % auth_message, req) ## update the "journal of submission": ## Does the submission already exist in the log? submission_exists = \ submission_exists_in_log(doctype, act, access, uid_email) if submission_exists == 1: ## update the modification-date of this submission in the log: update_submission_modified_date_in_log(doctype, act, access, uid_email) else: ## Submission doesn't exist in log - create it: log_new_pending_submission(doctype, act, access, uid_email) ## Let's write in curdir file under curdir the curdir value ## in case e.g. it is needed in FFT. 
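Immediately below, every value posted from the previous form page is written into a plain-text file named after the field, inside the working submission directory — this file-per-field convention is what the later WebSubmit functions read back. A minimal stand-alone sketch of that convention (the helper name and the sample data are hypothetical, and the real code additionally washes values through specialchars()):

```python
import os

def save_form_fields(curdir, form):
    # Persist each form field as a text file named after the field,
    # mirroring WebSubmit's curdir convention; list-valued fields
    # ("name[]") are written one item per line.
    for key, value in form.items():
        filename = key.replace("[]", "")
        path = os.path.join(curdir, filename)
        with open(path, "w") as fp:
            if isinstance(value, list):
                fp.write("".join("%s\n" % item for item in value))
            else:
                fp.write(value)
```

Usage: save_form_fields("/tmp/websubmit-demo", {"title": "My thesis"}) leaves a file "title" containing the submitted value, which a later function step can read by name.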
fp = open(os.path.join(curdir, "curdir"), "w") fp.write(curdir) fp.close() # Save the form fields entered in the previous submission page # If the form was sent with the GET method form = dict(req.form) value = "" # we parse all the form variables for key, formfields in form.items(): filename = key.replace("[]", "") file_to_open = os.path.join(curdir, filename) try: assert(file_to_open == os.path.abspath(file_to_open)) except AssertionError: register_exception(req=req, prefix='curdir="%s", filename="%s"' % (curdir, filename)) return warningMsg(_("Invalid parameters"), req, c, ln) # the field is an array if isinstance(formfields, types.ListType): fp = open(file_to_open, "w") for formfield in formfields: #stripslashes(value) value = specialchars(formfield) fp.write(value+"\n") fp.close() # the field is a normal string elif isinstance(formfields, types.StringTypes) and formfields != "": value = formfields fp = open(file_to_open, "w") fp.write(specialchars(value)) fp.close() # the field is a file elif hasattr(formfields,"filename") and formfields.filename: dir_to_open = os.path.join(curdir, 'files', key) try: assert(dir_to_open == os.path.abspath(dir_to_open)) assert(dir_to_open.startswith(CFG_WEBSUBMIT_STORAGEDIR)) except AssertionError: register_exception(req=req, prefix='curdir="%s", key="%s"' % (curdir, key)) return warningMsg(_("Invalid parameters"), req, c, ln) if not os.path.exists(dir_to_open): try: os.makedirs(dir_to_open) except: register_exception(req=req, alert_admin=True) return warningMsg(_("Cannot create submission directory. The administrator has been alerted."), req, c, ln) filename = formfields.filename ## Before saving the file to disc, wash the filename (in particular ## washing away UNIX and Windows (e.g. 
DFS) paths): filename = os.path.basename(filename.split('\\')[-1]) filename = filename.strip() if filename != "": # stream the uploaded file to disc in fixed-size chunks fp = open(os.path.join(dir_to_open, filename), "w") chunk = formfields.file.read(10240) while chunk: fp.write(chunk) chunk = formfields.file.read(10240) fp.close() fp = open(os.path.join(curdir, "lastuploadedfile"), "w") fp.write(filename) fp.close() fp = open(file_to_open, "w") fp.write(filename) fp.close() else: return warningMsg(_("No file uploaded?"), req, c, ln) ## if the found field is the reference of the document, ## save this value in the "journal of submissions": if uid_email != "" and uid_email != "guest": if key == edsrn: update_submission_reference_in_log(doctype, access, uid_email, value) ## create the interface: subname = "%s%s" % (act, doctype) ## Get all of the form fields that appear on this page, ordered by fieldnum: form_fields = get_form_fields_on_submission_page(subname, curpage) full_fields = [] values = [] for field_instance in form_fields: full_field = {} ## Retrieve the field's description: element_descr = get_element_description(field_instance[3]) try: assert(element_descr is not None) except AssertionError: msg = _("Unknown form field found on submission page.") register_exception(req=req, alert_admin=True, prefix=msg) ## The form field doesn't seem to exist - return with error message: return warningMsg(_("Unknown form field found on submission page."), req, c, ln) if element_descr[8] is None: val = "" else: val = element_descr[8] ## we also retrieve and add the javascript code of the checking function, if needed ## Set it to empty string to begin with: full_field['javascript'] = '' if field_instance[7] != '': check_descr = get_element_check_description(field_instance[7]) if check_descr is not None: ## Retrieved the check description: full_field['javascript'] = check_descr full_field['type'] = element_descr[3] full_field['name'] = field_instance[3] full_field['rows'] = element_descr[5]
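Two details of the upload handling above are worth isolating: the filename is "washed" down to a bare basename (stripping both UNIX and Windows-style path components) before it ever touches the disc, and the file body is copied in fixed-size chunks rather than read wholesale into memory. A sketch of both, assuming a standard file-like source (the helper names are hypothetical):

```python
import os

def wash_filename(filename):
    # Keep only the basename, discarding UNIX ("/a/b/c") and
    # Windows ("C:\a\b\c") path components, as the code above does.
    return os.path.basename(filename.split('\\')[-1]).strip()

def copy_in_chunks(src, dst, chunk_size=10240):
    # Stream from src to dst without loading the whole body in memory;
    # the empty read signalling end-of-file terminates the loop.
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
```

The washing step is what prevents a crafted filename such as "../../passwd" or "C:\boot.ini" from escaping the per-submission 'files' directory.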
full_field['cols'] = element_descr[6] full_field['val'] = val full_field['size'] = element_descr[4] full_field['maxlength'] = element_descr[7] full_field['htmlcode'] = element_descr[9] full_field['typename'] = field_instance[1] ## TODO: Investigate this, Not used? ## It also seems to refer to pagenum. # The 'R' fields must be executed in the engine's environment, # as the runtime functions access some global and local # variables. if full_field ['type'] == 'R': co = compile (full_field ['htmlcode'].replace("\r\n","\n"), "<string>", "exec") exec(co) else: text = websubmit_templates.tmpl_submit_field (ln = ln, field = full_field) # we now determine the exact type of the created field if full_field['type'] not in [ 'D','R']: field.append(full_field['name']) level.append(field_instance[5]) fullDesc.append(field_instance[4]) txt.append(field_instance[6]) check.append(field_instance[7]) # If the field is not user-defined, we try to determine its type # (select, radio, file upload...) # check whether it is a select field or not if re.search("SELECT", text, re.IGNORECASE) is not None: select.append(1) else: select.append(0) # checks whether it is a radio field or not if re.search(r"TYPE=[\"']?radio", text, re.IGNORECASE) is not None: radio.append(1) else: radio.append(0) # checks whether it is a file upload or not if re.search(r"TYPE=[\"']?file", text, re.IGNORECASE) is not None: upload.append(1) else: upload.append(0) # if the field description contains the "<COMBO>" string, replace # it by the category selected on the document page submission page combofile = "combo%s" % doctype if os.path.exists("%s/%s" % (curdir, combofile)): f = open("%s/%s" % (curdir, combofile), "r") combo = f.read() f.close() else: combo="" text = text.replace("<COMBO>", combo) # if there is a <YYYY> tag in it, replace it by the current year year = time.strftime("%Y"); text = text.replace("<YYYY>", year) # if there is a <TODAY> tag in it, replace it by the current year today = 
time.strftime("%d/%m/%Y"); text = text.replace("<TODAY>", today) fieldhtml.append(text) else: select.append(0) radio.append(0) upload.append(0) # field.append(value) - initial version, not working with JS, taking a submitted value field.append(field_instance[3]) level.append(field_instance[5]) txt.append(field_instance[6]) fullDesc.append(field_instance[4]) check.append(field_instance[7]) fieldhtml.append(text) full_field['fullDesc'] = field_instance[4] full_field['text'] = text # If a file exists with the name of the field we extract the saved value text = '' if os.path.exists(os.path.join(curdir, full_field['name'])): file = open(os.path.join(curdir, full_field['name']), "r"); text = file.read() text = re.compile("[\n\r]*$").sub("", text) text = re.compile("\n").sub("\\n", text) text = re.compile("\r").sub("", text) file.close() values.append(text) full_fields.append(full_field) returnto = {} if int(curpage) == int(nbpages): subname = "%s%s" % (act, doctype) other_form_fields = \ get_form_fields_not_on_submission_page(subname, curpage) nbFields = 0 message = "" fullcheck_select = [] fullcheck_radio = [] fullcheck_upload = [] fullcheck_field = [] fullcheck_level = [] fullcheck_txt = [] fullcheck_noPage = [] fullcheck_check = [] for field_instance in other_form_fields: if field_instance[5] == "M": ## If this field is mandatory, get its description: element_descr = get_element_description(field_instance[3]) try: assert(element_descr is not None) except AssertionError: msg = _("Unknown form field found on submission page.") register_exception(req=req, alert_admin=True, prefix=msg) ## The form field doesn't seem to exist - return with error message: return warningMsg(_("Unknown form field found on submission page."), req, c, ln) if element_descr[3] in ['D', 'R']: if element_descr[3] == "D": text = element_descr[9] else: text = eval(element_descr[9]) formfields = text.split(">") for formfield in formfields: match = re.match("name=([^ <>]+)", formfield, re.IGNORECASE) 
if match is not None: names = match.groups() for value in names: if value != "": value = re.compile("[\"']+").sub("", value) fullcheck_field.append(value) fullcheck_level.append(field_instance[5]) fullcheck_txt.append(field_instance[6]) fullcheck_noPage.append(field_instance[1]) fullcheck_check.append(field_instance[7]) nbFields = nbFields + 1 else: fullcheck_noPage.append(field_instance[1]) fullcheck_field.append(field_instance[3]) fullcheck_level.append(field_instance[5]) fullcheck_txt.append(field_instance[6]) fullcheck_check.append(field_instance[7]) nbFields = nbFields+1 # tests each mandatory field fld = 0 res = 1 for i in xrange(nbFields): res = 1 if not os.path.exists(os.path.join(curdir, fullcheck_field[i])): res=0 else: file = open(os.path.join(curdir, fullcheck_field[i]), "r") text = file.read() if text == '': res=0 else: if text == "Select:": res=0 if res == 0: fld = i break if not res: returnto = { 'field' : fullcheck_txt[fld], 'page' : fullcheck_noPage[fld], } t += websubmit_templates.tmpl_page_interface( ln = ln, docname = docname, actname = actname, curpage = curpage, nbpages = nbpages, nextPg = nextPg, access = access, nbPg = nbPg, doctype = doctype, act = act, fields = full_fields, javascript = websubmit_templates.tmpl_page_interface_js( ln = ln, upload = upload, field = field, fieldhtml = fieldhtml, txt = txt, check = check, level = level, curdir = curdir, values = values, select = select, radio = radio, curpage = curpage, nbpages = nbpages, returnto = returnto, ), mainmenu = mainmenu, ) t += websubmit_templates.tmpl_page_do_not_leave_submission_js(ln) # start display: req.content_type = "text/html" req.send_http_header() p_navtrail = """<a href="/submit?ln=%(ln)s" class="navtrail">%(submit)s</a> > <a href="/submit?doctype=%(doctype)s&ln=%(ln)s" class="navtrail">%(docname)s</a> """ % { 'submit' : _("Submit"), 'doctype' : quote_plus(doctype), 'docname' : docname, 'ln' : ln } return page(title= actname, body = t, navtrail = p_navtrail, description =
"submit documents", keywords = "submit", uid = uid, language = ln, req = req, navmenuid='submit') def endaction(req, c=CFG_SITE_NAME, ln=CFG_SITE_LANG, doctype="", act="", startPg=1, access="", mainmenu="", fromdir="", nextPg="", nbPg="", curpage=1, step=1, mode="U"): """Having filled-in the WebSubmit form created for metadata by the interface function, the user clicks a button to either "finish the submission" or to "proceed" to the next stage of the submission. At this point, a variable called "step" will be given a value of 1 or above, which means that this function is called by websubmit_webinterface. So, during all non-zero steps of the submission, this function is called. In other words, this function is called during the BACK-END phase of a submission, in which WebSubmit *functions* are being called. The function first ensures that all of the WebSubmit form field values have been saved in the current working submission directory, in text- files with the same name as the field elements have. It then determines the functions to be called for the given step of the submission, and executes them. Following this, if this is the last step of the submission, it logs the submission as "finished" in the journal of submissions. @param req: (apache request object) *** NOTE: Added into this object, is a variable called "form" (req.form). This is added into the object in the index function of websubmit_webinterface. It contains a "mod_python.util.FieldStorage" instance, that contains the form-fields found on the previous submission page. @param c: (string), defaulted to CFG_SITE_NAME. The name of the CDS Invenio installation. @param ln: (string), defaulted to CFG_SITE_LANG. The language in which to display the pages. @param doctype: (string) - the doctype ID of the doctype for which the submission is being made. @param act: (string) - The ID of the action being performed (e.g. submission of bibliographic information; modification of bibliographic information, etc). 
@param startPg: (integer) - Starting page for the submission? Defaults to 1. @param indir: (string) - the directory used to store all submissions of the given "type" of this submission. For example, if the submission is of the type "modify bibliographic information", this variable would contain "modify". @param access: (string) - the "access" number for the submission (e.g. 1174062451_7010). This number is also used as the name for the current working submission directory. @param mainmenu: (string) - contains the URL (minus the CDS Invenio home stem) for the submission's home-page. (E.g. If this submission is "PICT", the "mainmenu" file would contain "/submit?doctype=PICT". @param fromdir: @param nextPg: @param nbPg: @param curpage: (integer) - the current submission page number. Defaults to 1. @param step: (integer) - the current step of the submission. Defaults to 1. @param mode: """ global rn, sysno, dismode, curdir, uid, uid_email, last_step, action_score # load the right message language _ = gettext_set_language(ln) try: rn except NameError: rn = "" dismode = mode ln = wash_language(ln) sys.stdout = req t = "" # get user ID: uid = getUid(req) uid_email = get_email(uid) # Preliminary tasks # check that the user is logged in if uid_email == "" or uid_email == "guest": return warningMsg(websubmit_templates.tmpl_warning_message( ln = ln, msg = _("Sorry, you must log in to perform this action.") ), req, ln) ## check we have minimum fields if not doctype or not act or not access: ## We don't have all the necessary information to go ahead ## with this submission: return warningMsg(_("Not enough information to go ahead with the submission."), req, c, ln) if doctype and act: ## Let's clean the input details = get_details_of_submission(doctype, act) if not details: return warningMsg(_("Invalid doctype and act parameters"), req, c, ln) doctype = details[0] act = details[1] try: assert(not access or re.match('\d+_\d+', access)) except AssertionError: 
register_exception(req=req, prefix='doctype="%s", access="%s"' % (doctype, access)) return warningMsg(_("Invalid parameters"), req, c, ln) ## Before continuing to process the submitted data, verify that ## this submission has not already been completed: if submission_is_finished(doctype, act, access, uid_email): ## This submission has already been completed. ## This situation can arise when, having completed a submission, ## the user uses the browser's back-button to go back to the form ## stage of the submission and then tries to submit once more. ## This is unsafe and should not be allowed. Instead of re-processing ## the submitted data, display an error message to the user: wrnmsg = """<b>This submission has been completed. Please go to the""" \ """ <a href="/submit?doctype=%(doctype)s&ln=%(ln)s">""" \ """main menu</a> to start a new submission.</b>""" \ % { 'doctype' : quote_plus(doctype), 'ln' : ln } return warningMsg(wrnmsg, req) ## retrieve the action and doctype data ## Get the submission storage directory from the DB: submission_dir = get_storage_directory_of_action(act) if submission_dir: indir = submission_dir else: ## Unable to determine the submission-directory: return warningMsg(_("Unable to find the submission directory for the action: %s") % escape(str(act)), req, c, ln) # The following words are reserved and should not be used as field names reserved_words = ["stop", "file", "nextPg", "startPg", "access", "curpage", "nbPg", "act", \ "indir", "doctype", "mode", "step", "deleted", "file_path", "userfile_name"] # This defines the path to the directory containing the action data curdir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR, indir, doctype, access) try: assert(curdir == os.path.abspath(curdir)) except AssertionError: register_exception(req=req, prefix='indir="%s", doctype=%s, access=%s' % (indir, doctype, access)) return warningMsg(_("Invalid parameters"), req, c, ln) ## If the submission directory still does not exist, we create it if not 
os.path.exists(curdir): try: os.makedirs(curdir) except Exception, e: register_exception(req=req, alert_admin=True) return warningMsg(_("Unable to create a directory for this submission. The administrator has been alerted."), req, c, ln) # retrieve the original main menu url and save it in the "mainmenu" file if mainmenu != "": fp = open(os.path.join(curdir, "mainmenu"), "w") fp.write(mainmenu) fp.close() # and if the file containing the URL to the main menu exists # we retrieve it and store it in the $mainmenu variable if os.path.exists(os.path.join(curdir, "mainmenu")): fp = open(os.path.join(curdir, "mainmenu"), "r"); mainmenu = fp.read() fp.close() else: mainmenu = "%s/submit" % (CFG_SITE_URL,) ## retrieve the name of the file in which the reference of ## the submitted document will be stored rn_filename = get_parameter_value_for_doctype(doctype, "edsrn") if rn_filename is not None: edsrn = rn_filename else: ## Unknown value for edsrn - set it to an empty string: edsrn = "" ## Determine whether the action is finished ## (ie there are no other steps after the current one): finished = function_step_is_last(doctype, act, step) ## Let's write in curdir file under curdir the curdir value ## in case e.g. it is needed in FFT.
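As in interface() above, the working directory was built as CFG_WEBSUBMIT_STORAGEDIR/indir/doctype/access and then compared with its os.path.abspath() form, so that an access value containing ".." cannot escape the storage tree. That guard can be sketched on its own as follows (the storage root below is a hypothetical example, not the real configuration value):

```python
import os

# Hypothetical stand-in for CFG_WEBSUBMIT_STORAGEDIR:
STORAGE_ROOT = "/opt/invenio/var/data/submit/storage"

def safe_curdir(indir, doctype, access):
    # Build the working submission directory and reject any parameter
    # that makes the joined path differ from its absolute form
    # (e.g. an access token containing "../").
    curdir = os.path.join(STORAGE_ROOT, indir, doctype, access)
    if curdir != os.path.abspath(curdir):
        raise ValueError("invalid submission path parameters")
    return curdir
```

Because os.path.abspath() normalises "." and ".." components, any traversal attempt changes the string and trips the comparison, which is why the engine registers an exception and returns "Invalid parameters" in that case.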
fp = open(os.path.join(curdir, "curdir"), "w") fp.write(curdir) fp.close() # Save the form fields entered in the previous submission page # If the form was sent with the GET method form = req.form value = "" # we parse all the form variables for key in form.keys(): formfields = form[key] filename = key.replace("[]", "") file_to_open = os.path.join(curdir, filename) try: assert(file_to_open == os.path.abspath(file_to_open)) assert(file_to_open.startswith(CFG_WEBSUBMIT_STORAGEDIR)) except AssertionError: register_exception(req=req, prefix='curdir="%s", filename="%s"' % (curdir, filename)) return warningMsg(_("Invalid parameters"), req, c, ln) # the field is an array if isinstance(formfields,types.ListType): fp = open(file_to_open, "w") for formfield in formfields: #stripslashes(value) value = specialchars(formfield) fp.write(value+"\n") fp.close() # the field is a normal string elif isinstance(formfields, types.StringTypes) and formfields != "": value = formfields fp = open(file_to_open, "w") fp.write(specialchars(value)) fp.close() # the field is a file elif hasattr(formfields, "filename") and formfields.filename: dir_to_open = os.path.join(curdir, 'files', key) try: assert(dir_to_open == os.path.abspath(dir_to_open)) assert(dir_to_open.startswith(CFG_WEBSUBMIT_STORAGEDIR)) except AssertionError: register_exception(req=req, prefix='curdir="%s", key="%s"' % (curdir, key)) return warningMsg(_("Invalid parameters"), req, c, ln) if not os.path.exists(dir_to_open): try: os.makedirs(dir_to_open) except: register_exception(req=req, alert_admin=True) return warningMsg(_("Cannot create submission directory. The administrator has been alerted."), req, c, ln) filename = formfields.filename ## Before saving the file to disc, wash the filename (in particular ## washing away UNIX and Windows (e.g. 
DFS) paths): filename = os.path.basename(filename.split('\\')[-1]) filename = filename.strip() if filename != "": # This may be dangerous if the file size is bigger than the available memory data = formfields.file.read() fp = open(os.path.join(dir_to_open, filename), "w") fp.write(data) fp.close() fp = open(os.path.join(curdir, "lastuploadedfile"), "w") fp.write(filename) fp.close() fp = open(file_to_open, "w") fp.write(filename) fp.close() else: return warningMsg(_("No file uploaded?"), req, c, ln) ## if the found field is the reference of the document ## we save this value in the "journal of submissions" if uid_email != "" and uid_email != "guest": if key == edsrn: update_submission_reference_in_log(doctype, access, uid_email, value) ## get the document type's long-name: doctype_lname = get_longname_of_doctype(doctype) if doctype_lname is not None: ## Got the doctype long-name: replace spaces with HTML chars: docname = doctype_lname.replace(" ", "&nbsp;") else: ## Unknown document type: return warningMsg(_("Unknown document type"), req, c, ln) ## get the action's long-name: actname = get_longname_of_action(act) if actname is None: ## Unknown action: return warningMsg(_("Unknown action"), req, c, ln) ## Get the number of pages for this submission: subname = "%s%s" % (act, doctype) num_submission_pages = get_num_pages_of_submission(subname) if num_submission_pages is not None: nbpages = num_submission_pages else: ## Unable to determine the number of pages for this submission: return warningMsg(_("Unable to determine the number of submission pages."), \ req, CFG_SITE_NAME, ln) ## Determine whether the action is finished ## (ie there are no other steps after the current one): last_step = function_step_is_last(doctype, act, step) next_action = '' ## The next action to be proposed to the user # Prints the action details, returning the mandatory score action_score = action_details(doctype, act) current_level = get_level(doctype, act) # Calls all the function's actions
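print_function_calls, invoked just below, runs in order the WebSubmit functions configured for this doctype/action/step; their output is gathered into function_content, and a function may abort the chain with a Stop-style exception whose value is returned to the user instead. The control flow can be sketched independently of the real function registry — all names and behaviours below are simplified assumptions, not the engine's actual API:

```python
class FunctionStop(Exception):
    # Stand-in for InvenioWebSubmitFunctionStop: carries the content
    # (currently JavaScript) to show instead of the remaining output.
    def __init__(self, value):
        Exception.__init__(self, value)
        self.value = value

def run_step_functions(functions):
    # Execute each configured function in order, concatenating whatever
    # fragments they return; a FunctionStop short-circuits the chain.
    content = ""
    try:
        for func in functions:
            content += func() or ""
    except FunctionStop as stop:
        return stop.value
    return content
```

An InvenioWebSubmitFunctionError, by contrast, is not absorbed like this: the engine registers the exception, alerts the administrators, and abandons the step entirely.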
function_content = '' try: ## Handle the execution of the functions for this ## submission/step: start_time = time.time() function_content = print_function_calls(req=req, doctype=doctype, action=act, step=step, form=form, start_time=start_time, ln=ln) except InvenioWebSubmitFunctionError, e: register_exception(req=req, alert_admin=True, prefix='doctype="%s", action="%s", step="%s", form="%s", start_time="%s"' % (doctype, act, step, form, start_time)) ## There was a serious function-error. Execution ends. return warningMsg(_("A serious function-error has been encountered. Administrators have been alerted. <br /><em>Please note that this might be due to wrong characters inserted into the form</em> (e.g. by copy and pasting some text from a PDF file)."), req, c, ln) except InvenioWebSubmitFunctionStop, e: ## For one reason or another, one of the functions has determined that ## the data-processing phase (i.e. the functions execution) should be ## halted and the user should be returned to the form interface once ## more. (NOTE: Redirecting the user to the Web-form interface is ## currently done using JavaScript. The "InvenioWebSubmitFunctionStop" ## exception contains a "value" string, which is effectively JavaScript ## - probably an alert box and a form that is submitted). **THIS WILL ## CHANGE IN THE FUTURE WHEN JavaScript IS REMOVED!** if e.value is not None: function_content = e.value else: function_content = e else: ## No function exceptions (InvenioWebSubmitFunctionStop, ## InvenioWebSubmitFunctionError) were raised by the functions.
Propose ## the next action (if applicable), and log the submission as finished: ## If the action was mandatory we propose the next ## mandatory action (if any) if action_score != -1 and last_step == 1: next_action = Propose_Next_Action(doctype, \ action_score, \ access, \ current_level, \ indir) ## If we are in the last step of an action, we can update ## the "journal of submissions" if last_step == 1: if uid_email != "" and uid_email != "guest" and rn != "": ## update the "journal of submission": ## Does the submission already exist in the log? submission_exists = \ submission_exists_in_log(doctype, act, access, uid_email) if submission_exists == 1: ## update the rn and status to finished for this submission ## in the log: update_submission_reference_and_status_in_log(doctype, \ act, \ access, \ uid_email, \ rn, \ "finished") else: ## Submission doesn't exist in log - create it: log_new_completed_submission(doctype, \ act, \ access, \ uid_email, \ rn) ## Having executed the functions, create the page that will be displayed ## to the user: t = websubmit_templates.tmpl_page_endaction( ln = ln, # these fields are necessary for the navigation nextPg = nextPg, startPg = startPg, access = access, curpage = curpage, nbPg = nbPg, nbpages = nbpages, doctype = doctype, act = act, docname = docname, actname = actname, mainmenu = mainmenu, finished = finished, function_content = function_content, next_action = next_action, ) if not finished: t += websubmit_templates.tmpl_page_do_not_leave_submission_js(ln) # start display: req.content_type = "text/html" req.send_http_header() p_navtrail = '<a href="/submit?ln='+ln+'" class="navtrail">' + _("Submit") +\ """</a> > <a href="/submit?doctype=%(doctype)s&ln=%(ln)s" class="navtrail">%(docname)s</a>""" % { 'doctype' : quote_plus(doctype), 'docname' : docname, 'ln' : ln, } return page(title= actname, body = t, navtrail = p_navtrail, description="submit documents", keywords="submit", uid = uid, language = ln, req = req, 
navmenuid='submit') def home(req, c=CFG_SITE_NAME, ln=CFG_SITE_LANG): """This function generates the WebSubmit "home page". Basically, this page contains a list of submission-collections in WebSubmit, and gives links to the various document-type submissions. Document-types only appear on this page when they have been connected to a submission-collection in WebSubmit. @param req: (apache request object) @param c: (string) - defaults to CFG_SITE_NAME @param ln: (string) - The CDS Invenio interface language of choice. Defaults to CFG_SITE_LANG (the default language of the installation). @return: (string) - the Web page to be displayed. """ ln = wash_language(ln) # get user ID: try: uid = getUid(req) except Error, e: return errorMsg(e, req, c, ln) # load the right message language _ = gettext_set_language(ln) user_info = collect_user_info(req) finaltext = websubmit_templates.tmpl_submit_home_page( ln = ln, catalogues = makeCataloguesTable(user_info, ln) ) return page(title=_("Submit"), body=finaltext, navtrail=[], description="submit documents", keywords="submit", uid=uid, language=ln, req=req, navmenuid='submit' ) def makeCataloguesTable(user_info, ln=CFG_SITE_LANG): """Build the 'catalogues' (submission-collections) tree for the WebSubmit home-page. This tree contains the links to the various document types in WebSubmit. @param user_info: (dict) - the user information in order to decide whether to display a submission. @param ln: (string) - the language of the interface. (defaults to 'CFG_SITE_LANG'). @return: (string) - the submission-collections tree. """ text = "" catalogues = [] ## Get the submission-collections attached at the top level ## of the submission-collection tree: top_level_collctns = get_collection_children_of_submission_collection(0) if len(top_level_collctns) != 0: ## There are submission-collections attached to the top level.
## retrieve their details for displaying: for child_collctn in top_level_collctns: catalogues.append(getCatalogueBranch(child_collctn[0], 1, user_info)) text = websubmit_templates.tmpl_submit_home_catalogs( ln=ln, catalogs=catalogues ) else: text = websubmit_templates.tmpl_submit_home_catalog_no_content(ln=ln) return text def getCatalogueBranch(id_father, level, user_info): """Build up a given branch of the submission-collection tree. I.e. given a parent submission-collection ID, build up the tree below it. This tree will include doctype-children, as well as other submission- collections and their children. Finally, return the branch as a dictionary. @param id_father: (integer) - the ID of the submission-collection from which to begin building the branch. @param level: (integer) - the level of the current submission- collection branch. @param user_info: (dict) - the user information in order to decide whether to display a submission. @return: (dictionary) - the branch and its sub-branches. """ elem = {} ## The dictionary to contain this branch of the tree. 
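getCatalogueBranch recurses over the submission-collection tree: each node collects its authorised doctype children under 'docs' and its recursively built sub-collections under 'sons'. The shape of that recursion can be shown with an in-memory stand-in for the DB accessor functions — all the data and names below are hypothetical illustrations, not real collections:

```python
# Hypothetical stand-ins for the DB accessors used by getCatalogueBranch:
COLLECTION_CHILDREN = {0: [1], 1: [2], 2: []}        # collection id -> sub-collection ids
COLLECTION_NAMES = {1: "Articles & Preprints", 2: "Theses"}
DOCTYPE_CHILDREN = {1: ["DEMOART"], 2: ["DEMOTHE"]}  # collection id -> doctype ids

def catalogue_branch(id_father, level):
    # Build one branch of the tree: name, level, doctype leaves ('docs')
    # and recursively built sub-collection branches ('sons').
    return {
        'id': id_father,
        'name': COLLECTION_NAMES.get(id_father, ""),
        'level': level,
        'docs': list(DOCTYPE_CHILDREN.get(id_father, [])),
        'sons': [catalogue_branch(child, level + 1)
                 for child in COLLECTION_CHILDREN.get(id_father, [])],
    }
```

The real function additionally filters 'docs' through acc_authorize_action, so a doctype leaf only appears for users allowed to submit it.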
## First, get the submission-collection-details: collctn_name = get_submission_collection_name(id_father) if collctn_name is not None: ## Got the submission-collection's name: elem['name'] = collctn_name else: ## The submission-collection is unknown to the DB ## set its name as empty: elem['name'] = "" elem['id'] = id_father elem['level'] = level ## Now get details of the doctype-children of this ## submission-collection: elem['docs'] = [] ## List to hold the doctype-children ## of the submission-collection doctype_children = \ get_doctype_children_of_submission_collection(id_father) for child_doctype in doctype_children: if acc_authorize_action(user_info, 'submit', authorized_if_no_roles=True, doctype=child_doctype[0])[0] == 0: elem['docs'].append(getDoctypeBranch(child_doctype[0])) ## Now, get the collection-children of this submission-collection: elem['sons'] = [] collctn_children = \ get_collection_children_of_submission_collection(id_father) for child_collctn in collctn_children: elem['sons'].append(getCatalogueBranch(child_collctn[0], level + 1, user_info)) ## Now return this branch of the built-up 'collection-tree': return elem def getDoctypeBranch(doctype): """Create a document-type 'leaf-node' for the submission-collections tree. Basically, this leaf is a dictionary containing the name and ID of the document-type submission to which it links. @param doctype: (string) - the ID of the document type. @return: (dictionary) - the document-type 'leaf node'. Contains the following values: + id: (string) - the document-type ID. + name: (string) - the (long) name of the document-type.
""" ldocname = get_longname_of_doctype(doctype) if ldocname is None: ldocname = "Unknown Document Type" return { 'id' : doctype, 'name' : ldocname, } def displayCatalogueBranch(id_father, level, catalogues): text = "" collctn_name = get_submission_collection_name(id_father) if collctn_name is None: ## If this submission-collection wasn't known in the DB, ## give it the name "Unknown Submission-Collection" to ## avoid errors: collctn_name = "Unknown Submission-Collection" ## Now, create the display for this submission-collection: if level == 1: text = "<LI><font size=\"+1\"><strong>%s</strong></font>\n" \ % collctn_name else: ## TODO: These are the same (and the if is ugly.) Why? if level == 2: text = "<LI>%s\n" % collctn_name else: if level > 2: text = "<LI>%s\n" % collctn_name ## Now display the children document-types that are attached ## to this submission-collection: ## First, get the children: doctype_children = get_doctype_children_of_submission_collection(id_father) collctn_children = get_collection_children_of_submission_collection(id_father) if len(doctype_children) > 0 or len(collctn_children) > 0: ## There is something to display, so open a list: text = text + "<UL>\n" ## First, add the doctype leaves of this branch: for child_doctype in doctype_children: ## Add the doctype 'leaf-node': text = text + displayDoctypeBranch(child_doctype[0], catalogues) ## Now add the submission-collection sub-branches: for child_collctn in collctn_children: catalogues.append(child_collctn[0]) text = text + displayCatalogueBranch(child_collctn[0], level+1, catalogues) ## Finally, close up the list if there were nodes to display ## at this branch: if len(doctype_children) > 0 or len(collctn_children) > 0: text = text + "</UL>\n" return text def displayDoctypeBranch(doctype, catalogues): text = "" ldocname = get_longname_of_doctype(doctype) if ldocname is None: ldocname = "Unknown Document Type" text = "<LI><a href=\"\" onmouseover=\"javascript:" \ 
"popUpTextWindow('%s',true,event);\" onmouseout" \ "=\"javascript:popUpTextWindow('%s',false,event);\" " \ "onClick=\"document.forms[0].doctype.value='%s';" \ "document.forms[0].submit();return false;\">%s</a>\n" \ % (doctype, doctype, doctype, ldocname) return text def action(req, c=CFG_SITE_NAME, ln=CFG_SITE_LANG, doctype=""): # load the right message language _ = gettext_set_language(ln) nbCateg = 0 snameCateg = [] lnameCateg = [] actionShortDesc = [] indir = [] actionbutton = [] statustext = [] t = "" ln = wash_language(ln) # get user ID: try: uid = getUid(req) uid_email = get_email(uid) except Error, e: return errorMsg(e, req, c, ln) # parse database to get all data ## first, get the list of categories doctype_categs = get_categories_of_doctype(doctype) for doctype_categ in doctype_categs: nbCateg = nbCateg+1 snameCateg.append(doctype_categ[0]) lnameCateg.append(doctype_categ[1]) ## Now get the details of the document type: doctype_details = get_doctype_details(doctype) if doctype_details is None: ## Doctype doesn't exist - raise error: return warningMsg(_("Unable to find document type: %s") % escape(str(doctype)), req, c, ln) else: docFullDesc = doctype_details[0] # Also update the doctype as returned by the database, since # it might have a different case (e.g.
DemOJrN->demoJRN) doctype = docShortDesc = doctype_details[1] description = doctype_details[4] ## Get the details of the actions supported by this document-type: doctype_actions = get_actions_on_submission_page_for_doctype(doctype) for doctype_action in doctype_actions: ## Get the details of this action: action_details = get_action_details(doctype_action[0]) if action_details is not None: actionShortDesc.append(doctype_action[0]) indir.append(action_details[1]) actionbutton.append(action_details[4]) statustext.append(action_details[5]) ## Send the gathered information to the template so that the doctype's ## home-page can be displayed: t = websubmit_templates.tmpl_action_page( ln=ln, uid=uid, guest=(uid_email == "" or uid_email == "guest"), pid = os.getpid(), now = time.time(), doctype = doctype, description = description, docfulldesc = docFullDesc, snameCateg = snameCateg, lnameCateg = lnameCateg, actionShortDesc = actionShortDesc, indir = indir, # actionbutton = actionbutton, statustext = statustext, ) p_navtrail = """<a href="/submit?ln=%(ln)s" class="navtrail">%(submit)s</a>""" % {'submit' : _("Submit"), 'ln' : ln} return page(title = docFullDesc, body=t, navtrail=p_navtrail, description="submit documents", keywords="submit", uid=uid, language=ln, req=req, navmenuid='submit' ) def Request_Print(m, txt): """The arguments to this function are the display mode (m) and the text to be displayed (txt). If the argument mode is 'A' then the text is unconditionally echoed. m can also take values S (Supervisor Mode) and U (User Mode). In these circumstances txt is only echoed if the argument mode is the same as the current mode """ global dismode if m == "A" or m == dismode: return txt else: return "" def Evaluate_Parameter (field, doctype): # Returns the literal value of the parameter. Assumes that the value is # uniquely determined by the doctype, i.e.
doctype is the primary key in # the table # If the table name is not null, evaluate the parameter ## TODO: The above comment looks like nonsense? This ## function only seems to get the values of parameters ## from the db... ## Get the value for the parameter: param_val = get_parameter_value_for_doctype(doctype, field) if param_val is None: ## Couldn't find a value for this parameter for this doctype. ## Instead, try with the default doctype (DEF): param_val = get_parameter_value_for_doctype("DEF", field) if param_val is None: ## There was no value for the parameter for the default doctype. ## Nothing can be done about it - return an empty string: return "" else: ## There was some kind of value for the parameter; return it: return param_val def Get_Parameters (function, doctype): """For a given function of a given document type, a dictionary of the parameter names and values is returned. @param function: (string) - the name of the function for which the parameters are to be retrieved. @param doctype: (string) - the ID of the document type. @return: (dictionary) - of the parameters of the function. Keyed by the parameter name, values are of course the parameter values. """ parray = {} ## Get the names of the parameters expected by this function: func_params = get_parameters_of_function(function) for func_param in func_params: ## For each of the parameters, get its value for this document- ## type and add it into the dictionary of parameters: parameter = func_param[0] parray[parameter] = Evaluate_Parameter (parameter, doctype) return parray def get_level(doctype, action): """Get the level of a given submission. If unknown, return 0 as the level. @param doctype: (string) - the ID of the document type. @param action: (string) - the ID of the action. @return: (integer) - the level of the submission; 0 otherwise.
""" subm_details = get_details_of_submission(doctype, action) if subm_details is not None: ## Return the level of this action subm_level = subm_details[9] try: int(subm_level) except ValueError: return 0 else: return subm_level else: return 0 def action_details (doctype, action): # Prints whether the action is mandatory or optional. The score of the # action is returned (-1 if the action was optional) subm_details = get_details_of_submission(doctype, action) if subm_details is not None: if subm_details[9] != "0": ## This action is mandatory; return the score: return subm_details[10] else: return -1 else: return -1 def print_function_calls (req, doctype, action, step, form, start_time, ln=CFG_SITE_LANG): # Calls the functions required by an "action" action on a "doctype" document # In supervisor mode, a table of the function calls is produced global htdocsdir,CFG_WEBSUBMIT_STORAGEDIR,access,CFG_PYLIBDIR,dismode user_info = collect_user_info(req) # load the right message language _ = gettext_set_language(ln) t = "" ## Get the list of functions to be called funcs_to_call = get_functions_for_submission_step(doctype, action, step) ## If no functions are found at this step for this doctype, ## get the functions for the DEF(ault) doctype: if len(funcs_to_call) == 0: funcs_to_call = get_functions_for_submission_step("DEF", action, step) if len(funcs_to_call) > 0: # while there are functions left... 
functions = [] for function in funcs_to_call: function_name = function[0] function_score = function[1] currfunction = { 'name' : function_name, 'score' : function_score, 'error' : 0, 'text' : '', } if os.path.exists("%s/invenio/websubmit_functions/%s.py" % (CFG_PYLIBDIR, function_name)): # import the function itself #function = getattr(invenio.websubmit_functions, function_name) execfile("%s/invenio/websubmit_functions/%s.py" % (CFG_PYLIBDIR, function_name), globals()) if not globals().has_key(function_name): currfunction['error'] = 1 else: function = globals()[function_name] # Evaluate the parameters, and place them in an array parameters = Get_Parameters(function_name, doctype) # Call function: log_function(curdir, "Start %s" % function_name, start_time) try: try: ## Attempt to call the function with 4 arguments: ## ("parameters", "curdir" and "form" as usual), ## and "user_info" - the dictionary of user ## information: ## ## Note: The function should always be called with ## these keyword arguments because the "TypeError" ## except clause checks for a specific mention of ## the 'user_info' keyword argument when a legacy ## function (one that accepts only 'parameters', ## 'curdir' and 'form') has been called and if ## the error string doesn't contain this, ## the TypeError will be considered as something ## that was incorrectly handled in the function and ## will be propagated as an ## InvenioWebSubmitFunctionError instead of the ## function being called again with the legacy 3 ## arguments. func_returnval = function(parameters=parameters, \ curdir=curdir, \ form=form, \ user_info=user_info) except TypeError, err: ## If the error contains the string "got an ## unexpected keyword argument", it means that the ## function doesn't accept the "user_info" ## argument. Test for this: if "got an unexpected keyword argument 'user_info'" in \ str(err).lower(): ## As expected, the function doesn't accept ## the user_info keyword argument.
Call it ## again with the legacy 3 arguments ## (parameters, curdir, form): func_returnval = \ function(parameters=parameters, \ curdir=curdir, \ form=form) else: ## An unexpected "TypeError" was caught. ## It looks as though the function itself didn't ## handle something correctly. ## Convert this error into an ## InvenioWebSubmitFunctionError and raise it: msg = "Unhandled TypeError caught when " \ "calling [%s] WebSubmit function: " \ "[%s]" % (function_name, str(err)) raise InvenioWebSubmitFunctionError(msg) except InvenioWebSubmitFunctionWarning, err: ## There was an unexpected behaviour during the ## execution. Log the message into the function's log ## and go to the next function log_function(curdir, "***Warning*** from %s: %s" \ % (function_name, str(err)), start_time) ## Reset "func_returnval" to None: func_returnval = None log_function(curdir, "End %s" % function_name, start_time) if func_returnval is not None: ## Append the returned value as a string: currfunction['text'] = str(func_returnval) else: ## The function returned None. Don't keep that value as ## the currfunction->text. Replace it with the empty ## string.
currfunction['text'] = "" else: currfunction['error'] = 1 functions.append(currfunction) t = websubmit_templates.tmpl_function_output( ln = ln, display_on = (dismode == 'S'), action = action, doctype = doctype, step = step, functions = functions, ) else : if dismode == 'S': t = "<br /><br /><b>" + _("The chosen action is not supported by the document type.") + "</b>" return t def Propose_Next_Action (doctype, action_score, access, currentlevel, indir, ln=CFG_SITE_LANG): global machine, CFG_WEBSUBMIT_STORAGEDIR, act, rn t = "" next_submissions = \ get_submissions_at_level_X_with_score_above_N(doctype, currentlevel, action_score) if len(next_submissions) > 0: actions = [] first_score = next_submissions[0][10] for action in next_submissions: if action[10] == first_score: ## Get the submission directory of this action: nextdir = get_storage_directory_of_action(action[1]) if nextdir is None: nextdir = "" curraction = { 'page' : action[11], 'action' : action[1], 'doctype' : doctype, 'nextdir' : nextdir, 'access' : access, 'indir' : indir, 'name' : action[12], } actions.append(curraction) t = websubmit_templates.tmpl_next_action( ln = ln, actions = actions, ) return t def specialchars(text): text = string.replace(text, "“", "\042"); text = string.replace(text, "”", "\042"); text = string.replace(text, "’", "\047"); text = string.replace(text, "—", "\055"); text = string.replace(text, "…", "\056\056\056"); return text def log_function(curdir, message, start_time, filename="function_log"): """Write the message and the time elapsed since start_time into a log file. @param curdir: (string) path to the destination dir @param message: (string) message to write into the file @param start_time: (float) time to compute from @param filename: (string) name of log file """ time_lap = "%.3f" % (time.time() - start_time) if os.access(curdir, os.F_OK|os.W_OK): fd = open("%s/%s" % (curdir, filename), "a+") fd.write("""%s --- %s\n""" % (message, time_lap)) fd.close() ##
FIXME: Duplicated def errorMsg(title, req, c=CFG_SITE_NAME, ln=CFG_SITE_LANG): # load the right message language _ = gettext_set_language(ln) return page(title = _("Error"), body = create_error_box(req, title=title, verbose=0, ln=ln), description="%s - Internal Error" % c, keywords="%s, Internal Error" % c, uid = getUid(req), language=ln, req=req, navmenuid='submit') def warningMsg(title, req, c=CFG_SITE_NAME, ln=CFG_SITE_LANG): # load the right message language _ = gettext_set_language(ln) return page(title = _("Warning"), body = title, description="%s - Warning" % c, keywords="%s, Warning" % c, uid = getUid(req), language=ln, req=req, navmenuid='submit') diff --git a/modules/websubmit/lib/websubmit_webinterface.py b/modules/websubmit/lib/websubmit_webinterface.py index f033be704..44ad65392 100644 --- a/modules/websubmit/lib/websubmit_webinterface.py +++ b/modules/websubmit/lib/websubmit_webinterface.py @@ -1,747 +1,743 @@ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
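The keyword-argument fallback that print_function_calls() performs above (call with the 4-argument signature; retry with the legacy 3 arguments only when the TypeError specifically names `user_info`) can be sketched on its own. The helper name and the two sample functions below are illustrative only, and the sketch uses modern `except ... as` syntax rather than the module's Python 2 style:

```python
def call_websubmit_function(function, parameters, curdir, form, user_info):
    """Call a WebSubmit function, falling back to the legacy signature.

    A TypeError that complains about the 'user_info' keyword means the
    function is a legacy one (parameters, curdir, form only); any other
    TypeError is a genuine bug inside the function and is re-raised.
    """
    try:
        return function(parameters=parameters, curdir=curdir,
                        form=form, user_info=user_info)
    except TypeError as err:
        if "got an unexpected keyword argument 'user_info'" in str(err).lower():
            # Legacy function: retry without user_info.
            return function(parameters=parameters, curdir=curdir, form=form)
        raise  # unrelated TypeError from inside the function itself

def legacy_func(parameters, curdir, form):
    return "legacy"

def modern_func(parameters, curdir, form, user_info):
    return "modern"
```

Inspecting the exception message is fragile but was the only way to distinguish "wrong signature" from "TypeError raised inside the function" without introspecting every plugin's argument list.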
__lastupdated__ = """$Date$""" __revision__ = "$Id$" import os import time import cgi -try: - from mod_python import apache -except ImportError: - pass from urllib import unquote, urlencode from invenio.config import \ CFG_ACCESS_CONTROL_LEVEL_SITE, \ CFG_SITE_LANG, \ CFG_SITE_NAME, \ CFG_SITE_NAME_INTL, \ CFG_SITE_URL, \ CFG_SITE_SECURE_URL, \ CFG_WEBSUBMIT_STORAGEDIR, \ CFG_PREFIX +from invenio import webinterface_handler_wsgi_utils as apache from invenio.dbquery import run_sql from invenio.access_control_config import VIEWRESTRCOLL from invenio.access_control_mailcookie import mail_cookie_create_authorize_action from invenio.access_control_engine import acc_authorize_action from invenio.webpage import page, create_error_box, pageheaderonly, \ pagefooteronly from invenio.webuser import getUid, page_not_authorized, collect_user_info, isGuestUser from invenio.websubmit_config import * from invenio.webinterface_handler import wash_urlargd, WebInterfaceDirectory from invenio.urlutils import make_canonical_urlargd, redirect_to_url from invenio.messages import gettext_set_language from invenio.search_engine import \ guess_primary_collection_of_a_record, \ get_colID, \ create_navtrail_links, check_user_can_view_record from invenio.bibdocfile import BibRecDocs, normalize_format, file_strip_ext, \ stream_restricted_icon, BibDoc, InvenioWebSubmitFileError, stream_file from invenio.errorlib import register_exception from invenio.websubmit_icon_creator import create_icon, InvenioWebSubmitIconCreatorError import invenio.template websubmit_templates = invenio.template.load('websubmit') from invenio.websearchadminlib import get_detailed_page_tabs import invenio.template webstyle_templates = invenio.template.load('webstyle') websearch_templates = invenio.template.load('websearch') try: from invenio.fckeditor_invenio_connector import FCKeditorConnectorInvenio fckeditor_available = True except ImportError, e: fckeditor_available = False class 
WebInterfaceFilesPages(WebInterfaceDirectory): def __init__(self,recid): self.recid = recid def _lookup(self, component, path): # after /record/<recid>/files/ every part is used as the file # name filename = unquote(component) def getfile(req, form): args = wash_urlargd(form, websubmit_templates.files_default_urlargd) ln = args['ln'] _ = gettext_set_language(ln) uid = getUid(req) user_info = collect_user_info(req) verbose = args['verbose'] if verbose >= 1 and acc_authorize_action(user_info, 'fulltext')[0] != 0: # Only SuperUser can see all the details! verbose = 0 if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE > 1: return page_not_authorized(req, "/record/%s" % self.recid, navmenuid='submit') (auth_code, auth_msg) = check_user_can_view_record(user_info, self.recid) if auth_code and user_info['email'] == 'guest' and not user_info['apache_user']: cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)}) target = '/youraccount/login' + \ make_canonical_urlargd({'action': cookie, 'ln' : ln, 'referer' : \ CFG_SITE_URL + user_info['uri']}, {}) return redirect_to_url(req, target) elif auth_code: return page_not_authorized(req, "../", \ text = auth_msg) readonly = CFG_ACCESS_CONTROL_LEVEL_SITE == 1 # From now on: either the user provided a specific file # name (and a possible version), or we return a list of # all the available files. In no case are the docids # visible. 
try: bibarchive = BibRecDocs(self.recid) except InvenioWebSubmitFileError, e: register_exception(req=req, alert_admin=True) msg = "<p>%s</p><p>%s</p>" % ( _("The system has encountered an error in retrieving the list of files for this document."), _("The error has been logged and will be taken into consideration as soon as possible.")) return print_warning(msg) docname = '' format = '' version = '' if filename: # We know the complete file name, guess which docid it # refers to ## TODO: Change the extension system according to ext.py from setlink ## and have a uniform extension mechanism... docname = file_strip_ext(filename) format = filename[len(docname):] if format and format[0] != '.': format = '.' + format else: docname = args['docname'] if not format: format = args['format'] if not version: version = args['version'] # version could be either empty, or all or an integer try: int(version) except ValueError: if version != 'all': version = '' display_hidden = acc_authorize_action(user_info, 'fulltext')[0] == 0 if version != 'all': # search this filename in the complete list of files for doc in bibarchive.list_bibdocs(): if docname == doc.get_docname(): try: docfile = doc.get_file(format, version) except InvenioWebSubmitFileError, msg: register_exception(req=req, alert_admin=True) if docfile.get_status() == '': # The file is not restricted, let's check for # collection restriction then. (auth_code, auth_message) = check_user_can_view_record(user_info, self.recid) if auth_code: return warningMsg(_("The collection to which this file belongs is restricted: ") + auth_message, req, CFG_SITE_NAME, ln) else: # The file is probably restricted on its own.
# Let's check for proper authorization then (auth_code, auth_message) = docfile.is_restricted(req) if auth_code != 0: return warningMsg(_("This file is restricted: ") + auth_message, req, CFG_SITE_NAME, ln) if display_hidden or not docfile.hidden_p(): if not readonly: - ip = str(req.get_remote_host(apache.REMOTE_NOLOOKUP)) + ip = str(req.remote_ip) res = doc.register_download(ip, version, format, uid) try: return docfile.stream(req) except InvenioWebSubmitFileError, msg: register_exception(req=req, alert_admin=True) return warningMsg(_("An error has happened in trying to stream the requested file."), req, CFG_SITE_NAME, ln) else: warn = print_warning(_("The requested file is hidden and you don't have the proper rights to access it.")) elif doc.get_icon() is not None and doc.get_icon().docname == file_strip_ext(filename): icon = doc.get_icon() try: iconfile = icon.get_file(format, version) except InvenioWebSubmitFileError, msg: register_exception(req=req, alert_admin=True) return warningMsg(_("An error has happened in trying to retrieve the corresponding icon."), req, CFG_SITE_NAME, ln) if iconfile.get_status() == '': # The file is not restricted, let's check for # collection restriction then. (auth_code, auth_message) = check_user_can_view_record(user_info, self.recid) if auth_code: return stream_restricted_icon(req) else: # The file is probably restricted on its own.
# Let's check for proper authorization then (auth_code, auth_message) = iconfile.is_restricted(req) if auth_code != 0: return stream_restricted_icon(req) if not readonly: - ip = str(req.get_remote_host(apache.REMOTE_NOLOOKUP)) + ip = str(req.remote_ip) res = doc.register_download(ip, version, format, uid) try: return iconfile.stream(req) except InvenioWebSubmitFileError, msg: register_exception(req=req, alert_admin=True) return warningMsg(_("An error has happened in trying to stream the corresponding icon."), req, CFG_SITE_NAME, ln) if docname and format and display_hidden: req.status = apache.HTTP_NOT_FOUND warn = print_warning(_("Requested file does not seem to exist.")) else: warn = '' filelist = bibarchive.display("", version, ln=ln, verbose=verbose, display_hidden=display_hidden) t = warn + websubmit_templates.tmpl_filelist( ln=ln, recid=self.recid, docname=args['docname'], version=version, filelist=filelist) cc = guess_primary_collection_of_a_record(self.recid) unordered_tabs = get_detailed_page_tabs(get_colID(cc), self.recid, ln) ordered_tabs_id = [(tab_id, values['order']) for (tab_id, values) in unordered_tabs.iteritems()] ordered_tabs_id.sort(lambda x,y: cmp(x[1],y[1])) link_ln = '' if ln != CFG_SITE_LANG: link_ln = '?ln=%s' % ln tabs = [(unordered_tabs[tab_id]['label'], \ '%s/record/%s/%s%s' % (CFG_SITE_URL, self.recid, tab_id, link_ln), \ tab_id == 'files', unordered_tabs[tab_id]['enabled']) \ for (tab_id, order) in ordered_tabs_id if unordered_tabs[tab_id]['visible'] == True] top = webstyle_templates.detailed_record_container_top(self.recid, tabs, args['ln']) bottom = webstyle_templates.detailed_record_container_bottom(self.recid, tabs, args['ln']) title, description, keywords = websearch_templates.tmpl_record_page_header_content(req, self.recid, args['ln']) return pageheaderonly(title=title, navtrail=create_navtrail_links(cc=cc, aas=0, ln=ln) + \ ''' > <a class="navtrail" href="%s/record/%s">%s</a> > %s''' % \ (CFG_SITE_URL, self.recid, title, 
_("Access to Fulltext")), description="", keywords="keywords", uid=uid, language=ln, req=req, navmenuid='search', navtrail_append_title_p=0) + \ websearch_templates.tmpl_search_pagestart(ln) + \ top + t + bottom + \ websearch_templates.tmpl_search_pageend(ln) + \ pagefooteronly(lastupdated=__lastupdated__, language=ln, req=req) return getfile, [] def __call__(self, req, form): """Called in case of URLs like /record/123/files without trailing slash. """ args = wash_urlargd(form, websubmit_templates.files_default_urlargd) ln = args['ln'] link_ln = '' if ln != CFG_SITE_LANG: link_ln = '?ln=%s' % ln return redirect_to_url(req, '%s/record/%s/files/%s' % (CFG_SITE_URL, self.recid, link_ln)) def websubmit_legacy_getfile(req, form): """ Handle legacy /getfile.py URLs """ args = wash_urlargd(form, { 'recid': (int, 0), 'docid': (int, 0), 'version': (str, ''), 'name': (str, ''), 'format': (str, ''), 'ln' : (str, CFG_SITE_LANG) }) _ = gettext_set_language(args['ln']) def _getfile_py(req, recid=0, docid=0, version="", name="", format="", ln=CFG_SITE_LANG): if not recid: ## Let's obtain the recid from the docid if docid: try: bibdoc = BibDoc(docid=docid) recid = bibdoc.get_recid() except InvenioWebSubmitFileError, e: return warningMsg(_("An error has happened in trying to retrieve the requested file."), req, CFG_SITE_NAME, ln) else: return warningMsg(_('Not enough information to retrieve the document'), req, CFG_SITE_NAME, ln) else: if not name and docid: ## Let's obtain the name from the docid try: bibdoc = BibDoc(docid) name = bibdoc.get_docname() except InvenioWebSubmitFileError, e: return warningMsg(_("An error has happened in trying to retrieve the requested file."), req, CFG_SITE_NAME, ln) format = normalize_format(format) redirect_to_url(req, '%s/record/%s/files/%s%s?ln=%s%s' % (CFG_SITE_URL, recid, name, format, ln, version and '&version=%s' % version or ''), apache.HTTP_MOVED_PERMANENTLY) return _getfile_py(req, **args) #
-------------------------------------------------- from invenio.websubmit_engine import home, action, interface, endaction class WebInterfaceSubmitPages(WebInterfaceDirectory): _exports = ['summary', 'sub', 'direct', '', 'attachfile'] def attachfile(self, req, form): """ Process requests received from FCKeditor to upload files. If the uploaded file is an image, create an icon version """ if not fckeditor_available: return apache.HTTP_NOT_FOUND if not form.has_key('NewFile') or \ not form.get('type', None) in \ ['File', 'Image', 'Flash', 'Media']: return apache.HTTP_NOT_FOUND uid = getUid(req) # URL where the file can be fetched after upload user_files_path = '%(CFG_SITE_URL)s/submit/getattachedfile/%(uid)s' % \ {'uid': uid, 'CFG_SITE_URL': CFG_SITE_URL} # Path to directory where uploaded files are saved user_files_absolute_path = '%(CFG_PREFIX)s/var/tmp/attachfile/%(uid)s' % \ {'uid': uid, 'CFG_PREFIX': CFG_PREFIX} try: os.makedirs(user_files_absolute_path) except: pass # Create a Connector instance to handle the request conn = FCKeditorConnectorInvenio(form, recid=-1, uid=uid, allowed_commands=['QuickUpload'], allowed_types = ['File', 'Image', 'Flash', 'Media'], user_files_path = user_files_path, user_files_absolute_path = user_files_absolute_path) user_info = collect_user_info(req) (auth_code, auth_msg) = acc_authorize_action(user_info, 'attachsubmissionfile') if user_info['email'] == 'guest' and not user_info['apache_user']: # User is guest: must login prior to upload data = conn.sendUploadResults(1, '', '', 'Please login before uploading file.') elif auth_code: # User cannot submit data = conn.sendUploadResults(1, '', '', 'Sorry, you are not allowed to submit files.') else: # Process the upload and get the response data = conn.doResponse() # At this point, the file has been uploaded. The FCKeditor # submits the image in form['NewFile'].
However, the image # might have been renamed in between by the FCK connector on # the server side, by appending (%04d) at the end of the base # name. Retrieve that file uploaded_file_path = os.path.join(user_files_absolute_path, form['type'].lower(), form['NewFile'].filename) uploaded_file_path = retrieve_most_recent_attached_file(uploaded_file_path) uploaded_file_name = os.path.basename(uploaded_file_path) # Create an icon if form.get('type','') == 'Image': try: (icon_path, icon_name) = create_icon( { 'input-file' : uploaded_file_path, 'icon-name' : os.path.splitext(uploaded_file_name)[0], 'icon-file-format' : os.path.splitext(uploaded_file_name)[1][1:] or 'gif', 'multipage-icon' : False, 'multipage-icon-delay' : 100, 'icon-scale' : "300>", # Resize only if width > 300 'verbosity' : 0, }) # Move original file to /original dir, and replace it with icon file original_user_files_absolute_path = os.path.join(user_files_absolute_path, 'image', 'original') if not os.path.exists(original_user_files_absolute_path): # Create /original dir if needed os.mkdir(original_user_files_absolute_path) os.rename(uploaded_file_path, original_user_files_absolute_path + os.sep + uploaded_file_name) os.rename(icon_path + os.sep + icon_name, uploaded_file_path) except InvenioWebSubmitIconCreatorError, e: pass # Transform the headers into something ok for mod_python for header in conn.headers: if not header is None: if header[0] == 'Content-Type': req.content_type = header[1] else: req.headers_out[header[0]] = header[1] # Send our response req.send_http_header() req.write(data) def _lookup(self, component, path): """ This handler is invoked for the dynamic URLs (for getting and putting attachments) Eg: /submit/getattachedfile/41336978/image/myfigure.png /submit/attachfile/41336978/image/myfigure.png """ if component == 'getattachedfile' and len(path) > 2: uid = path[0] # uid of the submitter file_type = path[1] # file, image, flash or media (as # defined by FCKeditor) if file_type in 
['file', 'image', 'flash', 'media']: file_name = '/'.join(path[2:]) # the filename def answer_get(req, form): """Accessing files attached to submission.""" form['file'] = file_name form['type'] = file_type form['uid'] = uid return self.getattachedfile(req, form) return answer_get, [] # All other cases: file not found return None, [] def getattachedfile(self, req, form): """ Returns a file uploaded to the submission 'drop box' by the FCKeditor. """ argd = wash_urlargd(form, {'file': (str, None), 'type': (str, None), 'uid': (int, 0)}) # Can user view this record, i.e. can user access its # attachments? uid = getUid(req) user_info = collect_user_info(req) if not argd['file'] is None: # Prepare path to file on disk. Normalize the path so that # ../ and other dangerous components are removed. path = os.path.abspath(CFG_PREFIX + '/var/tmp/attachfile/' + \ '/' + str(argd['uid']) + \ '/' + argd['type'] + '/' + argd['file']) # Check that we are really accessing the attachments # directory, for the declared record. if path.startswith(CFG_PREFIX + '/var/tmp/attachfile/') and os.path.exists(path): return stream_file(req, path) # Send error 404 in all other cases return(apache.HTTP_NOT_FOUND) def direct(self, req, form): """Directly redirected to an initialized submission.""" args = wash_urlargd(form, {'sub': (str, ''), 'access' : (str, '')}) sub = args['sub'] access = args['access'] ln = args['ln'] _ = gettext_set_language(ln) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "direct", navmenuid='submit') myQuery = req.args if not sub: return warningMsg(_("Sorry, 'sub' parameter missing..."), req, ln=ln) res = run_sql("SELECT docname,actname FROM sbmIMPLEMENT WHERE subname=%s", (sub,)) if not res: return warningMsg(_("Sorry.
Cannot analyse parameter"), req, ln=ln) else: # get document type doctype = res[0][0] # get action name action = res[0][1] # retrieve other parameter values params = dict(form) # find existing access number if not access: # create 'unique' access number pid = os.getpid() now = time.time() access = "%i_%s" % (now,pid) # retrieve 'dir' value res = run_sql ("SELECT dir FROM sbmACTION WHERE sactname=%s", (action,)) dir = res[0][0] mainmenu = req.headers_in.get('referer') params['access'] = access params['act'] = action params['doctype'] = doctype params['startPg'] = '1' params['mainmenu'] = mainmenu params['ln'] = ln params['indir'] = dir url = "%s/submit?%s" % (CFG_SITE_URL, urlencode(params)) redirect_to_url(req, url) def sub(self, req, form): """DEPRECATED: /submit/sub is deprecated now, so raise email to the admin (but allow submission to continue anyway)""" args = wash_urlargd(form, {'password': (str, '')}) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "../sub/", navmenuid='submit') try: raise DeprecationWarning, 'submit/sub handler has been used. Please use submit/direct. e.g. 
"submit/sub?RN=123@SBIFOO" -> "submit/direct?RN=123&sub=SBIFOO"' except DeprecationWarning: register_exception(req=req, alert_admin=True) ln = args['ln'] _ = gettext_set_language(ln) #DEMOBOO_RN=DEMO-BOOK-2008-001&ln=en&password=1223993532.26572%40APPDEMOBOO params = dict(form) password = args['password'] if password: del params['password'] if "@" in password: params['access'], params['sub'] = password.split('@', 1) else: params['sub'] = password else: args = str(req.args).split('@') if len(args) > 1: params = {'sub' : args[-1]} args = '@'.join(args[:-1]) params.update(cgi.parse_qs(args)) else: return warningMsg(_("Sorry, invalid URL..."), req, ln=ln) url = "%s/submit/direct?%s" % (CFG_SITE_URL, urlencode(params, doseq=True)) redirect_to_url(req, url) def summary(self, req, form): args = wash_urlargd(form, { 'doctype': (str, ''), 'act': (str, ''), 'access': (str, ''), 'indir': (str, '')}) uid = getUid(req) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "../summary", navmenuid='submit') t="" curdir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR, args['indir'], args['doctype'], args['access']) try: assert(curdir == os.path.abspath(curdir)) except AssertionError: register_exception(req=req, alert_admin=True, prefix='Possible cracking attempt: indir="%s", doctype="%s", access="%s"' % (args['indir'], args['doctype'], args['access'])) return warningMsg("Invalid parameters") subname = "%s%s" % (args['act'], args['doctype']) res = run_sql("select sdesc,fidesc,pagenb,level from sbmFIELD where subname=%s " "order by pagenb,fieldnb", (subname,)) nbFields = 0 values = [] for arr in res: if arr[0] != "": val = { 'mandatory' : (arr[3] == 'M'), 'value' : '', 'page' : arr[2], 'name' : arr[0], } if os.path.exists(os.path.join(curdir, arr[1])): fd = open(os.path.join(curdir, arr[1]),"r") value = fd.read() fd.close() value = value.replace("\n"," ") value = value.replace("Select:","") else: value = "" val['value'] = value values.append(val) 
return websubmit_templates.tmpl_submit_summary( ln = args['ln'], values = values, ) def index(self, req, form): args = wash_urlargd(form, { 'c': (str, CFG_SITE_NAME), 'doctype': (str, ''), 'act': (str, ''), 'startPg': (str, "1"), 'access': (str, ''), 'mainmenu': (str, ''), 'fromdir': (str, ''), 'nextPg': (str, ''), 'nbPg': (str, ''), 'curpage': (str, '1'), 'step': (str, '0'), 'mode': (str, 'U'), }) - req.form = form ## Strip whitespace from beginning and end of doctype and action: args["doctype"] = args["doctype"].strip() args["act"] = args["act"].strip() def _index(req, c, ln, doctype, act, startPg, access, mainmenu, fromdir, nextPg, nbPg, curpage, step, mode): uid = getUid(req) if isGuestUser(uid): return redirect_to_url(req, "%s/youraccount/login%s" % ( CFG_SITE_SECURE_URL, make_canonical_urlargd({ 'referer' : CFG_SITE_URL + req.unparsed_uri, 'ln' : args['ln']}, {}))) if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1: return page_not_authorized(req, "../submit", navmenuid='submit') if doctype=="": return home(req,c,ln) elif act=="": return action(req,c,ln,doctype) elif int(step)==0: return interface(req, c, ln, doctype, act, startPg, access, mainmenu, fromdir, nextPg, nbPg, curpage) else: return endaction(req, c, ln, doctype, act, startPg, access,mainmenu, fromdir, nextPg, nbPg, curpage, step, mode) return _index(req, **args) # Answer to both /submit/ and /submit __call__ = index def errorMsg(title, req, c=None, ln=CFG_SITE_LANG): # load the right message language _ = gettext_set_language(ln) if c is None: c = CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME) return page(title = _("Error"), body = create_error_box(req, title=title, verbose=0, ln=ln), description="%s - Internal Error" % c, keywords="%s, Internal Error" % c, uid = getUid(req), language=ln, req=req, navmenuid='submit') def warningMsg(title, req, c=None, ln=CFG_SITE_LANG): # load the right message language _ = gettext_set_language(ln) if c is None: c = CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME) return 
page(title = _("Warning"), body = title, description="%s - Warning" % c, keywords="%s, Warning" % c, uid = getUid(req), language=ln, req=req, navmenuid='submit') def print_warning(msg, type='', prologue='<br />', epilogue='<br />'): """Prints warning message and flushes output.""" if msg: return websubmit_templates.tmpl_print_warning( msg = msg, type = type, prologue = prologue, epilogue = epilogue, ) else: return '' def retrieve_most_recent_attached_file(file_path): """ Retrieve the latest file that has been uploaded with the FCKeditor. This is the only way to retrieve files that the FCKeditor has renamed after the upload. E.g. 'prefix/image.jpg' was uploaded but already existed. FCKeditor silently renamed it to 'prefix/image(1).jpg': >>> retrieve_most_recent_attached_file('prefix/image.jpg') 'prefix/image(1).jpg' """ (base_path, filename) = os.path.split(file_path) base_name = os.path.splitext(filename)[0] file_ext = os.path.splitext(filename)[1][1:] most_recent_filename = filename i = 0 while True: i += 1 possible_filename = "%s(%d).%s" % \ (base_name, i, file_ext) if os.path.exists(base_path + os.sep + possible_filename): most_recent_filename = possible_filename else: break return os.path.join(base_path, most_recent_filename) diff --git a/modules/websubmit/web/admin/websubmitadmin.py b/modules/websubmit/web/admin/websubmitadmin.py index eca68dbe5..1c1e33c6c 100644 --- a/modules/websubmit/web/admin/websubmitadmin.py +++ b/modules/websubmit/web/admin/websubmitadmin.py @@ -1,969 +1,968 @@ # -*- coding: utf-8 -*- ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. 
## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. __revision__ = "$Id$" __lastupdated__ = """$Date$""" import sys -from mod_python import apache from invenio.websubmitadmin_engine import * from invenio.config import CFG_SITE_LANG from invenio.webuser import getUid, page_not_authorized from invenio.webpage import page from invenio.messages import wash_language, gettext_set_language def index(req, ln=CFG_SITE_LANG): """Websubmit Admin home page. Default action: list all WebSubmit document types.""" ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: (body, errors, warnings) = perform_request_list_doctypes() return page(title = "Available WebSubmit Document Types", body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def showall(req, ln=CFG_SITE_LANG): """Placeholder for the showall functionality""" ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: (body, errors, warnings) = perform_request_list_doctypes() return page(title = "Available WebSubmit Document Types", body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, 
errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def doctypelist(req, ln=CFG_SITE_LANG): """List all WebSubmit document types.""" ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: (body, errors, warnings) = perform_request_list_doctypes() return page(title = "Available WebSubmit Document Types", body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def jschecklist(req, ln=CFG_SITE_LANG): """List all WebSubmit JavaScript Checks (checks can be applied to form elements in WebSubmit.)""" ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: (body, errors, warnings) = perform_request_list_jschecks() return page(title = "Available WebSubmit Checking Functions", body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def actionlist(req, ln=CFG_SITE_LANG): """List all WebSubmit actions.""" ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: (body, errors, warnings) = perform_request_list_actions() return page(title = "Available WebSubmit Actions", body = body, navtrail = get_navtrail(ln), uid = uid, 
lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def functionlist(req, ln=CFG_SITE_LANG): """List all WebSubmit FUNCTIONS (Functions do the work of processing a submission)""" ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: (body, errors, warnings) = perform_request_list_functions() return page(title = "Available WebSubmit Functions", body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def elementlist(req, ln=CFG_SITE_LANG): """List all WebSubmit form ELEMENTS (elements are input fields on a WebSubmit form)""" ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: (body, errors, warnings) = perform_request_list_elements() return page(title = "Available WebSubmit Elements", body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def actionadd(req, actid=None, actname=None, working_dir=None, status_text=None, actcommit="", ln=CFG_SITE_LANG): """Add a new action to the WebSubmit database. Web form for action details will be displayed if "actid" and "actname" are empty; else new action will be committed to websubmit. 
@param actid: unique id for new action (if empty, Web form will be displayed) @param actname: name of new action (if empty, Web form will be displayed) @param working_dir: action working directory for WebSubmit @param status_text: status text displayed at end of WebSubmit action @param ln: language @return: page """ ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: # Generate content (title, body, errors, warnings) = perform_request_add_action(actid, actname, working_dir, status_text, actcommit) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def actionedit(req, actid, actname=None, working_dir=None, status_text=None, actcommit="", ln=CFG_SITE_LANG): """Display the details of a WebSubmit action in a Web form so that it can be viewed and/or edited. @param actid: The unique action identifier code. 
@param actname: name of action (if present, action will be updated, else action details will be displayed) @param working_dir: action working directory for websubmit @param status_text: status text displayed at end of websubmit action @param ln: language @return: page """ ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: # Generate content (title, body, errors, warnings) = perform_request_edit_jscheck(chname, chdesc, chcommit) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def jscheckadd(req, chname=None, chdesc=None, chcommit="", ln=CFG_SITE_LANG): """Add a new JavaScript CHECK to the WebSubmit database. Web form for check details will be displayed if "chname" and "chdesc" are empty; else the new check will be committed to WebSubmit. @param chname: unique name/ID for new check (if empty, Web form will be displayed) @param chdesc: description of new JS check (the JavaScript code that is the check.) 
(If empty, Web form will be displayed) @param ln: language @return: page """ ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: # Generate content (title, body, errors, warnings) = perform_request_add_jscheck(chname, chdesc, chcommit) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def jscheckedit(req, chname, chdesc=None, chcommit="", ln=CFG_SITE_LANG): """Display the details of a WebSubmit checking function in a Web form so that it can be viewed and/or edited. @param chname: The unique Check name/identifier code. @param chdesc: The description of the Check (if present, Check will be updated, else Check details will be displayed) @param ln: language @return: page """ ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: # Generate content (title, body, errors, warnings) = perform_request_edit_jscheck(chname, chdesc, chcommit) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def elementadd(req, elname=None, elmarccode=None, eltype=None, elsize=None, elrows=None, elcols=None, elmaxlength=None, \ elval=None, elfidesc=None, elmodifytext=None, elcommit="", ln=CFG_SITE_LANG): """Add a new WebSubmit ELEMENT to the WebSubmit database. 
@param elname: unique name/ID for new element (if empty, Web form will be displayed) @param elmarccode: MARC Code for element @param eltype: type of element. @param elsize: size of element. @param elrows: number of rows in element. @param elcols: number of columns in element. @param elmaxlength: element maximum length. @param elval: element value. @param elfidesc: element description. @param elmodifytext: Modification text for the element. @param elcommit: flag variable used to determine whether to commit element modifications or whether to simply display a form containing element details. @param ln: language @return: page """ ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: # Generate content (title, body, errors, warnings) = perform_request_add_element(elname, elmarccode, eltype, \ elsize, elrows, elcols, elmaxlength, \ elval, elfidesc, elmodifytext, \ elcommit) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def elementedit(req, elname, elmarccode=None, eltype=None, elsize=None, elrows=None, elcols=None, elmaxlength=None, \ elval=None, elfidesc=None, elmodifytext=None, elcommit="", ln=CFG_SITE_LANG): """Display the details of a WebSubmit ELEMENT in a Web form so that it can be viewed and/or edited. @param elname: unique name/ID of the element (if empty, Web form will be displayed) @param elmarccode: MARC Code for element @param eltype: type of element. @param elsize: size of element. @param elrows: number of rows in element. @param elcols: number of columns in element. @param elmaxlength: element maximum length. @param elval: element value. 
@param elfidesc: element description. @param elmodifytext: Modification text for the element. @param elcommit: flag variable used to determine whether to commit element modifications or whether to simply display a form containing element details. @param ln: language @return: page """ ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: # Generate content (title, body, errors, warnings) = perform_request_edit_element(elname, elmarccode, eltype, \ elsize, elrows, elcols, elmaxlength, \ elval, elfidesc, elmodifytext, \ elcommit) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def functionadd(req, funcname=None, funcdescr=None, funcaddcommit="", ln=CFG_SITE_LANG): """Add a new function to WebSubmit""" ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: # Generate content (title, body, errors, warnings) = perform_request_add_function(funcname=funcname, funcdescr=funcdescr, funcaddcommit=funcaddcommit ) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def functionedit(req, funcname=None, funcdescr=None, funceditaddparam=None, funceditaddparamfree=None, \ funceditdelparam=None, funcdescreditcommit="", funcparamdelcommit="", funcparamaddcommit="", ln=CFG_SITE_LANG): """Edit a WebSubmit 
function""" ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: # Generate content (title, body, errors, warnings) = perform_request_edit_function(funcname=funcname, funcdescr=funcdescr, funceditdelparam=funceditdelparam, funceditaddparam=funceditaddparam, funceditaddparamfree=funceditaddparamfree, funcdescreditcommit=funcdescreditcommit, funcparamdelcommit=funcparamdelcommit, funcparamaddcommit=funcparamaddcommit ) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def functionusage(req, funcname, ln=CFG_SITE_LANG): """View the usage cases (document-types and actions) in which a function is used. @param funcname: the function name @param ln: the language @return: a web page """ ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: (body, errors, warnings) = perform_request_function_usage(funcname) return page(title = "WebSubmit Function Usage", body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def doctyperemove(req, doctype="", doctypedelete="", doctypedeleteconfirm="", ln=CFG_SITE_LANG): """Delete a WebSubmit document-type. @param doctype: the unique id of the document type to be deleted @param ln: the interface language @return: HTML page. 
""" ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: (title, body, errors, warnings) = perform_request_remove_doctype(doctype=doctype, doctypedelete=doctypedelete, doctypedeleteconfirm=doctypedeleteconfirm) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def doctypeadd(req, doctype=None, doctypename=None, doctypedescr=None, clonefrom=None, doctypedetailscommit="", ln=CFG_SITE_LANG): """Add a new document type to WebSubmit""" ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: (title, body, errors, warnings) = perform_request_add_doctype(doctype=doctype, doctypename=doctypename, doctypedescr=doctypedescr, clonefrom=clonefrom, doctypedetailscommit=doctypedetailscommit ) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def doctypeconfiguresubmissionpageelements(req, doctype="", action="", pagenum="", movefieldfromposn="", movefieldtoposn="", deletefieldposn="", editfieldposn="", editfieldposncommit="", addfield="", addfieldcommit="", fieldname="", fieldtext="", fieldlevel="", fieldshortdesc="", fieldcheck="", ln=CFG_SITE_LANG): ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is 
authorised to use WebSubmit Admin: (title, body, errors, warnings) = perform_request_configure_doctype_submissionpage_elements(doctype=doctype, action=action, pagenum=pagenum, movefieldfromposn=movefieldfromposn, movefieldtoposn=movefieldtoposn, deletefieldposn=deletefieldposn, editfieldposn=editfieldposn, editfieldposncommit=editfieldposncommit, addfield=addfield, addfieldcommit=addfieldcommit, fieldname=fieldname, fieldtext=fieldtext, fieldlevel=fieldlevel, fieldshortdesc=fieldshortdesc, fieldcheck=fieldcheck) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def doctypeconfiguresubmissionpagespreview(req, doctype="", action="", pagenum="", ln=CFG_SITE_LANG): ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: (title, body, errors, warnings) = perform_request_configure_doctype_submissionpage_preview(doctype=doctype, action=action, pagenum=pagenum) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def doctypeconfiguresubmissionpages(req, doctype="", action="", pagenum="", movepage="", movepagedirection="", deletepage="", deletepageconfirm="", addpage="", ln=CFG_SITE_LANG ): ln = wash_language(ln) _ = gettext_set_language(ln) uid = getUid(req) (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit') if not auth_code: ## user is authorised to use WebSubmit Admin: (title, body, errors, warnings) = 
perform_request_configure_doctype_submissionpages(doctype=doctype, action=action, pagenum=pagenum, movepage=movepage, movepagedirection=movepagedirection, deletepage=deletepage, deletepageconfirm=deletepageconfirm, addpage=addpage) return page(title = title, body = body, navtrail = get_navtrail(ln), uid = uid, lastupdated = __lastupdated__, req = req, language = ln, errors = errors, warnings = warnings) else: ## user is not authorised to use WebSubmit Admin: return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln)) def doctypeconfiguresubmissionfunctionsparameters(req, doctype="", action="", functionname="", functionstep="", functionscore="", paramname="", paramval="", editfunctionparametervalue="", editfunctionparametervaluecommit="", editfunctionparameterfile="", editfunctionparameterfilecommit="", paramfilename="", paramfilecontent="", ln=CFG_SITE_LANG): """Configure the parameters for a function belonging to a given submission. @param doctype: (string) the unique ID of a document type @param action: (string) the unique ID of an action @param functionname: (string) the name of a WebSubmit function @param functionstep: (integer) the step at which a WebSubmit function is located @param functionscore: (integer) the score (within a step) at which a WebSubmit function is located @param paramname: (string) the name of a parameter being edited @param paramval: (string) the value to be allocated to a parameter that is being edited @param editfunctionparametervalue: (string) a flag to signal that a form should be displayed for editing the value of a parameter @param editfunctionparametervaluecommit: (string) a flag to signal that a parameter value has been edited and should be committed @param editfunctionparameterfile: (string) a flag to signal that a form containing a parameter file is to be displayed @param editfunctionparameterfilecommit: (string) a flag to signal that a modified parameter file is to be committed @param paramfilename: (string) 
                                      the name of a parameter file
       @param paramfilecontent: (string) the contents of a parameter file
       @param ln: (string) the language code (e.g. en, fr, de, etc); defaults
        to the default installation language
       @return: (string) HTML-page body
    """
    ln = wash_language(ln)
    _ = gettext_set_language(ln)
    uid = getUid(req)
    (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit')
    if not auth_code:
        ## user is authorised to use WebSubmit Admin:
        (title, body, errors, warnings) = \
          perform_request_configure_doctype_submissionfunctions_parameters(doctype=doctype,
                                                                           action=action,
                                                                           functionname=functionname,
                                                                           functionstep=functionstep,
                                                                           functionscore=functionscore,
                                                                           paramname=paramname,
                                                                           paramval=paramval,
                                                                           editfunctionparametervalue=editfunctionparametervalue,
                                                                           editfunctionparametervaluecommit=editfunctionparametervaluecommit,
                                                                           editfunctionparameterfile=editfunctionparameterfile,
                                                                           editfunctionparameterfilecommit=editfunctionparameterfilecommit,
                                                                           paramfilename=paramfilename,
                                                                           paramfilecontent=paramfilecontent)
        return page(title       = title,
                    body        = body,
                    navtrail    = get_navtrail(ln),
                    uid         = uid,
                    lastupdated = __lastupdated__,
                    req         = req,
                    language    = ln,
                    errors      = errors,
                    warnings    = warnings)
    else:
        ## user is not authorised to use WebSubmit Admin:
        return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln))

def doctypeconfiguresubmissionfunctions(req,
                                        doctype="",
                                        action="",
                                        moveupfunctionname="",
                                        moveupfunctionstep="",
                                        moveupfunctionscore="",
                                        movedownfunctionname="",
                                        movedownfunctionstep="",
                                        movedownfunctionscore="",
                                        movefromfunctionname="",
                                        movefromfunctionstep="",
                                        movefromfunctionscore="",
                                        movetofunctionname="",
                                        movetofunctionstep="",
                                        movetofunctionscore="",
                                        deletefunctionname="",
                                        deletefunctionstep="",
                                        deletefunctionscore="",
                                        configuresubmissionaddfunction="",
                                        configuresubmissionaddfunctioncommit="",
                                        addfunctionname="",
                                        addfunctionstep="",
                                        addfunctionscore="",
                                        ln=CFG_SITE_LANG):
    ln = wash_language(ln)
    _ = gettext_set_language(ln)
    uid = getUid(req)
    (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit')
    if not auth_code:
        ## user is authorised to use WebSubmit Admin:
        (title, body, errors, warnings) = \
          perform_request_configure_doctype_submissionfunctions(doctype=doctype,
                                                                action=action,
                                                                moveupfunctionname=moveupfunctionname,
                                                                moveupfunctionstep=moveupfunctionstep,
                                                                moveupfunctionscore=moveupfunctionscore,
                                                                movedownfunctionname=movedownfunctionname,
                                                                movedownfunctionstep=movedownfunctionstep,
                                                                movedownfunctionscore=movedownfunctionscore,
                                                                movefromfunctionname=movefromfunctionname,
                                                                movefromfunctionstep=movefromfunctionstep,
                                                                movefromfunctionscore=movefromfunctionscore,
                                                                movetofunctionname=movetofunctionname,
                                                                movetofunctionstep=movetofunctionstep,
                                                                movetofunctionscore=movetofunctionscore,
                                                                deletefunctionname=deletefunctionname,
                                                                deletefunctionstep=deletefunctionstep,
                                                                deletefunctionscore=deletefunctionscore,
                                                                configuresubmissionaddfunction=configuresubmissionaddfunction,
                                                                configuresubmissionaddfunctioncommit=configuresubmissionaddfunctioncommit,
                                                                addfunctionname=addfunctionname,
                                                                addfunctionstep=addfunctionstep,
                                                                addfunctionscore=addfunctionscore)
        return page(title       = title,
                    body        = body,
                    navtrail    = get_navtrail(ln),
                    uid         = uid,
                    lastupdated = __lastupdated__,
                    req         = req,
                    language    = ln,
                    errors      = errors,
                    warnings    = warnings)
    else:
        ## user is not authorised to use WebSubmit Admin:
        return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln))

def doctypeconfigure(req,
                     doctype,
                     doctypename=None,
                     doctypedescr=None,
                     doctypedetailsedit="",
                     doctypedetailscommit="",
                     doctypecategoryadd="",
                     doctypecategoryedit="",
                     doctypecategoryeditcommit="",
                     doctypecategorydelete="",
                     doctypesubmissionadd="",
                     doctypesubmissiondelete="",
                     doctypesubmissiondeleteconfirm="",
                     doctypesubmissionedit="",
                     doctypesubmissionaddclonechosen="",
                     doctypesubmissiondetailscommit="",
                     doctypesubmissionadddetailscommit="",
                     doctypesubmissioneditdetailscommit="",
                     categid=None,
                     categdescr=None,
                     movecategup=None,
                     movecategdown=None,
                     jumpcategout=None,
                     jumpcategin=None,
                     action=None,
                     displayed=None,
                     buttonorder=None,
                     statustext=None,
                     level=None,
                     score=None,
                     stpage=None,
                     endtxt=None,
                     doctype_cloneactionfrom=None,
                     ln=CFG_SITE_LANG):
    """The main entry point to the configuration of a WebSubmit document type
       and its submission interfaces, functions, etc.
    """
    ln = wash_language(ln)
    _ = gettext_set_language(ln)
    uid = getUid(req)
    (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit')
    if not auth_code:
        ## user is authorised to use WebSubmit Admin:
        (title, body, errors, warnings) = \
          perform_request_configure_doctype(doctype=doctype,
                                            doctypename=doctypename,
                                            doctypedescr=doctypedescr,
                                            doctypedetailsedit=doctypedetailsedit,
                                            doctypedetailscommit=doctypedetailscommit,
                                            doctypecategoryadd=doctypecategoryadd,
                                            doctypecategoryedit=doctypecategoryedit,
                                            doctypecategoryeditcommit=doctypecategoryeditcommit,
                                            doctypecategorydelete=doctypecategorydelete,
                                            doctypesubmissionadd=doctypesubmissionadd,
                                            doctypesubmissiondelete=doctypesubmissiondelete,
                                            doctypesubmissiondeleteconfirm=doctypesubmissiondeleteconfirm,
                                            doctypesubmissionedit=doctypesubmissionedit,
                                            doctypesubmissionaddclonechosen=doctypesubmissionaddclonechosen,
                                            doctypesubmissionadddetailscommit=doctypesubmissionadddetailscommit,
                                            doctypesubmissioneditdetailscommit=doctypesubmissioneditdetailscommit,
                                            categid=categid,
                                            categdescr=categdescr,
                                            movecategup=movecategup,
                                            movecategdown=movecategdown,
                                            jumpcategout=jumpcategout,
                                            jumpcategin=jumpcategin,
                                            action=action,
                                            displayed=displayed,
                                            buttonorder=buttonorder,
                                            statustext=statustext,
                                            level=level,
                                            score=score,
                                            stpage=stpage,
                                            endtxt=endtxt,
                                            doctype_cloneactionfrom=doctype_cloneactionfrom)
        return page(title       = title,
                    body        = body,
                    navtrail    = get_navtrail(ln),
                    uid         = uid,
                    lastupdated = __lastupdated__,
                    req         = req,
                    language    = ln,
                    errors      = errors,
                    warnings    = warnings)
    else:
        ## user is not authorised to use WebSubmit Admin:
        return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln))

def organisesubmissionpage(req,
                           doctype="",
                           sbmcolid="",
                           catscore="",
                           addsbmcollection="",
                           deletesbmcollection="",
                           addtosbmcollection="",
                           adddoctypes="",
                           movesbmcollectionup="",
                           movesbmcollectiondown="",
                           deletedoctypefromsbmcollection="",
                           movedoctypeupinsbmcollection="",
                           movedoctypedowninsbmcollection="",
                           ln=CFG_SITE_LANG):
    """Entry point for organising the document types on a submission page.
    """
    ln = wash_language(ln)
    _ = gettext_set_language(ln)
    uid = getUid(req)
    (auth_code, auth_msg) = check_user(req, 'cfgwebsubmit')
    if not auth_code:
        ## user is authorised to use WebSubmit Admin:
        (title, body, errors, warnings) = \
          perform_request_organise_submission_page(doctype=doctype,
                                                   sbmcolid=sbmcolid,
                                                   catscore=catscore,
                                                   addsbmcollection=addsbmcollection,
                                                   deletesbmcollection=deletesbmcollection,
                                                   addtosbmcollection=addtosbmcollection,
                                                   adddoctypes=adddoctypes,
                                                   movesbmcollectionup=movesbmcollectionup,
                                                   movesbmcollectiondown=movesbmcollectiondown,
                                                   deletedoctypefromsbmcollection=deletedoctypefromsbmcollection,
                                                   movedoctypeupinsbmcollection=movedoctypeupinsbmcollection,
                                                   movedoctypedowninsbmcollection=movedoctypedowninsbmcollection)
        return page(title       = title,
                    body        = body,
                    navtrail    = get_navtrail(ln),
                    uid         = uid,
                    lastupdated = __lastupdated__,
                    req         = req,
                    language    = ln,
                    errors      = errors,
                    warnings    = warnings)
    else:
        ## user is not authorised to use WebSubmit Admin:
        return page_not_authorized(req=req, text=auth_msg, navtrail=get_navtrail(ln))
diff --git a/modules/websubmit/web/approve.py b/modules/websubmit/web/approve.py
index e1f5073ac..f5e72b4f5 100644
--- a/modules/websubmit/web/approve.py
+++ b/modules/websubmit/web/approve.py
@@ -1,100 +1,99 @@
## This file is part of CDS Invenio.
## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN.
##
## CDS Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## CDS Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDS Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.

__revision__ = "$Id$"

## import interesting modules:
import string
import os
import sys
import time
import types
import re
import urllib
-from mod_python import apache
from invenio.config import \
     CFG_ACCESS_CONTROL_LEVEL_SITE, \
     CFG_SITE_LANG, \
     CFG_SITE_NAME, \
     CFG_SITE_URL, \
     CFG_VERSION, \
     CFG_SITE_NAME_INTL
from invenio.dbquery import run_sql
from invenio.access_control_engine import acc_authorize_action
from invenio.access_control_admin import acc_is_role
from invenio.websubmit_config import *
from invenio.webpage import page, create_error_box
from invenio.webuser import getUid, get_email, page_not_authorized
from invenio.messages import wash_language, gettext_set_language
from invenio.errorlib import register_exception
from invenio.urlutils import redirect_to_url

def index(req, c=CFG_SITE_NAME, ln=CFG_SITE_LANG):
    """Approval web Interface.
       GET params:
    """
    uid = getUid(req)
    if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
        return page_not_authorized(req, "../approve.py/index",
                                   navmenuid='yourapprovals')
    ln = wash_language(ln)
    _ = gettext_set_language(ln)
    form = req.form
    if form.keys():
        # form keys can be a list of 'access pw' and ln, so remove 'ln':
        for key in form.keys():
            if key != 'ln':
                access = key
        if access == "":
            return warningMsg(_("approve.py: cannot determine document reference"), req)
        res = run_sql("select doctype,rn from sbmAPPROVAL where access=%s", (access,))
        if len(res) == 0:
            return warningMsg(_("approve.py: cannot find document in database"), req)
        else:
            doctype = res[0][0]
            rn = res[0][1]
        res = run_sql("select value from sbmPARAMETERS where name='edsrn' and doctype=%s", (doctype,))
        edsrn = res[0][0]
        url = "%s/submit/direct?%s" % (CFG_SITE_URL,
                                       urllib.urlencode({
                                           edsrn: rn,
                                           'access': access,
                                           'sub': 'APP%s' % doctype,
                                           'ln': ln}))
        redirect_to_url(req, url)
    else:
        return warningMsg(_("Sorry parameter missing..."), req, c, ln)

def warningMsg(title, req, c=None, ln=CFG_SITE_LANG):
    # load the right message language
    _ = gettext_set_language(ln)
    if c is None:
        c = CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME)
    return page(title = _("Warning"),
                body = title,
                description = "%s - Internal Error" % c,
                keywords = "%s, Internal Error" % c,
                uid = getUid(req),
                language = ln,
                req = req,
                navmenuid = 'submit')