diff --git a/INSTALL b/INSTALL index 955eb531e..3326352ee 100644 --- a/INSTALL +++ b/INSTALL @@ -1,811 +1,818 @@ Invenio INSTALLATION ==================== About ===== This document specifies how to build, customize, and install Invenio v1.0.1 for the first time. See RELEASE-NOTES if you are upgrading from a previous Invenio release. Contents ======== 0. Prerequisites 1. Quick instructions for the impatient Invenio admin 2. Detailed instructions for the patient Invenio admin 0. Prerequisites ================ Here is the software you need to have around before you start installing Invenio: a) Unix-like operating system. The main development and production platforms for Invenio at CERN are GNU/Linux distributions Debian, Gentoo, Scientific Linux (aka RHEL), Ubuntu, but we also develop on Mac OS X. Basically any Unix system supporting the software listed below should do. If you are using Debian GNU/Linux ``Lenny'' or later, then you can install most of the below-mentioned prerequisites and recommendations by running: $ sudo aptitude install python-dev apache2-mpm-prefork \ mysql-server mysql-client python-mysqldb \ python-4suite-xml python-simplejson python-xml \ python-libxml2 python-libxslt1 gnuplot poppler-utils \ gs-common clisp gettext libapache2-mod-wsgi unzip \ python-dateutil python-rdflib \ python-gnuplot python-magic pdftk html2text giflib-tools \ pstotext netpbm python-pypdf python-chardet You may also want to install some of the following packages, if you have them available on your concrete architecture: $ sudo aptitude install python-psyco sbcl cmucl \ pylint pychecker pyflakes python-profiler python-epydoc \ libapache2-mod-xsendfile openoffice.org python-utidylib \ python-beautifulsoup Moreover, you should install some Message Transfer Agent (MTA) such as Postfix so that Invenio can email notification alerts or registration information to the end users, contact moderators and reviewers of submitted documents, inform administrators about various runtime 
      system information, etc:

       $ sudo aptitude install postfix

      After running the above-quoted aptitude command(s), you can
      proceed to configuring your MySQL server instance
      (max_allowed_packet in my.cnf, see item 0b below) and then to
      installing the Invenio software package in section 1 below.

      If you are using another operating system, then please
      continue reading the rest of this prerequisites section, and
      please consult our wiki pages for any concrete hints for your
      specific operating system.

   b) MySQL server (may be on a remote machine), and MySQL client
      (must be available locally too).  MySQL versions 4.1 or 5.0
      are supported.  Please set the variable "max_allowed_packet"
      in your "my.cnf" init file to at least 4M.  (For sites such as
      INSPIRE, having 1M records with 10M citer-citee pairs in its
      citation map, you may need to increase max_allowed_packet to
      1G.)  You may also want to run your MySQL server natively in
      UTF-8 mode by setting "default-character-set=utf8" in various
      parts of your "my.cnf" file, such as in the "[mysql]" part and
      elsewhere; but this is not strictly required.

   c) Apache 2 server, with support for loading DSO modules, and
      optionally with SSL support for HTTPS-secure user
      authentication, and mod_xsendfile for off-loading file
      downloads away from Invenio processes to Apache.

   d) Python v2.4 or above:

      as well as the following Python modules:
        - (mandatory) MySQLdb (version >= 1.2.1_p2; see below)
        - (recommended) python-dateutil, for complex date processing:
        - (recommended) PyXML, for XML processing:
        - (recommended) PyRXP, for very fast XML MARC processing:
        - (recommended) libxml2-python, for XML/XSLT processing:
        - (recommended) simplejson, for AJAX apps:
          Note that if you are using Python-2.6, you don't need to
          install simplejson, because the module is already included
          in the main Python distribution.
- (recommended) Gnuplot.Py, for producing graphs: - (recommended) Snowball Stemmer, for stemming: - (recommended) py-editdist, for record merging: - (recommended) numpy, for citerank methods: - (recommended) magic, for full-text file handling: - (optional) chardet, for character encoding detection: - (optional) 4suite, slower alternative to PyRXP and libxml2-python: - (optional) feedparser, for web journal creation: - (optional) Psyco, if you are running on a 32-bit OS: - (optional) RDFLib, to use RDF ontologies and thesauri: - (optional) mechanize, to run regression web test suite: - (optional) python-mock, mocking library for the test suite: - (optional) hashlib, needed only for Python-2.4 and only if you would like to use AWS connectivity: - (optional) utidylib, for HTML washing: - (optional) Beautiful Soup, for HTML washing: - (optional) Python Twitter (and its dependencies) if you want to use the Twitter Fetcher bibtasklet: Note: MySQLdb version 1.2.1_p2 or higher is recommended. If you are using an older version of MySQLdb, you may get into problems with character encoding. e) mod_wsgi Apache module. Versions 3.x and above are recommended. Note: if you are using Python 2.4 or earlier, then you should also install the wsgiref Python module, available from: (As of Python 2.5 this module is included in standard Python distribution.) f) If you want to be able to extract references from PDF fulltext files, then you need to install pdftotext version 3 at least. g) If you want to be able to search for words in the fulltext files (i.e. 
to have fulltext indexing) or to stamp submitted files, then you need as well to install some of the following tools: - for Microsoft Office/OpenOffice.org document conversion: OpenOffice.org - for PDF file stamping: pdftk, pdf2ps - for PDF files: pdftotext or pstotext - for PostScript files: pstotext or ps2ascii - for DjVu creation, elaboration: DjVuLibre - to perform OCR: OCRopus (tested only with release 0.3.1) - to perform different image elaborations: ImageMagick - to generate PDF after OCR: netpbm, ReportLab and pyPdf h) If you have chosen to install fast XML MARC Python processors in the step d) above, then you have to install the parsers themselves: - (optional) 4suite: i) (recommended) Gnuplot, the command-line driven interactive plotting program. It is used to display download and citation history graphs on the Detailed record pages on the web interface. Note that Gnuplot must be compiled with PNG output support, that is, with the GD library. Note also that Gnuplot is not required, only recommended. j) (recommended) A Common Lisp implementation, such as CLISP, SBCL or CMUCL. It is used for the web server log analysing tool and the metadata checking program. Note that any of the three implementations CLISP, SBCL, or CMUCL will do. CMUCL produces fastest machine code, but it does not support UTF-8 yet. Pick up CLISP if you don't know what to do. Note that a Common Lisp implementation is not required, only recommended. k) GNU gettext, a set of tools that makes it possible to translate the application in multiple languages. This is available by default on many systems. l) (recommended) xlwt 0.7.2, Library to create spreadsheet files compatible with MS Excel 97/2000/XP/2003 XLS files, on any platform, with Python 2.3 to 2.6 m) (recommended) matplotlib 1.0.0 is a python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. 
      matplotlib can be used in python scripts, the python and
      ipython shell (à la MATLAB® or Mathematica®), web application
      servers, and six graphical user interface toolkits.  It is
      used to generate pie graphs in the custom summary query
      (WebStat).

   n) (optional) FFmpeg, an open-source collection of tools and
      libraries to convert video and audio files.  It makes use of
      both internal and external libraries to generate videos for
      the web, such as Theora, WebM and H.264, out of almost any
      conceivable video input.  FFmpeg is needed to run the
      video-related modules and submission workflows in Invenio.
      The minimal configuration of ffmpeg for the Invenio demo site
      requires a number of external libraries.  It is highly
      recommended to remove all installed versions and packages
      that come with various Linux distributions and to install
      the latest versions from sources.  Additionally, you will
      need the Mediainfo Library for multimedia metadata handling.

      Minimum libraries for the demo site:
        - the ffmpeg multimedia encoder tools
        - a library for jpeg images needed for thumbnail extraction
        - a library for the ogg container format, needed for Vorbis
          and Theora
        - the OGG Vorbis audio codec library
        - the OGG Theora video codec library
        - the WebM video codec library
        - the mediainfo library for multimedia metadata

      Recommended for H.264 video (!be aware of licensing issues!):
        - a library for H.264 video encoding
        - a library for Advanced Audio Coding
        - a library for MP3 encoding

 Note that the configure script checks whether you have all the
 prerequisite software installed and that it won't let you continue
 unless everything is in order.  It also warns you if it cannot find
 some optional but recommended software.

 1. Quick instructions for the impatient Invenio admin
 =========================================================

 1a.
Installation ---------------- $ cd $HOME/src/ $ wget http://invenio-software.org/download/invenio-1.0.1.tar.gz $ wget http://invenio-software.org/download/invenio-1.0.1.tar.gz.md5 $ wget http://invenio-software.org/download/invenio-1.0.1.tar.gz.sig $ md5sum -v -c invenio-1.0.1.tar.gz.md5 $ gpg --verify invenio-1.0.1.tar.gz.sig invenio-1.0.1.tar.gz $ tar xvfz invenio-1.0.1.tar.gz $ cd invenio-1.0.1 $ ./configure $ make $ make install $ make install-mathjax-plugin ## optional $ make install-jquery-plugins ## optional $ make install-ckeditor-plugin ## optional $ make install-pdfa-helper-files ## optional $ make install-mediaelement ## optional 1b. Configuration ----------------- $ sudo chown -R www-data.www-data /opt/invenio $ sudo -u www-data emacs /opt/invenio/etc/invenio-local.conf $ sudo -u www-data /opt/invenio/bin/inveniocfg --update-all $ sudo -u www-data /opt/invenio/bin/inveniocfg --create-tables $ sudo -u www-data /opt/invenio/bin/inveniocfg --load-webstat-conf $ sudo -u www-data /opt/invenio/bin/inveniocfg --create-apache-conf $ sudo /etc/init.d/apache2 restart $ sudo -u www-data /opt/invenio/bin/inveniocfg --check-openoffice $ sudo -u www-data /opt/invenio/bin/inveniocfg --create-demo-site $ sudo -u www-data /opt/invenio/bin/inveniocfg --load-demo-records $ sudo -u www-data /opt/invenio/bin/inveniocfg --run-unit-tests $ sudo -u www-data /opt/invenio/bin/inveniocfg --run-regression-tests $ sudo -u www-data /opt/invenio/bin/inveniocfg --run-web-tests $ sudo -u www-data /opt/invenio/bin/inveniocfg --remove-demo-records $ sudo -u www-data /opt/invenio/bin/inveniocfg --drop-demo-site $ firefox http://your.site.com/help/admin/howto-run 2. Detailed instructions for the patient Invenio admin ========================================================== 2a. Installation ---------------- The Invenio uses standard GNU autoconf method to build and install its files. 
This means that you proceed as follows: $ cd $HOME/src/ Change to a directory where we will build the Invenio sources. (The built files will be installed into different "target" directories later.) $ wget http://invenio-software.org/download/invenio-1.0.1.tar.gz $ wget http://invenio-software.org/download/invenio-1.0.1.tar.gz.md5 $ wget http://invenio-software.org/download/invenio-1.0.1.tar.gz.sig Fetch Invenio source tarball from the distribution server, together with MD5 checksum and GnuPG cryptographic signature files useful for verifying the integrity of the tarball. $ md5sum -v -c invenio-1.0.1.tar.gz.md5 Verify MD5 checksum. $ gpg --verify invenio-1.0.1.tar.gz.sig invenio-1.0.1.tar.gz Verify GnuPG cryptographic signature. Note that you may first have to import my public key into your keyring, if you haven't done that already: $ gpg --keyserver wwwkeys.eu.pgp.net --recv-keys 0xBA5A2B67 The output of the gpg --verify command should then read: Good signature from "Tibor Simko " You can safely ignore any trusted signature certification warning that may follow after the signature has been successfully verified. $ tar xvfz invenio-1.0.1.tar.gz Untar the distribution tarball. $ cd invenio-1.0.1 Go to the source directory. $ ./configure Configure Invenio software for building on this specific platform. You can use the following optional parameters: --prefix=/opt/invenio Optionally, specify the Invenio general installation directory (default is /opt/invenio). It will contain command-line binaries and program libraries containing the core Invenio functionality, but also store web pages, runtime log and cache information, document data files, etc. Several subdirs like `bin', `etc', `lib', or `var' will be created inside the prefix directory to this effect. Note that the prefix directory should be chosen outside of the Apache htdocs tree, since only one its subdirectory (prefix/var/www) is to be accessible directly via the Web (see below). 
Note that Invenio won't install to any other directory but to the prefix mentioned in this configuration line. --with-python=/opt/python/bin/python2.4 Optionally, specify a path to some specific Python binary. This is useful if you have more than one Python installation on your system. If you don't set this option, then the first Python that will be found in your PATH will be chosen for running Invenio. --with-mysql=/opt/mysql/bin/mysql Optionally, specify a path to some specific MySQL client binary. This is useful if you have more than one MySQL installation on your system. If you don't set this option, then the first MySQL client executable that will be found in your PATH will be chosen for running Invenio. --with-clisp=/opt/clisp/bin/clisp Optionally, specify a path to CLISP executable. This is useful if you have more than one CLISP installation on your system. If you don't set this option, then the first executable that will be found in your PATH will be chosen for running Invenio. --with-cmucl=/opt/cmucl/bin/lisp Optionally, specify a path to CMUCL executable. This is useful if you have more than one CMUCL installation on your system. If you don't set this option, then the first executable that will be found in your PATH will be chosen for running Invenio. --with-sbcl=/opt/sbcl/bin/sbcl Optionally, specify a path to SBCL executable. This is useful if you have more than one SBCL installation on your system. If you don't set this option, then the first executable that will be found in your PATH will be chosen for running Invenio. --with-openoffice-python Optionally, specify the path to the Python interpreter embedded with OpenOffice.org. This is normally not contained in the normal path. If you don't specify this it won't be possible to use OpenOffice.org to convert from and to Microsoft Office and OpenOffice.org documents. This configuration step is mandatory. Usually, you do this step only once. 
(Note that if you are building Invenio not from a released tarball, but from the Git sources, then you have to generate the configure file via autotools: $ sudo aptitude install automake1.9 autoconf $ aclocal-1.9 $ automake-1.9 -a $ autoconf after which you proceed with the usual configure command.) $ make Launch the Invenio build. Since many messages are printed during the build process, you may want to run it in a fast-scrolling terminal such as rxvt or in a detached screen session. During this step all the pages and scripts will be pre-created and customized based on the config you have edited in the previous step. Note that on systems such as FreeBSD or Mac OS X you have to use GNU make ("gmake") instead of "make". $ make install Install the web pages, scripts, utilities and everything needed for Invenio runtime into respective installation directories, as specified earlier by the configure command. Note that if you are installing Invenio for the first time, you will be asked to create symbolic link(s) from Python's site-packages system-wide directory(ies) to the installation location. This is in order to instruct Python where to find Invenio's Python files. You will be hinted as to the exact command to use based on the parameters you have used in the configure command. $ make install-mathjax-plugin ## optional This will automatically download and install in the proper place MathJax, a JavaScript library to render LaTeX formulas in the client browser. Note that in order to enable the rendering you will have to set the variable CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS in invenio-local.conf to a suitable list of output format codes. For example: CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS = hd,hb $ make install-jquery-plugins ## optional This will automatically download and install in the proper place jQuery and related plugins. They are used for AJAX applications such as the record editor. Note that `unzip' is needed when installing jquery plugins. 
$ make install-ckeditor-plugin ## optional This will automatically download and install in the proper place CKeditor, a WYSIWYG Javascript-based editor (e.g. for the WebComment module). Note that in order to enable the editor you have to set the CFG_WEBCOMMENT_USE_RICH_EDITOR to True. $ make install-pdfa-helper-files ## optional This will automatically download and install in the proper place the helper files needed to create PDF/A files out of existing PDF files. $ make install-mediaelement ## optional This will automatically download and install the MediaElementJS HTML5 video player that is needed for videos on the DEMO site. $ make install-solrutils ## optional This will automatically download and install a Solr instance which can be used for full-text searching. See CFG_SOLR_URL variable in the invenio.conf. Note that the admin later has to take care of running init.d scripts which would start the Solr instance automatically. 2b. Configuration ----------------- Once the basic software installation is done, we proceed to configuring your Invenio system. $ sudo chown -R www-data.www-data /opt/invenio For the sake of simplicity, let us assume that your Invenio installation will run under the `www-data' user process identity. The above command changes ownership of installed files to www-data, so that we shall run everything under this user identity from now on. For production purposes, you would typically enable Apache server to read all files from the installation place but to write only to the `var' subdirectory of your installation place. You could achieve this by configuring Unix directory group permissions, for example. $ sudo -u www-data emacs /opt/invenio/etc/invenio-local.conf Customize your Invenio installation. Please read the 'invenio.conf' file located in the same directory that contains the vanilla default configuration parameters of your Invenio installation. 
            If you want to customize some of these parameters, you
            should create a file named 'invenio-local.conf' in the
            same directory where 'invenio.conf' lives and you should
            write there only the customizations that you want to be
            different from the vanilla defaults.

            Here is a realistic, minimalist, yet production-ready
            example of what you would typically put there:

               $ cat /opt/invenio/etc/invenio-local.conf
               [Invenio]
               CFG_SITE_NAME = John Doe's Document Server
               CFG_SITE_NAME_INTL_fr = Serveur des Documents de John Doe
               CFG_SITE_URL = http://your.site.com
               CFG_SITE_SECURE_URL = https://your.site.com
               CFG_SITE_ADMIN_EMAIL = john.doe@your.site.com
               CFG_SITE_SUPPORT_EMAIL = john.doe@your.site.com
               CFG_WEBALERT_ALERT_ENGINE_EMAIL = john.doe@your.site.com
               CFG_WEBCOMMENT_ALERT_ENGINE_EMAIL = john.doe@your.site.com
               CFG_WEBCOMMENT_DEFAULT_MODERATOR = john.doe@your.site.com
               CFG_DATABASE_HOST = localhost
               CFG_DATABASE_NAME = invenio
               CFG_DATABASE_USER = invenio
               CFG_DATABASE_PASS = my123p$ss
+              CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE = 1

            You should override at least the parameters mentioned
            above in order to define some very essential runtime
            parameters such as the name of your document server
            (CFG_SITE_NAME and CFG_SITE_NAME_INTL_*), the visible URL
            of your document server (CFG_SITE_URL and
            CFG_SITE_SECURE_URL), the email address of the local
            Invenio administrator, comment moderator, and alert
            engine (CFG_SITE_SUPPORT_EMAIL, CFG_SITE_ADMIN_EMAIL,
            etc), and last but not least your database credentials
            (CFG_DATABASE_*).

+           If this is a first installation of Invenio, it is recommended
+           that you set the CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE
+           variable to 1.  If this is instead an upgrade from an existing
+           installation, do not add it until you have run:
+             $ bibdocfile --fix-bibdocfsinfo-cache .
+
            The Invenio system will then read both the default
            invenio.conf file and your customized invenio-local.conf
            file and it will override any default options with the
            ones you have specified in your local file.
            This cascading of configuration parameters will ease your
            future upgrades.

            If you want to have multiple Invenio instances for
            distributed video encoding, you need to share the same
            configuration among them and make some of the folders of
            the Invenio installation available to all nodes.

            Configure the allowed tasks for every node:

               CFG_BIBSCHED_NODE_TASKS = {
                   "hostname_machine1" : ["bibindex", "bibupload",
                       "bibreformat","webcoll", "bibtaskex", "bibrank",
                       "oaiharvest", "oairepositoryupdater", "inveniogc",
                       "webstatadmin", "bibclassify", "bibexport",
                       "dbdump", "batchuploader", "bibauthorid",
                       "bibtasklet"],
                   "hostname_machine2" : ['bibencode',]
               }

            Share the following directories among Invenio instances:

               /var/tmp-shared
                   hosts video uploads in a temporary form

               /var/tmp-shared/bibencode/jobs
                   hosts new job files for the video encoding daemon

               /var/tmp-shared/bibencode/jobs/done
                   hosts job files that have been processed by the
                   daemon

               /var/data/files
                   hosts fulltext and media files associated to
                   records

               /var/data/submit
                   hosts files created during submissions

       $ sudo -u www-data /opt/invenio/bin/inveniocfg --update-all

            Make the rest of the Invenio system aware of your
            invenio-local.conf changes.  This step is mandatory each
            time you edit your conf files.

       $ sudo -u www-data /opt/invenio/bin/inveniocfg --create-tables

            If you are installing Invenio for the first time, you
            have to create database tables.

            Note that this step checks for potential problems such as
            the database connection rights and may ask you to perform
            some more administrative steps in case it detects a
            problem.  Notably, it may ask you to set up database
            access permissions, based on your configure values.

            If you are installing Invenio for the first time, you
            have to create a dedicated database on your MySQL server
            that Invenio can use for its purposes.  Please contact
            your MySQL administrator and ask them to execute the
            commands that this step proposes.
            At this point you should have successfully completed the
            "make install" process.  We continue by setting up the
            Apache web server.

       $ sudo -u www-data /opt/invenio/bin/inveniocfg --load-webstat-conf

            Load the configuration file of the WebStat module.  It
            will create the database tables needed to register
            custom events, such as basket hits.

       $ sudo -u www-data /opt/invenio/bin/inveniocfg --create-apache-conf

            Running this command will generate Apache virtual host
            configurations matching your installation.  You will be
            instructed to check the created files (usually they are
            located under /opt/invenio/etc/apache/) and edit your
            httpd.conf to activate Invenio virtual hosts.

            If you are using Debian GNU/Linux ``Lenny'' or later,
            then you can do the following to create your SSL
            certificate and to activate your Invenio vhosts:

               ## make SSL certificate:
               $ sudo aptitude install ssl-cert
               $ sudo mkdir /etc/apache2/ssl
               $ sudo /usr/sbin/make-ssl-cert /usr/share/ssl-cert/ssleay.cnf \
                      /etc/apache2/ssl/apache.pem

               ## add Invenio web sites:
               $ sudo ln -s /opt/invenio/etc/apache/invenio-apache-vhost.conf \
                            /etc/apache2/sites-available/invenio
               $ sudo ln -s /opt/invenio/etc/apache/invenio-apache-vhost-ssl.conf \
                            /etc/apache2/sites-available/invenio-ssl

               ## disable Debian's default web site:
               $ sudo /usr/sbin/a2dissite default

               ## enable Invenio web sites:
               $ sudo /usr/sbin/a2ensite invenio
               $ sudo /usr/sbin/a2ensite invenio-ssl

               ## enable SSL module:
               $ sudo /usr/sbin/a2enmod ssl

               ## if you are using xsendfile module, enable it too:
               $ sudo /usr/sbin/a2enmod xsendfile

            If you are using another operating system, you should do
            the equivalent, for example edit your system-wide
            httpd.conf and put the following include statements:

               Include /opt/invenio/etc/apache/invenio-apache-vhost.conf
               Include /opt/invenio/etc/apache/invenio-apache-vhost-ssl.conf

            Note that you may need to adapt the generated vhost file
            snippets to match the specifics of your operating system.
For example, the generated configuration snippet will preload Invenio WSGI daemon application upon Apache start up for faster site response. The generated configuration assumes that you are using mod_wsgi version 3 or later. If you are using the old legacy mod_wsgi version 2, then you would need to comment out the WSGIImportScript directive from the generated snippet, or else move the WSGI daemon setup to the top level, outside of the VirtualHost section. Note also that you may want to tweak the generated Apache vhost snippet for performance reasons, especially with respect to WSGIDaemonProcess parameters. For example, you can increase the number of processes from the default value `processes=5' if you have lots of RAM and if many concurrent users may access your site in parallel. However, note that you must use `threads=1' there, because Invenio WSGI daemon processes are not fully thread safe yet. This may change in the future. $ sudo /etc/init.d/apache2 restart Please ask your webserver administrator to restart the Apache server after the above "httpd.conf" changes. $ sudo -u www-data /opt/invenio/bin/inveniocfg --check-openoffice If you plan to support MS Office or Open Document Format files in your installation, you should check whether LibreOffice or OpenOffice.org is well integrated with Invenio by running the above command. You may be asked to create a temporary directory for converting office files with special ownership (typically as user nobody) and permissions. Note that you can do this step later. $ sudo -u www-data /opt/invenio/bin/inveniocfg --create-demo-site This step is recommended to test your local Invenio installation. It should give you our "Atlantis Institute of Science" demo installation, exactly as you see it at . $ sudo -u www-data /opt/invenio/bin/inveniocfg --load-demo-records Optionally, load some demo records to be able to test indexing and searching of your local Invenio demo installation. 
$ sudo -u www-data /opt/invenio/bin/inveniocfg --run-unit-tests Optionally, you can run the unit test suite to verify the unit behaviour of your local Invenio installation. Note that this command should be run only after you have installed the whole system via `make install'. $ sudo -u www-data /opt/invenio/bin/inveniocfg --run-regression-tests Optionally, you can run the full regression test suite to verify the functional behaviour of your local Invenio installation. Note that this command requires to have created the demo site and loaded the demo records. Note also that running the regression test suite may alter the database content with junk data, so that rebuilding the demo site is strongly recommended afterwards. $ sudo -u www-data /opt/invenio/bin/inveniocfg --run-web-tests Optionally, you can run additional automated web tests running in a real browser. This requires to have Firefox with the Selenium IDE extension installed. $ sudo -u www-data /opt/invenio/bin/inveniocfg --remove-demo-records Optionally, remove the demo records loaded in the previous step, but keeping otherwise the demo collection, submission, format, and other configurations that you may reuse and modify for your own production purposes. $ sudo -u www-data /opt/invenio/bin/inveniocfg --drop-demo-site Optionally, drop also all the demo configuration so that you'll end up with a completely blank Invenio system. However, you may want to find it more practical not to drop the demo site configuration but to start customizing from there. $ firefox http://your.site.com/help/admin/howto-run In order to start using your Invenio installation, you can start indexing, formatting and other daemons as indicated in the "HOWTO Run" guide on the above URL. You can also use the Admin Area web interfaces to perform further runtime configurations such as the definition of data collections, document types, document formats, word indexes, etc. 
$ sudo ln -s /opt/invenio/etc/bash_completion.d/inveniocfg \ /etc/bash_completion.d/inveniocfg Optionally, if you are using Bash shell completion, then you may want to create the above symlink in order to configure completion for the inveniocfg command. Good luck, and thanks for choosing Invenio. - Invenio Development Team diff --git a/RELEASE-NOTES b/RELEASE-NOTES index f2b111cea..1e758c9c3 100644 --- a/RELEASE-NOTES +++ b/RELEASE-NOTES @@ -1,100 +1,101 @@ -------------------------------------------------------------------- Invenio v1.0.1 is released June 28, 2012 http://invenio-software.org/ -------------------------------------------------------------------- Invenio v1.0.1 was released on June 28, 2012. This is a minor bugfix release only. It is recommended to all Invenio sites using v1.0.0 or previous releases. What's new: ----------- *) BibFormat: fix format validation report; fix opensearch prefix exclusion in RSS; fix retrieval of collection identifier *) BibIndex: new unit tests for the Greek stemmer *) BibSched: improve low level submission arg parsing; set ERROR status when wrong params; task can stop immediately when sleeping *) BibSword: remove dangling documentation *) BibUpload: fix setting restriction in -a/-ir modes *) WebAlert: simplify HTML markup *) WebComment: only logged users to use report abuse *) WebJournal: hide deleted records *) WebSearch: adapt test cases for citation summary; fix collection order on the search page; look at access control when webcolling; sorting in citesummary breakdown links *) WebSession: simplify HTML markup *) WebSubmit: capitalise doctypes in Doc File Manager; check authorizations in endaction; check for problems when archiving; ensure unique tmp file name for upload; fix email formatting; fix Move_to_Done function; remove 8564_ field from demo templates; skip file upload if necessary; update CERN-specific config *) bibdocfile: BibRecDocs recID argument type check *) data cacher: deletes cache before refilling it 
*) dbquery: fix dbexec CLI WRT max allowed packet *) I18N: updates to Greek translation *) installation: fix circular install-jquery-plugins; fix demo user initialisation; fix jQuery tablesorter download URL; fix jQuery uploadify download URL; more info about max_allowed_packet; remove unneeded rxp binary package Download: --------- Installation notes: ------------------- Please follow the INSTALL file bundled in the distribution tarball. Upgrade notes: -------------- If you are upgrading from Invenio v1.0.0, then: a) Stop your bibsched queue and your Apache server. b) Install the update: $ tar xvfz invenio-1.0.1.tar.gz $ cd invenio-1.0.1 $ sudo rsync -a /opt/invenio/etc/ /opt/invenio/etc.OLD/ $ sh /opt/invenio/etc/build/config.nice $ make $ make check-custom-templates $ sudo -u www-data make install $ sudo -u www-data /opt/invenio/bin/inveniocfg --update-all $ sudo rsync -a /opt/invenio/etc.OLD/ \ --exclude bibformat/format_templates/RSS.xsl \ --exclude bibconvert/config/DEMOBOOcreate.tpl \ --exclude bibconvert/config/DEMOPICcreate.tpl \ --exclude bibconvert/config/DEMOTHEcreate.tpl \ /opt/invenio/etc/ + FIXME: add note about need to run bibdocfile --fix-bibdocfsinfo-cache c) Restart your Apache server and your bibsched queue. If you are upgrading from a previous Invenio release (notably from v0.99 release series), then please see a dedicated Invenio Upgrade wiki page at . - end of file - \ No newline at end of file diff --git a/config/invenio.conf b/config/invenio.conf index 71f8744c4..b7d16b724 100644 --- a/config/invenio.conf +++ b/config/invenio.conf @@ -1,1879 +1,1886 @@ ## This file is part of Invenio. ## Copyright (C) 2008, 2009, 2010, 2011, 2012 CERN. ## ## Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. 
## ## Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. ################################################### ## About 'invenio.conf' and 'invenio-local.conf' ## ################################################### ## The 'invenio.conf' file contains the vanilla default configuration ## parameters of an Invenio installation, as shipped in the ## distribution. The file should be self-explanatory. Once installed ## in its usual location (usually /opt/invenio/etc), you could in ## principle go ahead and change the values according to your local ## needs, but this is not advised. ## ## If you would like to customize some of these parameters, you should ## instead create a file named 'invenio-local.conf' in the same ## directory where 'invenio.conf' lives and you should write there ## only the customizations that you want to be different from the ## vanilla defaults.
## ## Here is a realistic, minimalist, yet production-ready example of ## what you would typically put there: ## ## $ cat /opt/invenio/etc/invenio-local.conf ## [Invenio] ## CFG_SITE_NAME = John Doe's Document Server ## CFG_SITE_NAME_INTL_fr = Serveur des Documents de John Doe ## CFG_SITE_URL = http://your.site.com ## CFG_SITE_SECURE_URL = https://your.site.com ## CFG_SITE_ADMIN_EMAIL = john.doe@your.site.com ## CFG_SITE_SUPPORT_EMAIL = john.doe@your.site.com ## CFG_WEBALERT_ALERT_ENGINE_EMAIL = john.doe@your.site.com ## CFG_WEBCOMMENT_ALERT_ENGINE_EMAIL = john.doe@your.site.com ## CFG_WEBCOMMENT_DEFAULT_MODERATOR = john.doe@your.site.com ## CFG_DATABASE_HOST = localhost ## CFG_DATABASE_NAME = invenio ## CFG_DATABASE_USER = invenio ## CFG_DATABASE_PASS = my123p$ss ## ## You should override at least the parameters mentioned above and the ## parameters mentioned in the `Part 1: Essential parameters' below in ## order to define some very essential runtime parameters such as the ## name of your document server (CFG_SITE_NAME and ## CFG_SITE_NAME_INTL_*), the visible URL of your document server ## (CFG_SITE_URL and CFG_SITE_SECURE_URL), the email address of the ## local Invenio administrator, comment moderator, and alert engine ## (CFG_SITE_SUPPORT_EMAIL, CFG_SITE_ADMIN_EMAIL, etc), and last but ## not least your database credentials (CFG_DATABASE_*). ## ## The Invenio system will then read both the default invenio.conf ## file and your customized invenio-local.conf file and it will ## override any default options with the ones you have specified in ## your local file. This cascading of configuration parameters will ## ease your future upgrades. [Invenio] ################################### ## Part 1: Essential parameters ## ################################### ## This part defines essential Invenio internal parameters that ## everybody should override, like the name of the server or the email ## address of the local Invenio administrator. 
## CFG_DATABASE_* - specify which MySQL server to use, the name of the ## database to use, and the database access credentials. CFG_DATABASE_HOST = localhost CFG_DATABASE_PORT = 3306 CFG_DATABASE_NAME = invenio CFG_DATABASE_USER = invenio CFG_DATABASE_PASS = my123p$ss ## CFG_DATABASE_SLAVE - if you use DB replication, then specify the DB ## slave address credentials. (Assuming the same access rights to the ## DB slave as to the DB master.) If you don't use DB replication, ## then leave this option blank. CFG_DATABASE_SLAVE = ## CFG_SITE_URL - specify URL under which your installation will be ## visible. For example, use "http://your.site.com". Do not leave ## trailing slash. CFG_SITE_URL = http://localhost ## CFG_SITE_SECURE_URL - specify secure URL under which your ## installation secure pages such as login or registration will be ## visible. For example, use "https://your.site.com". Do not leave ## trailing slash. If you don't plan on using HTTPS, then you may ## leave this empty. CFG_SITE_SECURE_URL = https://localhost ## CFG_SITE_NAME -- the visible name of your Invenio installation. CFG_SITE_NAME = Atlantis Institute of Fictive Science ## CFG_SITE_NAME_INTL -- the international versions of CFG_SITE_NAME ## in various languages. (See also CFG_SITE_LANGS below.) 
CFG_SITE_NAME_INTL_en = Atlantis Institute of Fictive Science CFG_SITE_NAME_INTL_fr = Atlantis Institut des Sciences Fictives CFG_SITE_NAME_INTL_de = Atlantis Institut der fiktiven Wissenschaft CFG_SITE_NAME_INTL_es = Atlantis Instituto de la Ciencia Fictive CFG_SITE_NAME_INTL_ca = Institut Atlantis de Ciència Fictícia CFG_SITE_NAME_INTL_pt = Instituto Atlantis de Ciência Fictícia CFG_SITE_NAME_INTL_it = Atlantis Istituto di Scienza Fittizia CFG_SITE_NAME_INTL_ru = Институт Фиктивных Наук Атлантиды CFG_SITE_NAME_INTL_sk = Atlantis Inštitút Fiktívnych Vied CFG_SITE_NAME_INTL_cs = Atlantis Institut Fiktivních Věd CFG_SITE_NAME_INTL_no = Atlantis Institutt for Fiktiv Vitenskap CFG_SITE_NAME_INTL_sv = Atlantis Institut för Fiktiv Vetenskap CFG_SITE_NAME_INTL_el = Ινστιτούτο Φανταστικών Επιστημών Ατλαντίδος CFG_SITE_NAME_INTL_uk = Інститут вигаданих наук в Атлантісі CFG_SITE_NAME_INTL_ja = Fictive 科学のAtlantis の協会 CFG_SITE_NAME_INTL_pl = Instytut Fikcyjnej Nauki Atlantis CFG_SITE_NAME_INTL_bg = Институт за фиктивни науки Атлантис CFG_SITE_NAME_INTL_hr = Institut Fiktivnih Znanosti Atlantis CFG_SITE_NAME_INTL_zh_CN = 阿特兰提斯虚拟科学学院 CFG_SITE_NAME_INTL_zh_TW = 阿特蘭提斯虛擬科學學院 CFG_SITE_NAME_INTL_hu = Kitalált Tudományok Atlantiszi Intézete CFG_SITE_NAME_INTL_af = Atlantis Instituut van Fiktiewe Wetenskap CFG_SITE_NAME_INTL_gl = Instituto Atlantis de Ciencia Fictive CFG_SITE_NAME_INTL_ro = Institutul Atlantis al Ştiinţelor Fictive CFG_SITE_NAME_INTL_rw = Atlantis Ishuri Rikuru Ry'ubuhanga CFG_SITE_NAME_INTL_ka = ატლანტიდის ფიქტიური მეცნიერების ინსტიტუტი CFG_SITE_NAME_INTL_lt = Fiktyvių Mokslų Institutas Atlantis CFG_SITE_NAME_INTL_ar = معهد أطلنطيس للعلوم الافتراضية ## CFG_SITE_LANG -- the default language of the interface: CFG_SITE_LANG = en ## CFG_SITE_LANGS -- list of all languages the user interface should ## be available in, separated by commas. The order specified below ## will be respected on the interface pages. A good default would be ## to use the alphabetical order.
Currently supported languages ## include Afrikaans, Arabic, Bulgarian, Catalan, Czech, German, Georgian, ## Greek, English, Spanish, French, Croatian, Hungarian, Galician, ## Italian, Japanese, Kinyarwanda, Lithuanian, Norwegian, Polish, ## Portuguese, Romanian, Russian, Slovak, Swedish, Ukrainian, Chinese ## (China), Chinese (Taiwan), so that the eventual maximum you can ## currently select is ## "af,ar,bg,ca,cs,de,el,en,es,fr,hr,gl,ka,it,rw,lt,hu,ja,no,pl,pt,ro,ru,sk,sv,uk,zh_CN,zh_TW". CFG_SITE_LANGS = af,ar,bg,ca,cs,de,el,en,es,fr,hr,gl,ka,it,rw,lt,hu,ja,no,pl,pt,ro,ru,sk,sv,uk,zh_CN,zh_TW ## CFG_SITE_SUPPORT_EMAIL -- the email address of the support team for ## this installation: CFG_SITE_SUPPORT_EMAIL = info@invenio-software.org ## CFG_SITE_ADMIN_EMAIL -- the email address of the 'superuser' for ## this installation. Enter your email address below and login with ## this address when using Invenio administration modules. You ## will then be automatically recognized as superuser of the system. CFG_SITE_ADMIN_EMAIL = info@invenio-software.org ## CFG_SITE_EMERGENCY_EMAIL_ADDRESSES -- list of email addresses to ## which an email should be sent in case of emergency (e.g. bibsched ## queue has been stopped because of an error). Configuration ## dictionary allows for different recipients based on weekday and ## time-of-day. Example: ## ## CFG_SITE_EMERGENCY_EMAIL_ADDRESSES = { ## 'Sunday 22:00-06:00': '0041761111111@email2sms.foo.com', ## '06:00-18:00': 'team-in-europe@foo.com,0041762222222@email2sms.foo.com', ## '18:00-06:00': 'team-in-usa@foo.com', ## '*': 'john.doe.phone@foo.com'} ## ## If you want the emergency email notifications to always go to the ## same address, just use the wildcard line in the above example. CFG_SITE_EMERGENCY_EMAIL_ADDRESSES = {} ## CFG_SITE_ADMIN_EMAIL_EXCEPTIONS -- set this to 0 if you do not want ## to receive any captured exception via email to CFG_SITE_ADMIN_EMAIL ## address.
Captured exceptions will still be available in ## var/log/invenio.err file. Set this to 1 if you want to receive ## some of the captured exceptions (this depends on the actual place ## where the exception is captured). Set this to 2 if you want to ## receive all captured exceptions. CFG_SITE_ADMIN_EMAIL_EXCEPTIONS = 1 ## CFG_SITE_RECORD -- what is the URI part representing detailed ## record pages? We recommend leaving the default value `record' ## unchanged. CFG_SITE_RECORD = record ## CFG_ERRORLIB_RESET_EXCEPTION_NOTIFICATION_COUNTER_AFTER -- set this to ## the number of seconds after which to reset the exception notification ## counter. A given repetitive exception is notified via email with a ## logarithmic strategy: the first time it is seen it is sent via email, ## then the second time, then the fourth, then the eighth and so forth. ## If the number of seconds elapsed since the last time it was notified ## is greater than CFG_ERRORLIB_RESET_EXCEPTION_NOTIFICATION_COUNTER_AFTER ## then the internal counter is reset so that exception ## notifications do not become more and more rare. CFG_ERRORLIB_RESET_EXCEPTION_NOTIFICATION_COUNTER_AFTER = 14400 ## CFG_CERN_SITE -- do we want to enable CERN-specific code? ## Put "1" for "yes" and "0" for "no". CFG_CERN_SITE = 0 ## CFG_INSPIRE_SITE -- do we want to enable INSPIRE-specific code? ## Put "1" for "yes" and "0" for "no". CFG_INSPIRE_SITE = 0 ## CFG_ADS_SITE -- do we want to enable ADS-specific code? ## Put "1" for "yes" and "0" for "no". CFG_ADS_SITE = 0 ## CFG_OPENAIRE_SITE -- do we want to enable OpenAIRE-specific code? ## Put "1" for "yes" and "0" for "no". CFG_OPENAIRE_SITE = 0 ## CFG_DEVEL_SITE -- is this a development site? If it is, you might ## prefer that it does not do certain things. For example, you might ## not want WebSubmit to send certain emails or trigger certain ## processes on a development site.
## Put "1" for "yes" (this is a development site) or "0" for "no" ## (this isn't a development site.) CFG_DEVEL_SITE = 0 ################################ ## Part 2: Web page style ## ################################ ## The variables affecting the page style. The most important one is ## the 'template skin' you would like to use and the obfuscation mode ## for your email addresses. Please refer to the WebStyle Admin Guide ## for more explanation. The other variables are listed here mostly ## for backwards compatibility purposes only. ## CFG_WEBSTYLE_TEMPLATE_SKIN -- what template skin do you want to ## use? CFG_WEBSTYLE_TEMPLATE_SKIN = default ## CFG_WEBSTYLE_EMAIL_ADDRESSES_OBFUSCATION_MODE. How do we "protect" ## email addresses from undesired automated email harvesters? This ## setting will not affect 'support' and 'admin' emails. ## NOTE: there is no ultimate solution to protect against email ## harvesting. All have drawbacks and can more or less be ## circumvented. Choose your preferred mode ([t] means "transparent" ## for the user): ## -1: hide all emails. ## [t] 0 : no protection, email returned as is. ## foo@example.com => foo@example.com ## 1 : basic email munging: replaces @ by [at] and . by [dot] ## foo@example.com => foo [at] example [dot] com ## [t] 2 : transparent name mangling: characters are replaced by ## equivalent HTML entities. ## foo@example.com => &#102;&#111;&#111;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#99;&#111;&#109; ## [t] 3 : javascript insertion. Requires Javascript enabled on client ## side. ## 4 : replaces @ and . characters by gif equivalents. ## foo@example.com => foo [at] example [dot] com CFG_WEBSTYLE_EMAIL_ADDRESSES_OBFUSCATION_MODE = 2 ## CFG_WEBSTYLE_INSPECT_TEMPLATES -- Do we want to debug all template ## functions so that they would return HTML results wrapped in ## comments indicating which part of HTML page was created by which ## template function? Useful only for debugging Pythonic HTML ## template. See WebStyle Admin Guide for more information.
CFG_WEBSTYLE_INSPECT_TEMPLATES = 0 ## (deprecated) CFG_WEBSTYLE_CDSPAGEBOXLEFTTOP -- eventual global HTML ## left top box: CFG_WEBSTYLE_CDSPAGEBOXLEFTTOP = ## (deprecated) CFG_WEBSTYLE_CDSPAGEBOXLEFTBOTTOM -- eventual global ## HTML left bottom box: CFG_WEBSTYLE_CDSPAGEBOXLEFTBOTTOM = ## (deprecated) CFG_WEBSTYLE_CDSPAGEBOXRIGHTTOP -- eventual global ## HTML right top box: CFG_WEBSTYLE_CDSPAGEBOXRIGHTTOP = ## (deprecated) CFG_WEBSTYLE_CDSPAGEBOXRIGHTBOTTOM -- eventual global ## HTML right bottom box: CFG_WEBSTYLE_CDSPAGEBOXRIGHTBOTTOM = ## CFG_WEBSTYLE_HTTP_STATUS_ALERT_LIST -- when certain HTTP status ## codes are raised to the WSGI handler, the corresponding exceptions ## and error messages can be sent to the system administrator for ## inspecting. This is useful to detect and correct errors. The ## variable represents a comma-separated list of HTTP statuses that ## should alert admin. Wildcards are possible. If the status is ## followed by an "r", it means that a referer is required to exist ## (useful to distinguish broken known links from URL typos when 404 ## errors are raised). CFG_WEBSTYLE_HTTP_STATUS_ALERT_LIST = 404r,400,5*,41* ## CFG_WEBSTYLE_HTTP_USE_COMPRESSION -- whether to enable deflate ## compression of your HTTP/HTTPS connections. This will affect the Apache ## configuration snippets created by inveniocfg --create-apache-conf and ## the OAI-PMH Identify response. CFG_WEBSTYLE_HTTP_USE_COMPRESSION = 0 ## CFG_WEBSTYLE_REVERSE_PROXY_IPS -- if you are setting up a multinode ## environment where an HTTP proxy such as mod_proxy is sitting in ## front of the Invenio web application and is forwarding requests to ## worker nodes, set here the list of IP addresses of the allowed ## HTTP proxies. This is needed in order to avoid IP address spoofing ## when worker nodes are also available on the public Internet and ## might receive forged HTTP requests.
Only HTTP requests coming from ## the specified IP addresses will be considered as forwarded from a ## reverse proxy. E.g. set this to '123.123.123.123'. CFG_WEBSTYLE_REVERSE_PROXY_IPS = ################################## ## Part 3: WebSearch parameters ## ################################## ## This section contains some configuration parameters for WebSearch ## module. Please note that WebSearch is mostly configured on ## run-time via its WebSearch Admin web interface. The parameters ## below are the ones that you probably do not want to modify very ## often during the runtime. (Note that you may modify them ## afterwards too, though.) ## CFG_WEBSEARCH_SEARCH_CACHE_SIZE -- how many queries we want to ## cache in memory per one Apache httpd process? This cache is used ## mainly for "next/previous page" functionality, but it caches also ## "popular" user queries if more than one user happens to search for ## the same thing. Note that large numbers may lead to great memory ## consumption. We recommend a value not greater than 100. CFG_WEBSEARCH_SEARCH_CACHE_SIZE = 0 ## CFG_WEBSEARCH_FIELDS_CONVERT -- if you migrate from an older ## system, you may want to map field codes of your old system (such as ## 'ti') to Invenio/MySQL ("title"). Use Python dictionary syntax ## for the translation table, e.g. {'wau':'author', 'wti':'title'}. ## Usually you don't want to do that, and you would use empty dict {}. CFG_WEBSEARCH_FIELDS_CONVERT = {} ## CFG_WEBSEARCH_LIGHTSEARCH_PATTERN_BOX_WIDTH -- width of the ## search pattern window in the light search interface, in ## characters. CFG_WEBSEARCH_LIGHTSEARCH_PATTERN_BOX_WIDTH = 60 ## CFG_WEBSEARCH_SIMPLESEARCH_PATTERN_BOX_WIDTH -- width of the search ## pattern window in the simple search interface, in characters.
CFG_WEBSEARCH_SIMPLESEARCH_PATTERN_BOX_WIDTH = 40 ## CFG_WEBSEARCH_ADVANCEDSEARCH_PATTERN_BOX_WIDTH -- width of the ## search pattern window in the advanced search interface, in ## characters. CFG_WEBSEARCH_ADVANCEDSEARCH_PATTERN_BOX_WIDTH = 30 ## CFG_WEBSEARCH_NB_RECORDS_TO_SORT -- how many records do we still ## want to sort? For higher numbers we print only a warning and won't ## perform any sorting other than default 'latest records first', as ## sorting would be very time consuming then. We recommend a value of ## not more than a couple of thousand. CFG_WEBSEARCH_NB_RECORDS_TO_SORT = 1000 ## CFG_WEBSEARCH_CALL_BIBFORMAT -- if a record is being displayed but ## it was not preformatted in the "HTML brief" format, do we want to ## call BibFormatting on the fly? Put "1" for "yes" and "0" for "no". ## Note that "1" will display the record exactly as if it were fully ## preformatted, but it may be slow due to on-the-fly processing; "0" ## will display a default format very fast, but it may not have all ## the fields as in the fully preformatted HTML brief format. Note ## also that this option is active only for old (PHP) formats; the new ## (Python) formats are called on the fly by default anyway, since ## they are much faster. When unsure, please set "0" here. CFG_WEBSEARCH_CALL_BIBFORMAT = 0 ## CFG_WEBSEARCH_USE_ALEPH_SYSNOS -- do we want to make old SYSNOs ## visible rather than MySQL's record IDs? You may use this if you ## migrate from a different e-doc system, and you store your old ## system numbers into 970__a. Put "1" for "yes" and "0" for ## "no". Usually you don't want to do that, though. CFG_WEBSEARCH_USE_ALEPH_SYSNOS = 0 ## CFG_WEBSEARCH_I18N_LATEST_ADDITIONS -- Put "1" if you want the ## "Latest Additions" in the web collection pages to show ## internationalized records. Useful only if your brief BibFormat ## templates contain internationalized strings. Otherwise put "0" in ## order not to slow down the creation of latest additions by WebColl.
CFG_WEBSEARCH_I18N_LATEST_ADDITIONS = 0 ## CFG_WEBSEARCH_INSTANT_BROWSE -- the number of records to display ## under 'Latest Additions' in the web collection pages. CFG_WEBSEARCH_INSTANT_BROWSE = 10 ## CFG_WEBSEARCH_INSTANT_BROWSE_RSS -- the number of records to ## display in the RSS feed. CFG_WEBSEARCH_INSTANT_BROWSE_RSS = 25 ## CFG_WEBSEARCH_RSS_I18N_COLLECTIONS -- comma-separated list of ## collections that feature an internationalized RSS feed on their ## main search interface page created by webcoll. Other collections ## will have RSS feed using CFG_SITE_LANG. CFG_WEBSEARCH_RSS_I18N_COLLECTIONS = ## CFG_WEBSEARCH_RSS_TTL -- number of minutes that indicates how long ## a feed cache is valid. CFG_WEBSEARCH_RSS_TTL = 360 ## CFG_WEBSEARCH_RSS_MAX_CACHED_REQUESTS -- maximum number of requests kept ## in cache. If the cache is full, subsequent requests are not cached. CFG_WEBSEARCH_RSS_MAX_CACHED_REQUESTS = 1000 ## CFG_WEBSEARCH_AUTHOR_ET_AL_THRESHOLD -- up to how many author names ## to print explicitly; for more print "et al". Note that this is ## used in default formatting that is seldom used, as usually ## BibFormat defines all the format. The value below is only used ## when BibFormat fails, for example. CFG_WEBSEARCH_AUTHOR_ET_AL_THRESHOLD = 3 ## CFG_WEBSEARCH_NARROW_SEARCH_SHOW_GRANDSONS -- whether to show or ## not collection grandsons in Narrow Search boxes (sons are shown by ## default, grandsons are configurable here). Use 0 for no and 1 for ## yes. CFG_WEBSEARCH_NARROW_SEARCH_SHOW_GRANDSONS = 1 ## CFG_WEBSEARCH_CREATE_SIMILARLY_NAMED_AUTHORS_LINK_BOX -- shall we ## create help links for Ellis, Nick or Ellis, Nicholas and friends ## when Ellis, N was searched for? Useful if you have one author ## stored in the database under several name formats, namely surname ## comma firstname and surname comma initial cataloging policy. Use 0 ## for no and 1 for yes.
CFG_WEBSEARCH_CREATE_SIMILARLY_NAMED_AUTHORS_LINK_BOX = 1 ## CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS -- MathJax is a JavaScript ## library that renders (La)TeX mathematical formulas in the client ## browser. This parameter must contain a comma-separated list of ## output formats for which to apply the MathJax rendering, for example ## "hb,hd". If the list is empty, MathJax is disabled. CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS = ## CFG_WEBSEARCH_EXTERNAL_COLLECTION_SEARCH_TIMEOUT -- when searching ## external collections (e.g. SPIRES, CiteSeer, etc), how many seconds ## do we wait for reply before abandoning? CFG_WEBSEARCH_EXTERNAL_COLLECTION_SEARCH_TIMEOUT = 5 ## CFG_WEBSEARCH_EXTERNAL_COLLECTION_SEARCH_MAXRESULTS -- how many ## results do we fetch? CFG_WEBSEARCH_EXTERNAL_COLLECTION_SEARCH_MAXRESULTS = 10 ## CFG_WEBSEARCH_SPLIT_BY_COLLECTION -- do we want to split the search ## results by collection or not? Use 0 for not, 1 for yes. CFG_WEBSEARCH_SPLIT_BY_COLLECTION = 1 ## CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS -- the default number of ## records to display per page in the search results pages. CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS = 10 ## CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS -- in order to limit denial of ## service attacks the total number of records per group displayed as a ## result of a search query will be limited to this number. Only the superuser ## queries will not be affected by this limit. CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS = 200 ## CFG_WEBSEARCH_PERMITTED_RESTRICTED_COLLECTIONS_LEVEL -- logged in users ## might have rights to access some restricted collections. This variable ## tweaks the kind of support the system will automatically provide to the ## user with respect to searching into these restricted collections. ## Set this to 0 in order to require the user to explicitly activate restricted ## collections in order to search into them.
Set this to 1 in order to ## propose to the user the list of restricted collections to which he/she has ## rights (note: this is not yet implemented). Set this to 2 in order to ## silently add all the restricted collections to which the user has rights ## to any query. ## Note: the system will discover which restricted collections a user has ## rights to, at login time. The time complexity of this procedure is ## proportional to the number of restricted collections. E.g. for a system ## with ~50 restricted collections, you might expect ~1s of delay in the ## login time, when this variable is set to a value higher than 0. CFG_WEBSEARCH_PERMITTED_RESTRICTED_COLLECTIONS_LEVEL = 0 ## CFG_WEBSEARCH_SHOW_COMMENT_COUNT -- do we want to show the 'N comments' ## links on the search engine pages? (useful only when you have allowed ## commenting) CFG_WEBSEARCH_SHOW_COMMENT_COUNT = 1 ## CFG_WEBSEARCH_SHOW_REVIEW_COUNT -- do we want to show the 'N reviews' ## links on the search engine pages? (useful only when you have allowed ## reviewing) CFG_WEBSEARCH_SHOW_REVIEW_COUNT = 1 ## CFG_WEBSEARCH_FULLTEXT_SNIPPETS -- how many full-text snippets to ## display for full-text searches? CFG_WEBSEARCH_FULLTEXT_SNIPPETS = 4 ## CFG_WEBSEARCH_FULLTEXT_SNIPPETS_WORDS -- how many context words ## to display around the pattern in the snippet? CFG_WEBSEARCH_FULLTEXT_SNIPPETS_WORDS = 4 ## CFG_WEBSEARCH_WILDCARD_LIMIT -- some of the queries, wildcard ## queries in particular (ex: cern*, a*), but also regular expressions ## (ex: [a-z]+), may take a long time to respond due to the high ## number of hits. You can limit the number of terms matched by a ## wildcard by setting this variable. A negative value or zero means ## that none of the queries will be limited (which may be wanted but is ## also prone to denial-of-service kinds of attacks).
CFG_WEBSEARCH_WILDCARD_LIMIT = 50000 ## CFG_WEBSEARCH_SYNONYM_KBRS -- defines which knowledge bases are to ## be used for which index in order to provide runtime synonym lookup ## of user-supplied terms, and what massaging function should be used ## upon search pattern before performing the KB lookup. (Can be one ## of `exact', `leading_to_comma', `leading_to_number'.) CFG_WEBSEARCH_SYNONYM_KBRS = { 'journal': ['SEARCH-SYNONYM-JOURNAL', 'leading_to_number'], } ## CFG_SOLR_URL -- optionally, you may use Solr to serve full-text ## queries. If so, please specify the URL of your Solr instance. ## Example: http://localhost:8983/solr (default solr port) CFG_SOLR_URL = ## CFG_WEBSEARCH_PREV_NEXT_HIT_LIMIT -- specify the limit when ## the previous/next/back hit links are to be displayed on detailed record pages. ## In order to speed up list manipulations, if a search returns lots of hits, ## more than this limit, then do not lose time calculating next/previous/back ## hits at all, but display page directly without these. ## Note also that Invenio installations that do not want ## the next/previous hit link functionality can set this ## variable to zero to disable it. CFG_WEBSEARCH_PREV_NEXT_HIT_LIMIT = 1000 ## CFG_WEBSEARCH_VIEWRESTRCOLL_POLICY -- when a record belongs to more than one ## restricted collection, if the viewrestrcoll policy is set to "ALL" (default) ## then the user must be authorized to all the restricted collections, in ## order to be granted access to the specific record. If the policy is set to ## "ANY", then the user needs to be authorized to only one of the collections ## in order to be granted access to the specific record. CFG_WEBSEARCH_VIEWRESTRCOLL_POLICY = ALL ## CFG_WEBSEARCH_SPIRES_SYNTAX -- variable to configure the use of the ## SPIRES query syntax in searches.
Values: 0 = SPIRES syntax is ## switched off; 1 = leading 'find' is required; 9 = leading 'find' is ## not required (leading SPIRES operator, space-operator-space, etc ## are also accepted). CFG_WEBSEARCH_SPIRES_SYNTAX = 1 ## CFG_WEBSEARCH_DISPLAY_NEAREST_TERMS -- when user search does not ## return any direct result, what do we want to display? Set to 0 in ## order to display a generic message about search returning no hits. ## Set to 1 in order to display list of nearest terms from the indexes ## that may match user query. Note: this functionality may be slow, ## so you may want to disable it on bigger sites. CFG_WEBSEARCH_DISPLAY_NEAREST_TERMS = 1 ## CFG_WEBSEARCH_DETAILED_META_FORMAT -- the output format to use for ## detailed meta tags containing metadata as configured in the tag ## table. The default output format should be 'hdm', which is included. This ## format will be included in the header of /record/ pages. For ## efficiency this format should be pre-cached with BibReformat. See ## also CFG_WEBSEARCH_ENABLE_GOOGLESCHOLAR and ## CFG_WEBSEARCH_ENABLE_OPENGRAPH. CFG_WEBSEARCH_DETAILED_META_FORMAT = hdm ## CFG_WEBSEARCH_ENABLE_GOOGLESCHOLAR -- decides if meta tags for ## Google Scholar shall be included in the detailed record page ## header, when using the standard formatting templates/elements. See ## also CFG_WEBSEARCH_DETAILED_META_FORMAT and ## CFG_WEBSEARCH_ENABLE_OPENGRAPH. When this variable is changed and ## output format defined in CFG_WEBSEARCH_DETAILED_META_FORMAT is ## cached, a bibreformat must be run for the cached records. CFG_WEBSEARCH_ENABLE_GOOGLESCHOLAR = True ## CFG_WEBSEARCH_ENABLE_OPENGRAPH -- decides if meta tags for the Open ## Graph protocol shall be included in the detailed record page ## header, when using the standard formatting templates/elements. See ## also CFG_WEBSEARCH_DETAILED_META_FORMAT and ## CFG_WEBSEARCH_ENABLE_GOOGLESCHOLAR.
When this variable is changed ## and output format defined in CFG_WEBSEARCH_DETAILED_META_FORMAT is ## cached, a bibreformat must be run for the cached records. Note that ## enabling Open Graph produces invalid XHTML/HTML5 markup. CFG_WEBSEARCH_ENABLE_OPENGRAPH = False ####################################### ## Part 4: BibHarvest OAI parameters ## ####################################### ## This part defines parameters for the Invenio OAI gateway. ## Useful if you are running Invenio as OAI data provider. ## CFG_OAI_ID_FIELD -- OAI identifier MARC field: CFG_OAI_ID_FIELD = 909COo ## CFG_OAI_SET_FIELD -- OAI set MARC field: CFG_OAI_SET_FIELD = 909COp ## CFG_OAI_PREVIOUS_SET_FIELD -- previous OAI set MARC field: CFG_OAI_PREVIOUS_SET_FIELD = 909COq ## CFG_OAI_DELETED_POLICY -- OAI deletedrecordspolicy ## (no/transient/persistent): CFG_OAI_DELETED_POLICY = persistent ## CFG_OAI_ID_PREFIX -- OAI identifier prefix: CFG_OAI_ID_PREFIX = atlantis.cern.ch ## CFG_OAI_SAMPLE_IDENTIFIER -- OAI sample identifier: CFG_OAI_SAMPLE_IDENTIFIER = oai:atlantis.cern.ch:123 ## CFG_OAI_IDENTIFY_DESCRIPTION -- description for the OAI Identify verb: CFG_OAI_IDENTIFY_DESCRIPTION = http://atlantis.cern.ch/ Free and unlimited use by anybody with obligation to refer to original record Full content, i.e. preprints may not be harvested by robots Submission restricted. Submitted documents are subject of approval by OAI repository admins. ## CFG_OAI_LOAD -- OAI number of records in a response: CFG_OAI_LOAD = 500 ## CFG_OAI_EXPIRE -- OAI resumptionToken expiration time: CFG_OAI_EXPIRE = 90000 ## CFG_OAI_SLEEP -- service unavailable between two consecutive ## requests for CFG_OAI_SLEEP seconds: CFG_OAI_SLEEP = 2 ## CFG_OAI_METADATA_FORMATS -- mapping between accepted metadataPrefixes and ## the corresponding output format to use, its schema and its metadataNamespace.
CFG_OAI_METADATA_FORMATS = { 'marcxml': ('XOAIMARC', 'http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd', 'http://www.loc.gov/MARC21/slim'), 'oai_dc': ('XOAIDC', 'http://www.openarchives.org/OAI/1.1/dc.xsd', 'http://purl.org/dc/elements/1.1/'), } ## CFG_OAI_FRIENDS -- list of OAI baseURL of friend repositories. See: ## CFG_OAI_FRIENDS = http://cdsweb.cern.ch/oai2d,http://openaire.cern.ch/oai2d,http://export.arxiv.org/oai2 ## The following subfields are a complement to ## CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG. If CFG_OAI_PROVENANCE_BASEURL_SUBFIELD is ## set for a record, then the corresponding field is considered as being ## harvested via OAI-PMH ## CFG_OAI_PROVENANCE_BASEURL_SUBFIELD -- baseURL of the originDescription or a ## record CFG_OAI_PROVENANCE_BASEURL_SUBFIELD = u ## CFG_OAI_PROVENANCE_DATESTAMP_SUBFIELD -- datestamp of the originDescription ## or a record CFG_OAI_PROVENANCE_DATESTAMP_SUBFIELD = d ## CFG_OAI_PROVENANCE_METADATANAMESPACE_SUBFIELD -- metadataNamespace of the ## originDescription or a record CFG_OAI_PROVENANCE_METADATANAMESPACE_SUBFIELD = m ## CFG_OAI_PROVENANCE_ORIGINDESCRIPTION_SUBFIELD -- originDescription of the ## originDescription or a record CFG_OAI_PROVENANCE_ORIGINDESCRIPTION_SUBFIELD = d ## CFG_OAI_PROVENANCE_HARVESTDATE_SUBFIELD -- harvestDate of the ## originDescription or a record CFG_OAI_PROVENANCE_HARVESTDATE_SUBFIELD = h ## CFG_OAI_PROVENANCE_ALTERED_SUBFIELD -- altered flag of the ## originDescription or a record CFG_OAI_PROVENANCE_ALTERED_SUBFIELD = t ## CFG_OAI_FAILED_HARVESTING_STOP_QUEUE -- when harvesting OAI sources ## fails, shall we report an error with the task and stop BibSched ## queue, or simply wait for the next run of the task?
A value of 0 ## will stop the task upon errors, 1 will let the queue run if the ## next run of the oaiharvest task can safely recover from the failure ## (this means that the queue will stop if the task is not set to run ## periodically) CFG_OAI_FAILED_HARVESTING_STOP_QUEUE = 1 ## CFG_OAI_FAILED_HARVESTING_EMAILS_ADMIN -- when ## CFG_OAI_FAILED_HARVESTING_STOP_QUEUE is set to leave the queue ## running upon errors, shall we send an email to admin to notify ## about the failure? CFG_OAI_FAILED_HARVESTING_EMAILS_ADMIN = True ## NOTE: the following parameters are experimental ## ----------------------------------------------------------------------------- ## CFG_OAI_RIGHTS_FIELD -- MARC field dedicated to storing Copyright information CFG_OAI_RIGHTS_FIELD = 542__ ## CFG_OAI_RIGHTS_HOLDER_SUBFIELD -- MARC subfield dedicated to storing the ## Copyright holder information CFG_OAI_RIGHTS_HOLDER_SUBFIELD = d ## CFG_OAI_RIGHTS_DATE_SUBFIELD -- MARC subfield dedicated to storing the ## Copyright date information CFG_OAI_RIGHTS_DATE_SUBFIELD = g ## CFG_OAI_RIGHTS_URI_SUBFIELD -- MARC subfield dedicated to storing the URI ## (URL or URN, more detailed statement about copyright status) information CFG_OAI_RIGHTS_URI_SUBFIELD = u ## CFG_OAI_RIGHTS_CONTACT_SUBFIELD -- MARC subfield dedicated to storing the ## Copyright holder contact information CFG_OAI_RIGHTS_CONTACT_SUBFIELD = e ## CFG_OAI_RIGHTS_STATEMENT_SUBFIELD -- MARC subfield dedicated to storing the ## Copyright statement as presented on the resource CFG_OAI_RIGHTS_STATEMENT_SUBFIELD = f ## CFG_OAI_LICENSE_FIELD -- MARC field dedicated to storing terms governing ## use and reproduction (license) CFG_OAI_LICENSE_FIELD = 540__ ## CFG_OAI_LICENSE_TERMS_SUBFIELD -- MARC subfield dedicated to storing the ## Terms governing use and reproduction, e.g.
## CC License
CFG_OAI_LICENSE_TERMS_SUBFIELD = a

## CFG_OAI_LICENSE_PUBLISHER_SUBFIELD -- MARC subfield dedicated to storing the
## person or institution imposing the license (author, publisher)
CFG_OAI_LICENSE_PUBLISHER_SUBFIELD = b

## CFG_OAI_LICENSE_URI_SUBFIELD -- MARC subfield dedicated to storing the
## license URI
CFG_OAI_LICENSE_URI_SUBFIELD = u
##------------------------------------------------------------------------------

##################################
## Part 5: WebSubmit parameters ##
##################################

## This section contains some configuration parameters for WebSubmit
## module. Please note that WebSubmit is mostly configured on
## run-time via its WebSubmit Admin web interface. The parameters
## below are the ones that you probably do not want to modify during
## the runtime.

## CFG_WEBSUBMIT_FILESYSTEM_BIBDOC_GROUP_LIMIT -- the fulltext
## documents are stored under "/opt/invenio/var/data/files/gX/Y"
## directories where X is 0,1,... and Y stands for bibdoc ID. Thus
## documents Y are grouped into directories X and this variable
## indicates the maximum number of documents Y stored in each
## directory X. This limit is imposed solely for filesystem
## performance reasons in order not to have too many subdirectories in
## a given directory.
CFG_WEBSUBMIT_FILESYSTEM_BIBDOC_GROUP_LIMIT = 5000

## CFG_WEBSUBMIT_ADDITIONAL_KNOWN_FILE_EXTENSIONS -- a comma-separated
## list of document extensions not listed in Python standard mimetype
## library that should be recognized by Invenio.
CFG_WEBSUBMIT_ADDITIONAL_KNOWN_FILE_EXTENSIONS = hpg,link,lis,llb,mat,mpp,msg,docx,docm,xlsx,xlsm,xlsb,pptx,pptm,ppsx,ppsm

## CFG_WEBSUBMIT_DESIRED_CONVERSIONS -- a dictionary having as keys
## a format and as values the corresponding list of desired converted
## formats.
CFG_WEBSUBMIT_DESIRED_CONVERSIONS = {
    'pdf' : ('pdf;pdfa', ),
    'ps.gz' : ('pdf;pdfa', ),
    'djvu' : ('pdf', ),
    'sxw': ('doc', 'odt', 'pdf;pdfa', ),
    'docx' : ('doc', 'odt', 'pdf;pdfa', ),
    'doc' : ('odt', 'pdf;pdfa', 'docx'),
    'rtf' : ('pdf;pdfa', 'odt', ),
    'odt' : ('pdf;pdfa', 'doc', ),
    'pptx' : ('ppt', 'odp', 'pdf;pdfa', ),
    'ppt' : ('odp', 'pdf;pdfa', 'pptx'),
    'sxi': ('odp', 'pdf;pdfa', ),
    'odp' : ('pdf;pdfa', 'ppt', ),
    'xlsx' : ('xls', 'ods', 'csv'),
    'xls' : ('ods', 'csv'),
    'ods' : ('xls', 'xlsx', 'csv'),
    'sxc': ('xls', 'xlsx', 'csv'),
    'tiff' : ('pdf;pdfa', ),
    'tif' : ('pdf;pdfa', ),}

## CFG_BIBDOCFILE_USE_XSENDFILE -- if your web server supports
## XSendfile header, you may want to enable this feature in order for
## Invenio to tell the web server to stream files for download (after
## proper authorization checks) by web server's means. This helps to
## liberate Invenio worker processes from being busy with sending big
## files to clients. The web server will take care of that. Note:
## this feature is still somewhat experimental. Note: when enabled
## (set to 1), then you have to also regenerate Apache vhost conf
## snippets (inveniocfg --update-config-py --create-apache-conf).
CFG_BIBDOCFILE_USE_XSENDFILE = 0

## CFG_BIBDOCFILE_MD5_CHECK_PROBABILITY -- a number between 0 and
## 1 that indicates probability with which MD5 checksum will be
## verified when streaming bibdocfile-managed files. (0.1 will cause
## the check to be performed once for every 10 downloads)
CFG_BIBDOCFILE_MD5_CHECK_PROBABILITY = 0.1

## CFG_BIBDOCFILE_BEST_FORMATS_TO_EXTRACT_TEXT_FROM -- a comma-separated
## list of document extensions in descending order of preference
## to suggest what is considered the best format to extract text from.
CFG_BIBDOCFILE_BEST_FORMATS_TO_EXTRACT_TEXT_FROM = ('txt', 'html', 'xml', 'odt', 'doc', 'docx', 'djvu', 'pdf', 'ps', 'ps.gz')

+## CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE -- whether to use the
+## database table bibdocfsinfo as reference for filesystem
+## information. The default is 0. Switch this to 1
+## after you have run bibdocfile --fix-bibdocfsinfo-cache
+## or on an empty system.
+CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE = 0
+
## CFG_OPENOFFICE_SERVER_HOST -- the host on which an OpenOffice Server is
## listening. If localhost an OpenOffice server will be started
## automatically if it is not already running.
## Note: if you set this to an empty value this will disable the usage of
## OpenOffice for converting documents.
## If you set this to something different than localhost you'll have to take
## care to have an OpenOffice server running on the corresponding host and
## to install the same OpenOffice release both on the client and on the server
## side.
## In order to launch an OpenOffice server on a remote machine, just start
## the usual 'soffice' executable in this way:
## $> soffice -headless -nologo -nodefault -norestore -nofirststartwizard \
## .. -accept=socket,host=HOST,port=PORT;urp;StarOffice.ComponentContext
CFG_OPENOFFICE_SERVER_HOST = localhost

## CFG_OPENOFFICE_SERVER_PORT -- the port on which an OpenOffice Server is
## listening.
CFG_OPENOFFICE_SERVER_PORT = 2002

## CFG_OPENOFFICE_USER -- the user that will be used to launch the OpenOffice
## client. It is recommended to set this to a user who doesn't own files, like
## e.g. 'nobody'. You should also authorize your Apache server user to be
## able to become this user, e.g. by adding to your /etc/sudoers the following
## line:
## "apache ALL=(nobody) NOPASSWD: ALL"
## provided that apache is the username corresponding to the Apache user.
## On some machines this might be apache2 or www-data.
CFG_OPENOFFICE_USER = nobody

#################################
## Part 6: BibIndex parameters ##
#################################

## This section contains some configuration parameters for BibIndex
## module. Please note that BibIndex is mostly configured on run-time
## via its BibIndex Admin web interface. The parameters below are the
## ones that you probably do not want to modify very often during the
## runtime.

## CFG_BIBINDEX_FULLTEXT_INDEX_LOCAL_FILES_ONLY -- when fulltext indexing, do
## you want to index locally stored files only, or also external URLs?
## Use "0" to say "no" and "1" to say "yes".
CFG_BIBINDEX_FULLTEXT_INDEX_LOCAL_FILES_ONLY = 1

## CFG_BIBINDEX_REMOVE_STOPWORDS -- when indexing, do we want to remove
## stopwords? Use "0" to say "no" and "1" to say "yes".
CFG_BIBINDEX_REMOVE_STOPWORDS = 0

## CFG_BIBINDEX_CHARS_ALPHANUMERIC_SEPARATORS -- characters considered as
## alphanumeric separators of word-blocks inside words. You probably
## don't want to change this.
CFG_BIBINDEX_CHARS_ALPHANUMERIC_SEPARATORS = \!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~

## CFG_BIBINDEX_CHARS_PUNCTUATION -- characters considered as punctuation
## between word-blocks inside words. You probably don't want to
## change this.
CFG_BIBINDEX_CHARS_PUNCTUATION = \.\,\:\;\?\!\"

## CFG_BIBINDEX_REMOVE_HTML_MARKUP -- should we attempt to remove HTML markup
## before indexing? Use 1 if you have HTML markup inside metadata
## (e.g. in abstracts), use 0 otherwise.
CFG_BIBINDEX_REMOVE_HTML_MARKUP = 0

## CFG_BIBINDEX_REMOVE_LATEX_MARKUP -- should we attempt to remove LaTeX markup
## before indexing? Use 1 if you have LaTeX markup inside metadata
## (e.g. in abstracts), use 0 otherwise.
CFG_BIBINDEX_REMOVE_LATEX_MARKUP = 0

## CFG_BIBINDEX_MIN_WORD_LENGTH -- minimum word length allowed to be added to
## index. The terms shorter than this length will be discarded.
## Useful to keep the database clean, however you can safely leave
## this value on 0 for up to 1,000,000 documents.
CFG_BIBINDEX_MIN_WORD_LENGTH = 0

## CFG_BIBINDEX_URLOPENER_USERNAME and CFG_BIBINDEX_URLOPENER_PASSWORD --
## access credentials to access restricted URLs, interesting only if
## you are fulltext-indexing files located on a remote server that is
## only available via username/password. But it's probably better to
## handle this case via IP or some convention; the current scheme is
## mostly there for demo only.
CFG_BIBINDEX_URLOPENER_USERNAME = mysuperuser
CFG_BIBINDEX_URLOPENER_PASSWORD = mysuperpass

## CFG_INTBITSET_ENABLE_SANITY_CHECKS --
## Enable sanity checks for integers passed to the intbitset data
## structures. It is good to enable this during debugging
## and to disable this value for speed improvements.
CFG_INTBITSET_ENABLE_SANITY_CHECKS = False

## CFG_BIBINDEX_PERFORM_OCR_ON_DOCNAMES -- regular expression that matches
## docnames for which OCR is desired (set this to .* in order to enable
## OCR in general, set this to empty in order to disable it.)
CFG_BIBINDEX_PERFORM_OCR_ON_DOCNAMES = scan-.*

## CFG_BIBINDEX_SPLASH_PAGES -- key-value mapping where the key corresponds
## to a regular expression that matches the URLs of the splash pages of
## a given service and the value is a regular expression of the set of URLs
## referenced via tags in the HTML content of the splash pages that are
## referring to documents that need to be indexed.
## NOTE: for backward compatibility reasons you can set this to a simple
## regular expression that will directly be used as the unique key of the
## map, with corresponding value set to ".*" (in order to match any URL)
CFG_BIBINDEX_SPLASH_PAGES = {
    "http://documents\.cern\.ch/setlink\?.*": ".*",
    "http://ilcagenda\.linearcollider\.org/subContributionDisplay\.py\?.*|http://ilcagenda\.linearcollider\.org/contributionDisplay\.py\?.*": "http://ilcagenda\.linearcollider\.org/getFile\.py/access\?.*|http://ilcagenda\.linearcollider\.org/materialDisplay\.py\?.*",
    }

## CFG_BIBINDEX_AUTHOR_WORD_INDEX_EXCLUDE_FIRST_NAMES -- do we want
## the author word index to exclude first names to keep only last
## names? If set to True, then for the author `Bernard, Denis', only
## `Bernard' will be indexed in the word index, not `Denis'. Note
## that if you change this variable, you have to re-index the author
## index via `bibindex -w author -R'.
CFG_BIBINDEX_AUTHOR_WORD_INDEX_EXCLUDE_FIRST_NAMES = False

## CFG_BIBINDEX_SYNONYM_KBRS -- defines which knowledge bases are to
## be used for which index in order to provide index-time synonym
## lookup, and what massaging function should be used upon search
## pattern before performing the KB lookup. (Can be one of `exact',
## `leading_to_comma', `leading_to_number'.)
CFG_BIBINDEX_SYNONYM_KBRS = {
    'global': ['INDEX-SYNONYM-TITLE', 'exact'],
    'title': ['INDEX-SYNONYM-TITLE', 'exact'],
    }

#######################################
## Part 7: Access control parameters ##
#######################################

## This section contains some configuration parameters for the access
## control system. Please note that WebAccess is mostly configured on
## run-time via its WebAccess Admin web interface. The parameters
## below are the ones that you do not probably want to modify very
## often during the runtime.
## (If you do want to modify them during
## runtime, for example to deny access temporarily because of backups,
## you can edit access_control_config.py directly, no need to get back
## here and no need to redo the make process.)

## CFG_ACCESS_CONTROL_LEVEL_SITE -- defines how open this site is.
## Use 0 for normal operation of the site, 1 for read-only site (all
## write operations temporarily closed), 2 for site fully closed,
## 3 for also disabling any database connection.
## Useful for site maintenance.
CFG_ACCESS_CONTROL_LEVEL_SITE = 0

## CFG_ACCESS_CONTROL_LEVEL_GUESTS -- guest users access policy. Use
## 0 to allow guest users, 1 not to allow them (all users must login).
CFG_ACCESS_CONTROL_LEVEL_GUESTS = 0

## CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS -- account registration and
## activation policy. When 0, users can register and accounts are
## automatically activated. When 1, users can register but admin must
## activate the accounts. When 2, users cannot register nor update
## their email address, only admin can register accounts. When 3,
## users cannot register nor update email address nor password, only
## admin can register accounts. When 4, the same as 3 applies, plus
## users cannot change their login method. When 5, then the same as 4
## applies, plus info about how to get an account is hidden from the
## login page.
CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS = 0

## CFG_ACCESS_CONTROL_LIMIT_REGISTRATION_TO_DOMAIN -- limit account
## registration to certain email addresses? If wanted, give domain
## name below, e.g. "cern.ch". If not wanted, leave it empty.
CFG_ACCESS_CONTROL_LIMIT_REGISTRATION_TO_DOMAIN =

## CFG_ACCESS_CONTROL_NOTIFY_ADMIN_ABOUT_NEW_ACCOUNTS -- send a
## notification email to the administrator when a new account is
## created? Use 0 for no, 1 for yes.
CFG_ACCESS_CONTROL_NOTIFY_ADMIN_ABOUT_NEW_ACCOUNTS = 0

## CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_NEW_ACCOUNT -- send a
## notification email to the user when a new account is created in order
## to verify the validity of the provided email address? Use
## 0 for no, 1 for yes.
CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_NEW_ACCOUNT = 1

## CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_ACTIVATION -- send a
## notification email to the user when a new account is activated?
## Use 0 for no, 1 for yes.
CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_ACTIVATION = 0

## CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_DELETION -- send a
## notification email to the user when an account is deleted or an
## account request is rejected? Use 0 for no, 1 for yes.
CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_DELETION = 0

## CFG_APACHE_PASSWORD_FILE -- the file where Apache user credentials
## are stored. Must be an absolute pathname. If the value does not
## start with a slash, it is considered to be the filename of a file
## located under prefix/var/tmp directory. This is useful for the
## demo site testing purposes. For the production site, if you plan
## to restrict access to some collections based on the Apache user
## authentication mechanism, you should put here an absolute path to
## your Apache password file.
CFG_APACHE_PASSWORD_FILE = demo-site-apache-user-passwords

## CFG_APACHE_GROUP_FILE -- the file where Apache user groups are
## defined. See the documentation of the preceding config variable.
CFG_APACHE_GROUP_FILE = demo-site-apache-user-groups

###################################
## Part 8: WebSession parameters ##
###################################

## This section contains some configuration parameters for tweaking
## session handling.

## CFG_WEBSESSION_EXPIRY_LIMIT_DEFAULT -- number of days after which a session
## and the corresponding cookie is considered expired.
CFG_WEBSESSION_EXPIRY_LIMIT_DEFAULT = 2

## CFG_WEBSESSION_EXPIRY_LIMIT_REMEMBER -- number of days after which a session
## and the corresponding cookie is considered expired, when the user has
## requested to permanently stay logged in.
CFG_WEBSESSION_EXPIRY_LIMIT_REMEMBER = 365

## CFG_WEBSESSION_RESET_PASSWORD_EXPIRE_IN_DAYS -- when user requested
## a password reset, for how many days is the URL valid?
CFG_WEBSESSION_RESET_PASSWORD_EXPIRE_IN_DAYS = 3

## CFG_WEBSESSION_ADDRESS_ACTIVATION_EXPIRE_IN_DAYS -- when an account
## activation email was sent, for how many days is the URL valid?
CFG_WEBSESSION_ADDRESS_ACTIVATION_EXPIRE_IN_DAYS = 3

## CFG_WEBSESSION_NOT_CONFIRMED_EMAIL_ADDRESS_EXPIRE_IN_DAYS -- when
## a user does not confirm his email address and does not complete
## the registration, after how many days will it expire?
CFG_WEBSESSION_NOT_CONFIRMED_EMAIL_ADDRESS_EXPIRE_IN_DAYS = 10

## CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS -- when set to 1, the session
## system allocates the same uid=0 to all guest users regardless of where they
## come from. 0 allocates a unique uid to each guest.
CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS = 0

## CFG_WEBSESSION_IPADDR_CHECK_SKIP_BITS -- to prevent session cookie
## stealing, Invenio checks that the IP address of a connection is the
## same as that of the connection which created the initial session.
## This variable lets you decide how many bits should be skipped during
## this check. Set this to 0 in order to enable full IP address
## checking. Set this to 32 in order to disable IP address checking.
## Intermediate values (say 8) let you have some degree of security so
## that you can trust your local network only while helping to solve
## issues related to outside clients that configured their browser to
## use a web proxy for HTTP connection but not for HTTPS, thus
## potentially having two different IP addresses.
## In general, if you use
## HTTPS in order to serve authenticated content, you can safely set
## CFG_WEBSESSION_IPADDR_CHECK_SKIP_BITS to 32.
CFG_WEBSESSION_IPADDR_CHECK_SKIP_BITS = 0

################################
## Part 9: BibRank parameters ##
################################

## This section contains some configuration parameters for the ranking
## system.

## CFG_BIBRANK_SHOW_READING_STATS -- do we want to show reading
## similarity stats? ('People who viewed this page also viewed')
CFG_BIBRANK_SHOW_READING_STATS = 1

## CFG_BIBRANK_SHOW_DOWNLOAD_STATS -- do we want to show the download
## similarity stats? ('People who downloaded this document also
## downloaded')
CFG_BIBRANK_SHOW_DOWNLOAD_STATS = 1

## CFG_BIBRANK_SHOW_DOWNLOAD_GRAPHS -- do we want to show download
## history graph? (0=no | 1=classic/gnuplot | 2=flot)
CFG_BIBRANK_SHOW_DOWNLOAD_GRAPHS = 1

## CFG_BIBRANK_SHOW_DOWNLOAD_GRAPHS_CLIENT_IP_DISTRIBUTION -- do we
## want to show a graph representing the distribution of client IPs
## downloading given document? (0=no | 1=classic/gnuplot | 2=flot)
CFG_BIBRANK_SHOW_DOWNLOAD_GRAPHS_CLIENT_IP_DISTRIBUTION = 0

## CFG_BIBRANK_SHOW_CITATION_LINKS -- do we want to show the 'Cited
## by' links? (useful only when you have citations in the metadata)
CFG_BIBRANK_SHOW_CITATION_LINKS = 1

## CFG_BIBRANK_SHOW_CITATION_STATS -- do we want to show citation
## stats? ('Cited by M records', 'Co-cited with N records')
CFG_BIBRANK_SHOW_CITATION_STATS = 1

## CFG_BIBRANK_SHOW_CITATION_GRAPHS -- do we want to show citation
## history graph? (0=no | 1=classic/gnuplot | 2=flot)
CFG_BIBRANK_SHOW_CITATION_GRAPHS = 1

####################################
## Part 10: WebComment parameters ##
####################################

## This section contains some configuration parameters for the
## commenting and reviewing facilities.

## CFG_WEBCOMMENT_ALLOW_COMMENTS -- do we want to allow users to write
## public comments on records?
CFG_WEBCOMMENT_ALLOW_COMMENTS = 1

## CFG_WEBCOMMENT_ALLOW_REVIEWS -- do we want to allow users to write
## public reviews of records?
CFG_WEBCOMMENT_ALLOW_REVIEWS = 1

## CFG_WEBCOMMENT_ALLOW_SHORT_REVIEWS -- do we want to allow short
## reviews, that is just the attribution of stars without submitting
## detailed review text?
CFG_WEBCOMMENT_ALLOW_SHORT_REVIEWS = 0

## CFG_WEBCOMMENT_NB_REPORTS_BEFORE_SEND_EMAIL_TO_ADMIN -- if users
## report a comment to be abusive, how many reports must there be
## before the site admin is alerted?
CFG_WEBCOMMENT_NB_REPORTS_BEFORE_SEND_EMAIL_TO_ADMIN = 5

## CFG_WEBCOMMENT_NB_COMMENTS_IN_DETAILED_VIEW -- how many comments do
## we display in the detailed record page upon welcome?
CFG_WEBCOMMENT_NB_COMMENTS_IN_DETAILED_VIEW = 1

## CFG_WEBCOMMENT_NB_REVIEWS_IN_DETAILED_VIEW -- how many reviews do
## we display in the detailed record page upon welcome?
CFG_WEBCOMMENT_NB_REVIEWS_IN_DETAILED_VIEW = 1

## CFG_WEBCOMMENT_ADMIN_NOTIFICATION_LEVEL -- do we notify the site
## admin after every comment?
CFG_WEBCOMMENT_ADMIN_NOTIFICATION_LEVEL = 1

## CFG_WEBCOMMENT_TIMELIMIT_PROCESSING_COMMENTS_IN_SECONDS -- how many
## elapsed seconds do we consider enough when checking for possible
## multiple comment submissions by a user?
CFG_WEBCOMMENT_TIMELIMIT_PROCESSING_COMMENTS_IN_SECONDS = 20

## CFG_WEBCOMMENT_TIMELIMIT_PROCESSING_REVIEWS_IN_SECONDS -- how many
## elapsed seconds do we consider enough when checking for possible
## multiple review submissions by a user?
CFG_WEBCOMMENT_TIMELIMIT_PROCESSING_REVIEWS_IN_SECONDS = 20

## CFG_WEBCOMMENT_USE_RICH_TEXT_EDITOR -- enable the WYSIWYG
## Javascript-based editor when user edits comments?
CFG_WEBCOMMENT_USE_RICH_TEXT_EDITOR = False

## CFG_WEBCOMMENT_ALERT_ENGINE_EMAIL -- the email address from which the
## alert emails will appear to be sent:
CFG_WEBCOMMENT_ALERT_ENGINE_EMAIL = info@invenio-software.org

## CFG_WEBCOMMENT_DEFAULT_MODERATOR -- if no rules are
## specified to indicate who is the comment moderator of
## a collection, this person will be used as default
CFG_WEBCOMMENT_DEFAULT_MODERATOR = info@invenio-software.org

## CFG_WEBCOMMENT_USE_MATHJAX_IN_COMMENTS -- do we want to allow the use
## of MathJax plugin to render LaTeX input in comments?
CFG_WEBCOMMENT_USE_MATHJAX_IN_COMMENTS = 1

## CFG_WEBCOMMENT_AUTHOR_DELETE_COMMENT_OPTION -- allow comment authors to
## delete their own comments?
CFG_WEBCOMMENT_AUTHOR_DELETE_COMMENT_OPTION = 1

# CFG_WEBCOMMENT_EMAIL_REPLIES_TO -- which fields of the record define
# email addresses that should be notified of newly submitted comments,
# and for which collection. Use collection names as keys, and list of
# tags as values
CFG_WEBCOMMENT_EMAIL_REPLIES_TO = {
    'Articles': ['506__d', '506__m'],
    }

# CFG_WEBCOMMENT_RESTRICTION_DATAFIELD -- which field of the record
# defines the restriction (must be linked to WebAccess
# 'viewrestrcomment') to apply to newly submitted comments, and for
# which collection. Use collection names as keys, and one tag as value
CFG_WEBCOMMENT_RESTRICTION_DATAFIELD = {
    'Articles': '5061_a',
    'Pictures': '5061_a',
    'Theses': '5061_a',
    }

# CFG_WEBCOMMENT_ROUND_DATAFIELD -- which field of the record defines
# the current round of comment for which collection. Use collection
# name as key, and one tag as value
CFG_WEBCOMMENT_ROUND_DATAFIELD = {
    'Articles': '562__c',
    'Pictures': '562__c',
    }

# CFG_WEBCOMMENT_MAX_ATTACHMENT_SIZE -- max file size per attached
# file, in bytes. Choose 0 if you don't want to limit the size
CFG_WEBCOMMENT_MAX_ATTACHMENT_SIZE = 5242880

# CFG_WEBCOMMENT_MAX_ATTACHED_FILES -- maximum number of files that can
# be attached per comment.
# Choose 0 if you don't want to limit the
# number of files. File uploads can be restricted with action
# "attachcommentfile".
CFG_WEBCOMMENT_MAX_ATTACHED_FILES = 5

# CFG_WEBCOMMENT_MAX_COMMENT_THREAD_DEPTH -- how many levels of
# indentation discussions can have. This can be used to ensure that
# discussions will not go into deep levels of nesting if users don't
# understand the difference between "reply to comment" and "add
# comment". When the depth is reached, any "reply to comment" is
# conceptually converted to a "reply to thread" (i.e. reply to this
# parent's comment). Use -1 for no limit, 0 for unthreaded (flat)
# discussions.
CFG_WEBCOMMENT_MAX_COMMENT_THREAD_DEPTH = 1

##################################
## Part 11: BibSched parameters ##
##################################

## This section contains some configuration parameters for the
## bibliographic task scheduler.

## CFG_BIBSCHED_REFRESHTIME -- how often do we want to refresh
## bibsched monitor? (in seconds)
CFG_BIBSCHED_REFRESHTIME = 5

## CFG_BIBSCHED_LOG_PAGER -- what pager to use to view bibsched task
## logs?
CFG_BIBSCHED_LOG_PAGER = /usr/bin/less

## CFG_BIBSCHED_EDITOR -- what editor to use to edit the marcxml
## code of the locked records
CFG_BIBSCHED_EDITOR = /usr/bin/vim

## CFG_BIBSCHED_GC_TASKS_OLDER_THAN -- after how many days to perform the
## garbage collection of BibSched queue (i.e. removing/moving task to archive).
CFG_BIBSCHED_GC_TASKS_OLDER_THAN = 30

## CFG_BIBSCHED_GC_TASKS_TO_REMOVE -- list of BibTasks that can be safely
## removed from the BibSched queue once they are DONE.
CFG_BIBSCHED_GC_TASKS_TO_REMOVE = bibindex,bibreformat,webcoll,bibrank,inveniogc

## CFG_BIBSCHED_GC_TASKS_TO_ARCHIVE -- list of BibTasks that should be safely
## archived out of the BibSched queue once they are DONE.
CFG_BIBSCHED_GC_TASKS_TO_ARCHIVE = bibupload,oaiarchive

## CFG_BIBSCHED_MAX_NUMBER_CONCURRENT_TASKS -- maximum number of BibTasks
## that can run concurrently.
## NOTE: concurrent tasks are still considered as an experimental
## feature. Please keep this value set to 1 on production environments.
CFG_BIBSCHED_MAX_NUMBER_CONCURRENT_TASKS = 1

## CFG_BIBSCHED_PROCESS_USER -- bibsched and bibtask processes must
## usually run under the same identity as the Apache web server
## process in order to share proper file read/write privileges. If
## you want to force some other bibsched/bibtask user, e.g. because
## you are using a local `invenio' user that belongs to your
## `www-data' Apache user group and so shares writing rights with your
## Apache web server process in this way, then please set its username
## identity here. Otherwise we shall check whether your
## bibsched/bibtask processes are run under the same identity as your
## Apache web server process (in which case you can leave the default
## empty value here).
CFG_BIBSCHED_PROCESS_USER =

## CFG_BIBSCHED_NODE_TASKS -- specific nodes may be configured to
## run only specific tasks; if you want this, then this variable is a
## dictionary of the form {'hostname1': ['task1', 'task2']}. The
## default is that any node can run any task.
CFG_BIBSCHED_NODE_TASKS = {}

## CFG_BIBSCHED_MAX_ARCHIVED_ROWS_DISPLAY -- number of tasks displayed
##
CFG_BIBSCHED_MAX_ARCHIVED_ROWS_DISPLAY = 500

###################################
## Part 12: WebBasket parameters ##
###################################

## CFG_WEBBASKET_MAX_NUMBER_OF_DISPLAYED_BASKETS -- a safety limit for
## a maximum number of displayed baskets
CFG_WEBBASKET_MAX_NUMBER_OF_DISPLAYED_BASKETS = 20

## CFG_WEBBASKET_USE_RICH_TEXT_EDITOR -- enable the WYSIWYG
## Javascript-based editor when user edits comments in WebBasket?
CFG_WEBBASKET_USE_RICH_TEXT_EDITOR = False

##################################
## Part 13: WebAlert parameters ##
##################################

## This section contains some configuration parameters for the
## automatic email notification alert system.

## CFG_WEBALERT_ALERT_ENGINE_EMAIL -- the email address from which the
## alert emails will appear to be sent:
CFG_WEBALERT_ALERT_ENGINE_EMAIL = info@invenio-software.org

## CFG_WEBALERT_MAX_NUM_OF_RECORDS_IN_ALERT_EMAIL -- how many records
## at most do we send in an outgoing alert email?
CFG_WEBALERT_MAX_NUM_OF_RECORDS_IN_ALERT_EMAIL = 20

## CFG_WEBALERT_MAX_NUM_OF_CHARS_PER_LINE_IN_ALERT_EMAIL -- number of
## chars per line in an outgoing alert email?
CFG_WEBALERT_MAX_NUM_OF_CHARS_PER_LINE_IN_ALERT_EMAIL = 72

## CFG_WEBALERT_SEND_EMAIL_NUMBER_OF_TRIES -- when sending alert
## emails fails, how many times do we retry?
CFG_WEBALERT_SEND_EMAIL_NUMBER_OF_TRIES = 3

## CFG_WEBALERT_SEND_EMAIL_SLEEPTIME_BETWEEN_TRIES -- when sending
## alert emails fails, what is the sleeptime between tries? (in
## seconds)
CFG_WEBALERT_SEND_EMAIL_SLEEPTIME_BETWEEN_TRIES = 300

####################################
## Part 14: WebMessage parameters ##
####################################

## CFG_WEBMESSAGE_MAX_SIZE_OF_MESSAGE -- how large web messages do we
## allow?
CFG_WEBMESSAGE_MAX_SIZE_OF_MESSAGE = 20000

## CFG_WEBMESSAGE_MAX_NB_OF_MESSAGES -- how many messages for a
## regular user do we allow in its inbox?
CFG_WEBMESSAGE_MAX_NB_OF_MESSAGES = 30

## CFG_WEBMESSAGE_DAYS_BEFORE_DELETE_ORPHANS -- how many days before
## we delete orphaned messages?
CFG_WEBMESSAGE_DAYS_BEFORE_DELETE_ORPHANS = 60

##################################
## Part 15: MiscUtil parameters ##
##################################

## CFG_MISCUTIL_SQL_USE_SQLALCHEMY -- whether to use SQLAlchemy.pool
## in the DB engine of Invenio. It is okay to enable this flag
## even if you have not installed SQLAlchemy. Note that Invenio will
## lose some performance if this option is enabled.
CFG_MISCUTIL_SQL_USE_SQLALCHEMY = False

## CFG_MISCUTIL_SQL_RUN_SQL_MANY_LIMIT -- how many queries can we run
## inside run_sql_many() in one SQL statement? The limit value
## depends on MySQL's max_allowed_packet configuration.
CFG_MISCUTIL_SQL_RUN_SQL_MANY_LIMIT = 10000

## CFG_MISCUTIL_SMTP_HOST -- which server to use as outgoing mail server to
## send outgoing emails generated by the system, for example concerning
## submissions or email notification alerts.
CFG_MISCUTIL_SMTP_HOST = localhost

## CFG_MISCUTIL_SMTP_PORT -- which port to use on the outgoing mail server
## defined in the previous step.
CFG_MISCUTIL_SMTP_PORT = 25

## CFG_MISCUTIL_DEFAULT_PROCESS_TIMEOUT -- the default number of seconds after
## which a process launched through shellutils.run_process_with_timeout will
## be killed. This is useful to catch runaway processes.
CFG_MISCUTIL_DEFAULT_PROCESS_TIMEOUT = 300

## CFG_MATHJAX_HOSTING -- if you plan to use MathJax to display TeX
## formulas on HTML web pages, you can specify whether you wish to use
## 'local' hosting or 'cdn' hosting of MathJax libraries. (If set to
## 'local', you have to run 'make install-mathjax-plugin' as described
## in the INSTALL guide.) If set to 'local', users will use your site
## to download MathJax sources. If set to 'cdn', users will use
## centralized MathJax CDN servers instead. Please note that using
## CDN is suitable only for small institutions or for MathJax
## sponsors; see the MathJax website for more details. (Also, please
## note that if you plan to use MathJax on your site, you have to
## adapt CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS and
## CFG_WEBCOMMENT_USE_MATHJAX_IN_COMMENTS configuration variables
## elsewhere in this file.)
CFG_MATHJAX_HOSTING = local

#################################
## Part 16: BibEdit parameters ##
#################################

## CFG_BIBEDIT_TIMEOUT -- when a user edits a record, this record is
## locked to prevent other users from editing it at the same time.
## How many seconds of inactivity before the locked record will be free again
## for other people to edit?
CFG_BIBEDIT_TIMEOUT = 3600

## CFG_BIBEDIT_LOCKLEVEL -- when a user tries to edit a record for which
## there is a pending bibupload task in the queue, this shouldn't be permitted.
## The lock level determines how thoroughly the queue should be investigated
## to determine if this is the case.
## Level 0 - always permits editing, doesn't look at the queue
##           (unsafe, use only if you know what you are doing)
## Level 1 - permits editing if there are no queued bibedit tasks for this record
##           (safe with respect to bibedit, but not for other bibupload maintenance jobs)
## Level 2 - permits editing if there are no queued bibupload tasks of any sort
##           (safe, but may lock more than necessary if many cataloguers around)
## Level 3 - permits editing if no queued bibupload task concerns given record
##           (safe, most precise locking, but slow,
##           checks for 001/EXTERNAL_SYSNO_TAG/EXTERNAL_OAIID_TAG)
## The recommended level is 3 (default) or 2 (if you use maintenance jobs often).
CFG_BIBEDIT_LOCKLEVEL = 3

## CFG_BIBEDIT_PROTECTED_FIELDS -- a comma-separated list of fields that BibEdit
## will not allow to be added, edited or deleted. Wildcards are not supported,
## but conceptually a wildcard is added at the end of every field specification.
## Examples:
##   500A   - protect all MARC fields with tag 500 and first indicator A
##   5      - protect all MARC fields in the 500-series.
##   909C_a - protect subfield a in tag 909 with first indicator C and empty
##            second indicator
## Note that 001 is protected by default, but if protection of other
## identifiers or automated fields is a requirement, they should be added to
## this list.
CFG_BIBEDIT_PROTECTED_FIELDS =

## CFG_BIBEDIT_QUEUE_CHECK_METHOD -- how do we want to check for
## possible queue locking situations to prevent cataloguers from
## editing a record that may be waiting in the queue?
## Use 'bibrecord'
## for exact checking (always works, but may be slow), use 'regexp'
## for regular expression based checking (very fast, but may be
## inaccurate). When unsure, use 'bibrecord'.
CFG_BIBEDIT_QUEUE_CHECK_METHOD = bibrecord

## CFG_BIBEDIT_EXTEND_RECORD_WITH_COLLECTION_TEMPLATE -- a dictionary
## containing which collections will be extended with a given template
## while being displayed in BibEdit UI.
CFG_BIBEDIT_EXTEND_RECORD_WITH_COLLECTION_TEMPLATE = { 'Poetry' : 'poem'}

## CFG_BIBEDIT_KB_SUBJECTS - Name of the KB used in the field 65017a
## to automatically convert codes into extended version, e.g.
## a - Astrophysics
CFG_BIBEDIT_KB_SUBJECTS = Subjects

## CFG_BIBEDIT_KB_INSTITUTIONS - Name of the KB used for institution
## autocomplete. To be applied in fields defined in
## CFG_BIBEDIT_AUTOCOMPLETE_INSTITUTIONS_FIELDS
CFG_BIBEDIT_KB_INSTITUTIONS = InstitutionsCollection

## CFG_BIBEDIT_AUTOCOMPLETE_INSTITUTIONS_FIELDS - list of fields to
## be autocompleted with the KB CFG_BIBEDIT_KB_INSTITUTIONS
CFG_BIBEDIT_AUTOCOMPLETE_INSTITUTIONS_FIELDS = 100__u,700__u,701__u,502__c

## CFG_BIBEDITMULTI_LIMIT_INSTANT_PROCESSING -- maximum number of records
## that can be modified instantly using the multi-record editor. Above
## this limit, modifications will only be executed in limited hours.
CFG_BIBEDITMULTI_LIMIT_INSTANT_PROCESSING = 2000

## CFG_BIBEDITMULTI_LIMIT_DELAYED_PROCESSING -- maximum number of records
## that can be sent for modification without having a superadmin role.
## If the number of records is between CFG_BIBEDITMULTI_LIMIT_INSTANT_PROCESSING
## and this number, the modifications will take place only in limited hours.
CFG_BIBEDITMULTI_LIMIT_DELAYED_PROCESSING = 20000

## CFG_BIBEDITMULTI_LIMIT_DELAYED_PROCESSING_TIME -- Allowed time to
## execute modifications on records, when the number exceeds
## CFG_BIBEDITMULTI_LIMIT_INSTANT_PROCESSING.
CFG_BIBEDITMULTI_LIMIT_DELAYED_PROCESSING_TIME = 22:00-05:00 ################################### ## Part 17: BibUpload parameters ## ################################### ## CFG_BIBUPLOAD_REFERENCE_TAG -- where do we store references? CFG_BIBUPLOAD_REFERENCE_TAG = 999 ## CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG -- where do we store external ## system numbers? Useful for matching when our records come from an ## external digital library system. CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG = 970__a ## CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG -- where do we store OAI ID tags ## of harvested records? Useful for matching when we harvest stuff ## via OAI that we do not want to reexport via Invenio OAI; so records ## may have only the source OAI ID stored in this tag (kind of like ## external system number too). CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG = 035__a ## CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG -- where do we store OAI SRC ## tags of harvested records? Useful for matching when we harvest stuff ## via OAI that we do not want to reexport via Invenio OAI; so records ## may have only the source OAI SRC stored in this tag (kind of like ## external system number too). Note that the field should be the same as that of ## CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG. CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG = 035__9 ## CFG_BIBUPLOAD_STRONG_TAGS -- a comma-separated list of tags that ## are strong enough to resist the replace mode. Useful for tags that ## might be created from an external non-metadata-like source, ## e.g. the information about the number of copies left. CFG_BIBUPLOAD_STRONG_TAGS = 964 ## CFG_BIBUPLOAD_CONTROLLED_PROVENANCE_TAGS -- a comma-separated list ## of tags that contain provenance information that should be checked ## in the bibupload correct mode via matching provenance codes. (Only ## field instances of the same provenance information would be acted ## upon.) Please specify the whole tag info up to subfield codes.
CFG_BIBUPLOAD_CONTROLLED_PROVENANCE_TAGS = 6531_9 ## CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS -- a comma-separated list of system ## paths from which it is allowed to take fulltexts that will be uploaded via ## FFT (CFG_TMPDIR is included by default). CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS = /tmp,/home ## CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS -- a list of pairs describing external ## URLs that can be accessed by Invenio and specific HTTP headers that will be ## used for each URL. ## The first item of each pair is a regular expression matching a set of URLs, ## the second item is a dictionary of headers as consumed by urllib2.Request. If a ## regular expression matching all URLs is placed at the end of the list, it ## means that Invenio will download all URLs. Otherwise Invenio will just ## download authorized URLs. ## CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS = [ ## ('http://myurl.com/.*', {'User-Agent': 'Me'}), ## ('http://yoururl.com/.*', {'User-Agent': 'You', 'Accept': 'text/plain'}), ## ('http://.*', {'User-Agent': 'Invenio'}), ## ] CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS = [ ('http(s)?://.*', {'User-Agent': 'Invenio'}), ] ## CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE -- do we want to serialize ## internal representation of records (Pythonic record structure) into ## the database? This can improve internal processing speed of some ## operations at the price of somewhat bigger disk space usage. ## If you change this value after some records have already been added ## to your installation, you may want to run: ## $ /opt/invenio/bin/inveniocfg --reset-recstruct-cache ## in order to either erase the cache thus freeing database space, ## or to fill the cache for all records that have not been cached yet. CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE = 1 ## CFG_BIBUPLOAD_DELETE_FORMATS -- which formats do we want bibupload ## to delete when a record is ingested? Enter comma-separated list of ## formats.
For example, 'hb,hd' will delete pre-formatted HTML brief ## and detailed formats from cache, so that search engine will ## generate them on-the-fly. Useful to always present latest data of ## records upon record display, until the periodical bibreformat job ## runs next and updates the cache. CFG_BIBUPLOAD_DELETE_FORMATS = hb ## CFG_BATCHUPLOADER_FILENAME_MATCHING_POLICY -- a comma-separated list ## indicating which fields match the file names of the documents to be ## uploaded. ## The matching will be done in the same order as the list provided. CFG_BATCHUPLOADER_FILENAME_MATCHING_POLICY = reportnumber,recid ## CFG_BATCHUPLOADER_DAEMON_DIR -- Directory where the batchuploader daemon ## will look for the subfolders metadata and document by default. ## If the path is relative, CFG_PREFIX will be joined as a prefix. CFG_BATCHUPLOADER_DAEMON_DIR = var/batchupload ## CFG_BATCHUPLOADER_WEB_ROBOT_AGENT -- Comma-separated list to specify the ## agents permitted when calling batch uploader web interface ## cdsweb.cern.ch/batchuploader/robotupload ## e.g. when using curl: curl xxx -A invenio_webupload CFG_BATCHUPLOADER_WEB_ROBOT_AGENT = invenio_webupload ## CFG_BATCHUPLOADER_WEB_ROBOT_RIGHTS -- Access list specifying, for each ## IP address, which collections are allowed when using the batch uploader robot ## interface. CFG_BATCHUPLOADER_WEB_ROBOT_RIGHTS = { '10.0.0.1': ['BOOK', 'REPORT'], # Example 1 '10.0.0.2': ['POETRY', 'PREPRINT'], # Example 2 } #################################### ## Part 18: BibCatalog parameters ## #################################### ## CFG_BIBCATALOG_SYSTEM -- set desired catalog system. For example, RT.
CFG_BIBCATALOG_SYSTEM = ## RT CONFIGURATION ## CFG_BIBCATALOG_SYSTEM_RT_CLI -- path to the RT CLI client CFG_BIBCATALOG_SYSTEM_RT_CLI = /usr/bin/rt ## CFG_BIBCATALOG_SYSTEM_RT_URL -- Base URL of the remote RT system CFG_BIBCATALOG_SYSTEM_RT_URL = http://localhost/rt3 ## CFG_BIBCATALOG_SYSTEM_RT_DEFAULT_USER -- Set the username for a default RT account ## on remote system, with limited privileges, in order to only create and modify own tickets. CFG_BIBCATALOG_SYSTEM_RT_DEFAULT_USER = ## CFG_BIBCATALOG_SYSTEM_RT_DEFAULT_PWD -- Set the password for the default RT account ## on remote system. CFG_BIBCATALOG_SYSTEM_RT_DEFAULT_PWD = #################################### ## Part 19: BibFormat parameters ## #################################### ## CFG_BIBFORMAT_HIDDEN_TAGS -- comma-separated list of MARC tags that ## are not shown to users not having cataloging authorizations. CFG_BIBFORMAT_HIDDEN_TAGS = 595 ## CFG_BIBFORMAT_HIDDEN_FILE_FORMATS -- comma-separated list of file formats ## that are not shown explicitly to users not having cataloging authorizations. ## e.g. pdf;pdfa,xml CFG_BIBFORMAT_HIDDEN_FILE_FORMATS = ## CFG_BIBFORMAT_ADDTHIS_ID -- if you want to use the AddThis service from ## , set this value to the pubid parameter as ## provided by the service (e.g. ra-4ff80aae118f4dad), and add a call to ## formatting element in your formats, for example ## Default_HTML_detailed.bft. CFG_BIBFORMAT_ADDTHIS_ID = ## CFG_BIBFORMAT_DISABLE_I18N_FOR_CACHED_FORMATS -- For each output ## format BibReformat currently creates a cache for only one language ## (CFG_SITE_LANG) per record. This means that visitors having set a ## different language than CFG_SITE_LANG will be served an on-the-fly ## output using the language of their choice. You can disable this ## behaviour by specifying below for which output format you would ## like to force the cache to be used whatever language is ## requested.
If your format templates do not provide ## internationalization, you can optimize your site by setting, ## e.g., hb,hd to always serve the precached output (if it exists) in ## the CFG_SITE_LANG. CFG_BIBFORMAT_DISABLE_I18N_FOR_CACHED_FORMATS = #################################### ## Part 20: BibMatch parameters ## #################################### ## CFG_BIBMATCH_LOCAL_SLEEPTIME -- Determines the number of seconds to sleep ## between search queries on LOCAL system. CFG_BIBMATCH_LOCAL_SLEEPTIME = 0.0 ## CFG_BIBMATCH_REMOTE_SLEEPTIME -- Determines the number of seconds to sleep ## between search queries on REMOTE systems. CFG_BIBMATCH_REMOTE_SLEEPTIME = 2.0 ## CFG_BIBMATCH_FUZZY_WORDLIMITS -- Determines the number of words to extract ## from a given field's value during fuzzy matching mode. Add/change field ## and appropriate number to the dictionary to configure this. CFG_BIBMATCH_FUZZY_WORDLIMITS = { '100__a': 2, '245__a': 4 } ## CFG_BIBMATCH_FUZZY_EMPTY_RESULT_LIMIT -- Determines the number of empty results ## to accept during fuzzy matching mode. CFG_BIBMATCH_FUZZY_EMPTY_RESULT_LIMIT = 1 ## CFG_BIBMATCH_QUERY_TEMPLATES -- Here you can set the various predefined querystrings ## used to standardize common matching queries. By default the following templates ## are given: ## title - standard title search. Taken from 245__a (default) ## title-author - title and author search (i.e. this is a title AND author a) ## Taken from 245__a and 100__a ## reportnumber - reportnumber search (i.e. reportnumber:REP-NO-123). CFG_BIBMATCH_QUERY_TEMPLATES = { 'title' : '[title]', 'title-author' : '[title] [author]', 'reportnumber' : 'reportnumber:[reportnumber]' } ## CFG_BIBMATCH_MATCH_VALIDATION_RULESETS -- Here you can define the various rulesets for ## validating search results done by BibMatch. Each ruleset contains a certain pattern mapped ## to a tuple defining a "matching-strategy".
## ## The rule-definitions must come in two parts: ## ## * The first part is a string containing a regular expression ## that is matched against the textmarc representation of each record. ## If a match is found, the final rule-set is updated with ## the given "sub rule-set", where identical tag rules are replaced. ## ## * The second part is a list of key->value mappings (dicts), each of which specifies ## strategy parameters with corresponding validation rules. ## ## Each strategy consists of five items: ## ## * MARC TAGS: ## These MARC tags represent the fields taken from the original record and from any records in the search ## results. When several MARC tags are specified with a given match-strategy, all the fields ## associated with these tags are matched together (e.g. with key "100__a,700__a", all 100__a ## and 700__a fields are matched together, which is useful when the first author can vary for ## certain records on different systems). ## ## * COMPARISON THRESHOLD: ## a value between 0.0 and 1.0 specifying the threshold for string matches ## to determine if it is a match or not (using normalized string-distance). ## Normally 0.8 (80% match) is considered to be a close match. ## ## * COMPARISON MODE: ## the comparison mode decides how the record datafields are compared: ## - 'strict' : all (sub-)fields are compared, and all must match. Order is significant. ## - 'normal' : all (sub-)fields are compared, and all must match. Order is ignored. ## - 'lazy' : all (sub-)fields are compared with each other and at least one must match ## - 'ignored': the tag is ignored in the match. Used to disable previously defined rules. ## ## * MATCHING MODE: ## the matching mode decides how the field values are matched: ## - 'title' : uses a method specialized for comparing titles, e.g. looking for subtitles ## - 'author' : uses a special authorname comparison. Will take initials into account.
## - 'identifier' : special matching for identifiers, stripping away punctuation ## - 'date': matches dates by extracting and comparing the year ## - 'normal': normal string comparison. ## Note: A field is considered matching when all its subfields or values match. ## ## * RESULT MODE: ## the result mode decides how the results from the comparisons are handled further: ## - 'normal' : a failed match will cause the validation to immediately exit as a failure. ## a successful match will cause the validation to continue on other rules (if any) ## - 'final' : a failed match will cause the validation to immediately exit as a failure. ## a successful match will cause validation to immediately exit as a success. ## - 'joker' : a failed match will cause the validation to continue on other rules (if any). ## a successful match will cause validation to immediately exit as a success. ## ## You can add your own rulesets in the list below. The 'default' ruleset is always applied, ## and should therefore NOT be removed, but can be changed. The tag-rules can also be overwritten ## by other rulesets. ## ## WARNING: Beware that the validation quality is only as good as the given rules, so matching results ## are never guaranteed to be accurate, as they are very content-specific.
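The three result modes described above can be read as a small state machine. The following is a minimal sketch of that control flow only, not BibMatch's actual implementation: the function name `validate` and the `(result_mode, matched)` input pairs are hypothetical, and the behaviour when no rule forces an early exit (succeed only if at least one rule matched) is an assumption that the text above does not state.

```python
def validate(outcomes):
    """Apply the documented early-exit behaviour of the result modes.

    `outcomes` is a hypothetical list of (result_mode, matched) pairs,
    one per rule, in the order the rules would be evaluated.
    """
    matched_any = False
    for result_mode, matched in outcomes:
        if matched:
            matched_any = True
            if result_mode in ('final', 'joker'):
                # A successful 'final' or 'joker' rule exits as a success.
                return True
        elif result_mode in ('normal', 'final'):
            # A failed 'normal' or 'final' rule exits as a failure.
            return False
        # A successful 'normal' rule or a failed 'joker' rule continues.
    # Assumption: with no early exit, succeed only if some rule matched.
    return matched_any


if __name__ == '__main__':
    print(validate([('normal', True), ('final', True), ('normal', False)]))
    print(validate([('joker', False), ('normal', True)]))
```

Under this reading, a 'joker' rule such as the 020__a identifier rule can confirm a match on its own but cannot reject one, whereas a 'final' rule decides the outcome either way.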
CFG_BIBMATCH_MATCH_VALIDATION_RULESETS = [('default', [{ 'tags' : '245__%,242__%', 'threshold' : 0.8, 'compare_mode' : 'lazy', 'match_mode' : 'title', 'result_mode' : 'normal' }, { 'tags' : '037__a,088__a', 'threshold' : 1.0, 'compare_mode' : 'lazy', 'match_mode' : 'identifier', 'result_mode' : 'final' }, { 'tags' : '100__a,700__a', 'threshold' : 0.8, 'compare_mode' : 'normal', 'match_mode' : 'author', 'result_mode' : 'normal' }, { 'tags' : '773__a', 'threshold' : 1.0, 'compare_mode' : 'lazy', 'match_mode' : 'title', 'result_mode' : 'normal' }]), ('980__ \$\$a(THESIS|Thesis)', [{ 'tags' : '100__a', 'threshold' : 0.8, 'compare_mode' : 'strict', 'match_mode' : 'author', 'result_mode' : 'normal' }, { 'tags' : '700__a,701__a', 'threshold' : 1.0, 'compare_mode' : 'lazy', 'match_mode' : 'author', 'result_mode' : 'normal' }, { 'tags' : '100__a,700__a', 'threshold' : 0.8, 'compare_mode' : 'ignored', 'match_mode' : 'author', 'result_mode' : 'normal' }]), ('260__', [{ 'tags' : '260__c', 'threshold' : 0.8, 'compare_mode' : 'lazy', 'match_mode' : 'date', 'result_mode' : 'normal' }]), ('0247_', [{ 'tags' : '0247_a', 'threshold' : 1.0, 'compare_mode' : 'lazy', 'match_mode' : 'identifier', 'result_mode' : 'final' }]), ('020__', [{ 'tags' : '020__a', 'threshold' : 1.0, 'compare_mode' : 'lazy', 'match_mode' : 'identifier', 'result_mode' : 'joker' }]) ] ## CFG_BIBMATCH_FUZZY_MATCH_VALIDATION_LIMIT -- Determines the minimum percentage of the ## amount of rules to be positively matched when comparing two records. Should the number ## of matches be lower than required matches but equal to or above this limit, ## the match will be considered fuzzy. CFG_BIBMATCH_FUZZY_MATCH_VALIDATION_LIMIT = 0.65 ## CFG_BIBMATCH_SEARCH_RESULT_MATCH_LIMIT -- Determines the maximum amount of search results ## a single search can return before acting as a non-match. 
CFG_BIBMATCH_SEARCH_RESULT_MATCH_LIMIT = 15 ###################################### ## Part 21: BibAuthorID parameters ## ###################################### # CFG_BIBAUTHORID_MAX_PROCESSES is the max number of processes # that may be spawned by the disambiguation algorithm CFG_BIBAUTHORID_MAX_PROCESSES = 12 # CFG_BIBAUTHORID_PERSONID_SQL_MAX_THREADS is the max number of threads # to parallelize sql queries during personID tables updates CFG_BIBAUTHORID_PERSONID_SQL_MAX_THREADS = 12 # CFG_BIBAUTHORID_EXTERNAL_CLAIMED_RECORDS_KEY defines the user info # keys for externally claimed records in a remote-login scenario, # e.g. "external_arxivids" for arXiv SSO CFG_BIBAUTHORID_EXTERNAL_CLAIMED_RECORDS_KEY = # CFG_BIBAUTHORID_ENABLED # Globally enable AuthorID Interfaces. # If False: No guest, user or operator will have access to the system. CFG_BIBAUTHORID_ENABLED = True # CFG_BIBAUTHORID_ON_AUTHORPAGES # Enable AuthorID information on the author pages. CFG_BIBAUTHORID_ON_AUTHORPAGES = True # CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL defines the email address # all ticket requests concerning authors will be sent to. CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL = info@invenio-software.org # CFG_BIBAUTHORID_UI_SKIP_ARXIV_STUB_PAGE defines if the optional arXiv stub page is skipped CFG_BIBAUTHORID_UI_SKIP_ARXIV_STUB_PAGE = False ###################################### ## Part 22: BibClassify parameters ## ###################################### # CFG_BIBCLASSIFY_WEB_MAXKW -- maximum number of keywords to display # in the Keywords tab web page. CFG_BIBCLASSIFY_WEB_MAXKW = 100 ######################################## ## Part 23: Plotextractor parameters ## ######################################## ## CFG_PLOTEXTRACTOR_SOURCE_BASE_URL -- for acquiring source tarballs for plot ## extraction, where should we look?
If nothing is set, we'll just go ## to arXiv, but this can be a filesystem location, too CFG_PLOTEXTRACTOR_SOURCE_BASE_URL = http://arxiv.org/ ## CFG_PLOTEXTRACTOR_SOURCE_TARBALL_FOLDER -- for acquiring source tarballs for plot ## extraction, subfolder where the tarballs sit CFG_PLOTEXTRACTOR_SOURCE_TARBALL_FOLDER = e-print/ ## CFG_PLOTEXTRACTOR_SOURCE_PDF_FOLDER -- for acquiring source tarballs for plot ## extraction, subfolder where the PDFs sit CFG_PLOTEXTRACTOR_SOURCE_PDF_FOLDER = pdf/ ## CFG_PLOTEXTRACTOR_DOWNLOAD_TIMEOUT -- a float representing the number of seconds ## to wait between each download of pdf and/or tarball from source URL. CFG_PLOTEXTRACTOR_DOWNLOAD_TIMEOUT = 2.0 ## CFG_PLOTEXTRACTOR_CONTEXT_EXTRACT_LIMIT -- when extracting context of plots from ## TeX sources, this is the limit on the number of characters to extract in each ## direction. Default 750. CFG_PLOTEXTRACTOR_CONTEXT_EXTRACT_LIMIT = 750 ## CFG_PLOTEXTRACTOR_DISALLOWED_TEX -- when extracting context of plots from TeX ## sources, this is the list of TeX tags that will trigger 'end of context'. CFG_PLOTEXTRACTOR_DISALLOWED_TEX = begin,end,section,includegraphics,caption,acknowledgements ## CFG_PLOTEXTRACTOR_CONTEXT_WORD_LIMIT -- when extracting context of plots from ## TeX sources, this is the limit on the number of words in each direction. Default 75. CFG_PLOTEXTRACTOR_CONTEXT_WORD_LIMIT = 75 ## CFG_PLOTEXTRACTOR_CONTEXT_SENTENCE_LIMIT -- when extracting context of plots from ## TeX sources, this is the limit on the number of sentences in each direction. Default 2. CFG_PLOTEXTRACTOR_CONTEXT_SENTENCE_LIMIT = 2 ###################################### ## Part 24: WebStat parameters ## ###################################### # CFG_WEBSTAT_BIBCIRCULATION_START_YEAR defines the start date of the BibCirculation # statistics. Value should have the format 'yyyy'. If empty, take all existing data.
CFG_WEBSTAT_BIBCIRCULATION_START_YEAR = #################################### ## Part 25: BibSort parameters ## #################################### ## CFG_BIBSORT_BUCKETS -- the number of buckets bibsort should use. ## If 0, then no buckets will be used (bibsort will be inactive). ## If different from 0, bibsort will be used for sorting the records. ## The number of buckets should be set with regard to the size ## of the repository; having a larger number of buckets will increase ## the sorting performance for the top results but will decrease ## the performance for sorting the middle results. ## We recommend using 1 in case you have less than about ## 1,000,000 records. ## When modifying this variable, re-run rebalancing for all the bibsort ## methods, to keep the database in sync. CFG_BIBSORT_BUCKETS = 1 ########################## ## THAT's ALL, FOLKS! ## ########################## diff --git a/modules/miscutil/sql/tabbibclean.sql b/modules/miscutil/sql/tabbibclean.sql index 9e8ddf4bc..24297aa1a 100644 --- a/modules/miscutil/sql/tabbibclean.sql +++ b/modules/miscutil/sql/tabbibclean.sql @@ -1,334 +1,335 @@ TRUNCATE bibrec; TRUNCATE bib00x; TRUNCATE bib01x; TRUNCATE bib02x; TRUNCATE bib03x; TRUNCATE bib04x; TRUNCATE bib05x; TRUNCATE bib06x; TRUNCATE bib07x; TRUNCATE bib08x; TRUNCATE bib09x; TRUNCATE bib10x; TRUNCATE bib11x; TRUNCATE bib12x; TRUNCATE bib13x; TRUNCATE bib14x; TRUNCATE bib15x; TRUNCATE bib16x; TRUNCATE bib17x; TRUNCATE bib18x; TRUNCATE bib19x; TRUNCATE bib20x; TRUNCATE bib21x; TRUNCATE bib22x; TRUNCATE bib23x; TRUNCATE bib24x; TRUNCATE bib25x; TRUNCATE bib26x; TRUNCATE bib27x; TRUNCATE bib28x; TRUNCATE bib29x; TRUNCATE bib30x; TRUNCATE bib31x; TRUNCATE bib32x; TRUNCATE bib33x; TRUNCATE bib34x; TRUNCATE bib35x; TRUNCATE bib36x; TRUNCATE bib37x; TRUNCATE bib38x; TRUNCATE bib39x; TRUNCATE bib40x; TRUNCATE bib41x; TRUNCATE bib42x; TRUNCATE bib43x; TRUNCATE bib44x; TRUNCATE bib45x; TRUNCATE bib46x; TRUNCATE bib47x; TRUNCATE bib48x; TRUNCATE
bib49x; TRUNCATE bib50x; TRUNCATE bib51x; TRUNCATE bib52x; TRUNCATE bib53x; TRUNCATE bib54x; TRUNCATE bib55x; TRUNCATE bib56x; TRUNCATE bib57x; TRUNCATE bib58x; TRUNCATE bib59x; TRUNCATE bib60x; TRUNCATE bib61x; TRUNCATE bib62x; TRUNCATE bib63x; TRUNCATE bib64x; TRUNCATE bib65x; TRUNCATE bib66x; TRUNCATE bib67x; TRUNCATE bib68x; TRUNCATE bib69x; TRUNCATE bib70x; TRUNCATE bib71x; TRUNCATE bib72x; TRUNCATE bib73x; TRUNCATE bib74x; TRUNCATE bib75x; TRUNCATE bib76x; TRUNCATE bib77x; TRUNCATE bib78x; TRUNCATE bib79x; TRUNCATE bib80x; TRUNCATE bib81x; TRUNCATE bib82x; TRUNCATE bib83x; TRUNCATE bib84x; TRUNCATE bib85x; TRUNCATE bib86x; TRUNCATE bib87x; TRUNCATE bib88x; TRUNCATE bib89x; TRUNCATE bib90x; TRUNCATE bib91x; TRUNCATE bib92x; TRUNCATE bib93x; TRUNCATE bib94x; TRUNCATE bib95x; TRUNCATE bib96x; TRUNCATE bib97x; TRUNCATE bib98x; TRUNCATE bib99x; TRUNCATE bibrec_bib00x; TRUNCATE bibrec_bib01x; TRUNCATE bibrec_bib02x; TRUNCATE bibrec_bib03x; TRUNCATE bibrec_bib04x; TRUNCATE bibrec_bib05x; TRUNCATE bibrec_bib06x; TRUNCATE bibrec_bib07x; TRUNCATE bibrec_bib08x; TRUNCATE bibrec_bib09x; TRUNCATE bibrec_bib10x; TRUNCATE bibrec_bib11x; TRUNCATE bibrec_bib12x; TRUNCATE bibrec_bib13x; TRUNCATE bibrec_bib14x; TRUNCATE bibrec_bib15x; TRUNCATE bibrec_bib16x; TRUNCATE bibrec_bib17x; TRUNCATE bibrec_bib18x; TRUNCATE bibrec_bib19x; TRUNCATE bibrec_bib20x; TRUNCATE bibrec_bib21x; TRUNCATE bibrec_bib22x; TRUNCATE bibrec_bib23x; TRUNCATE bibrec_bib24x; TRUNCATE bibrec_bib25x; TRUNCATE bibrec_bib26x; TRUNCATE bibrec_bib27x; TRUNCATE bibrec_bib28x; TRUNCATE bibrec_bib29x; TRUNCATE bibrec_bib30x; TRUNCATE bibrec_bib31x; TRUNCATE bibrec_bib32x; TRUNCATE bibrec_bib33x; TRUNCATE bibrec_bib34x; TRUNCATE bibrec_bib35x; TRUNCATE bibrec_bib36x; TRUNCATE bibrec_bib37x; TRUNCATE bibrec_bib38x; TRUNCATE bibrec_bib39x; TRUNCATE bibrec_bib40x; TRUNCATE bibrec_bib41x; TRUNCATE bibrec_bib42x; TRUNCATE bibrec_bib43x; TRUNCATE bibrec_bib44x; TRUNCATE bibrec_bib45x; TRUNCATE bibrec_bib46x; TRUNCATE 
bibrec_bib47x; TRUNCATE bibrec_bib48x; TRUNCATE bibrec_bib49x; TRUNCATE bibrec_bib50x; TRUNCATE bibrec_bib51x; TRUNCATE bibrec_bib52x; TRUNCATE bibrec_bib53x; TRUNCATE bibrec_bib54x; TRUNCATE bibrec_bib55x; TRUNCATE bibrec_bib56x; TRUNCATE bibrec_bib57x; TRUNCATE bibrec_bib58x; TRUNCATE bibrec_bib59x; TRUNCATE bibrec_bib60x; TRUNCATE bibrec_bib61x; TRUNCATE bibrec_bib62x; TRUNCATE bibrec_bib63x; TRUNCATE bibrec_bib64x; TRUNCATE bibrec_bib65x; TRUNCATE bibrec_bib66x; TRUNCATE bibrec_bib67x; TRUNCATE bibrec_bib68x; TRUNCATE bibrec_bib69x; TRUNCATE bibrec_bib70x; TRUNCATE bibrec_bib71x; TRUNCATE bibrec_bib72x; TRUNCATE bibrec_bib73x; TRUNCATE bibrec_bib74x; TRUNCATE bibrec_bib75x; TRUNCATE bibrec_bib76x; TRUNCATE bibrec_bib77x; TRUNCATE bibrec_bib78x; TRUNCATE bibrec_bib79x; TRUNCATE bibrec_bib80x; TRUNCATE bibrec_bib81x; TRUNCATE bibrec_bib82x; TRUNCATE bibrec_bib83x; TRUNCATE bibrec_bib84x; TRUNCATE bibrec_bib85x; TRUNCATE bibrec_bib86x; TRUNCATE bibrec_bib87x; TRUNCATE bibrec_bib88x; TRUNCATE bibrec_bib89x; TRUNCATE bibrec_bib90x; TRUNCATE bibrec_bib91x; TRUNCATE bibrec_bib92x; TRUNCATE bibrec_bib93x; TRUNCATE bibrec_bib94x; TRUNCATE bibrec_bib95x; TRUNCATE bibrec_bib96x; TRUNCATE bibrec_bib97x; TRUNCATE bibrec_bib98x; TRUNCATE bibrec_bib99x; TRUNCATE bibfmt; TRUNCATE idxWORD01F; TRUNCATE idxWORD02F; TRUNCATE idxWORD03F; TRUNCATE idxWORD04F; TRUNCATE idxWORD05F; TRUNCATE idxWORD06F; TRUNCATE idxWORD07F; TRUNCATE idxWORD08F; TRUNCATE idxWORD09F; TRUNCATE idxWORD10F; TRUNCATE idxWORD11F; TRUNCATE idxWORD12F; TRUNCATE idxWORD13F; TRUNCATE idxWORD14F; TRUNCATE idxWORD15F; TRUNCATE idxWORD16F; TRUNCATE idxWORD17F; TRUNCATE idxWORD18F; TRUNCATE idxWORD01R; TRUNCATE idxWORD02R; TRUNCATE idxWORD03R; TRUNCATE idxWORD04R; TRUNCATE idxWORD05R; TRUNCATE idxWORD06R; TRUNCATE idxWORD07R; TRUNCATE idxWORD08R; TRUNCATE idxWORD09R; TRUNCATE idxWORD10R; TRUNCATE idxWORD11R; TRUNCATE idxWORD12R; TRUNCATE idxWORD13R; TRUNCATE idxWORD14R; TRUNCATE idxWORD15R; TRUNCATE idxWORD16R; 
TRUNCATE idxWORD17R; TRUNCATE idxWORD18R; TRUNCATE idxPAIR01F; TRUNCATE idxPAIR02F; TRUNCATE idxPAIR03F; TRUNCATE idxPAIR04F; TRUNCATE idxPAIR05F; TRUNCATE idxPAIR06F; TRUNCATE idxPAIR07F; TRUNCATE idxPAIR08F; TRUNCATE idxPAIR09F; TRUNCATE idxPAIR10F; TRUNCATE idxPAIR11F; TRUNCATE idxPAIR12F; TRUNCATE idxPAIR13F; TRUNCATE idxPAIR14F; TRUNCATE idxPAIR15F; TRUNCATE idxPAIR16F; TRUNCATE idxPAIR17F; TRUNCATE idxPAIR18F; TRUNCATE idxPAIR01R; TRUNCATE idxPAIR02R; TRUNCATE idxPAIR03R; TRUNCATE idxPAIR04R; TRUNCATE idxPAIR05R; TRUNCATE idxPAIR06R; TRUNCATE idxPAIR07R; TRUNCATE idxPAIR08R; TRUNCATE idxPAIR09R; TRUNCATE idxPAIR10R; TRUNCATE idxPAIR11R; TRUNCATE idxPAIR12R; TRUNCATE idxPAIR13R; TRUNCATE idxPAIR14R; TRUNCATE idxPAIR15R; TRUNCATE idxPAIR16R; TRUNCATE idxPAIR17R; TRUNCATE idxPAIR18R; TRUNCATE idxPHRASE01F; TRUNCATE idxPHRASE02F; TRUNCATE idxPHRASE03F; TRUNCATE idxPHRASE04F; TRUNCATE idxPHRASE05F; TRUNCATE idxPHRASE06F; TRUNCATE idxPHRASE07F; TRUNCATE idxPHRASE08F; TRUNCATE idxPHRASE09F; TRUNCATE idxPHRASE10F; TRUNCATE idxPHRASE11F; TRUNCATE idxPHRASE12F; TRUNCATE idxPHRASE13F; TRUNCATE idxPHRASE14F; TRUNCATE idxPHRASE15F; TRUNCATE idxPHRASE16F; TRUNCATE idxPHRASE17F; TRUNCATE idxPHRASE18F; TRUNCATE idxPHRASE01R; TRUNCATE idxPHRASE02R; TRUNCATE idxPHRASE03R; TRUNCATE idxPHRASE04R; TRUNCATE idxPHRASE05R; TRUNCATE idxPHRASE06R; TRUNCATE idxPHRASE07R; TRUNCATE idxPHRASE08R; TRUNCATE idxPHRASE09R; TRUNCATE idxPHRASE10R; TRUNCATE idxPHRASE11R; TRUNCATE idxPHRASE12R; TRUNCATE idxPHRASE13R; TRUNCATE idxPHRASE14R; TRUNCATE idxPHRASE15R; TRUNCATE idxPHRASE16R; TRUNCATE idxPHRASE17R; TRUNCATE idxPHRASE18R; TRUNCATE rnkMETHODDATA; TRUNCATE rnkCITATIONDATA; TRUNCATE rnkDOWNLOADS; TRUNCATE rnkPAGEVIEWS; TRUNCATE rnkWORD01F; TRUNCATE rnkWORD01R; TRUNCATE bibdoc; TRUNCATE bibrec_bibdoc; TRUNCATE bibdoc_bibdoc; +TRUNCATE bibdocfsinfo; TRUNCATE sbmAPPROVAL; TRUNCATE sbmSUBMISSIONS; TRUNCATE sbmPUBLICATION; TRUNCATE sbmPUBLICATIONCOMM; TRUNCATE sbmPUBLICATIONDATA; TRUNCATE 
hstRECORD; TRUNCATE hstDOCUMENT; TRUNCATE bibHOLDINGPEN; TRUNCATE hstEXCEPTION; TRUNCATE aidPERSONIDDATA; TRUNCATE aidRESULTS; TRUNCATE aidPROBCACHE; TRUNCATE aidCACHE; TRUNCATE aidPERSONIDPAPERS; TRUNCATE aidUSERINPUTLOG; diff --git a/modules/miscutil/sql/tabcreate.sql b/modules/miscutil/sql/tabcreate.sql index 17e5b27bd..24dab2db3 100644 --- a/modules/miscutil/sql/tabcreate.sql +++ b/modules/miscutil/sql/tabcreate.sql @@ -1,4074 +1,4094 @@ -- This file is part of Invenio. -- Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 CERN. -- -- Invenio is free software; you can redistribute it and/or -- modify it under the terms of the GNU General Public License as -- published by the Free Software Foundation; either version 2 of the -- License, or (at your option) any later version. -- -- Invenio is distributed in the hope that it will be useful, but -- WITHOUT ANY WARRANTY; without even the implied warranty of -- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -- General Public License for more details. -- -- You should have received a copy of the GNU General Public License -- along with Invenio; if not, write to the Free Software Foundation, Inc., -- 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
-- tables for bibliographic records: CREATE TABLE IF NOT EXISTS bibrec ( id mediumint(8) unsigned NOT NULL auto_increment, creation_date datetime NOT NULL default '0000-00-00', modification_date datetime NOT NULL default '0000-00-00', PRIMARY KEY (id), KEY creation_date (creation_date), KEY modification_date (modification_date) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib00x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib01x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib02x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib03x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib04x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib05x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib06x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib07x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), 
KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib08x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib09x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib10x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib11x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib12x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib13x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib14x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib15x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib16x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib17x ( id mediumint(8) unsigned NOT 
NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib18x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib19x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib20x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib21x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib22x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib23x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib24x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib25x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib26x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY 
kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib27x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib28x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib29x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib30x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib31x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib32x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib33x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib34x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib35x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib36x ( id mediumint(8) 
unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib37x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib38x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib39x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib40x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib41x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib42x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib43x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib44x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib45x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY 
KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib46x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib47x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib48x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib49x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib50x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib51x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib52x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib53x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib54x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib55x ( id 
mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib56x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib57x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib58x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib59x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib60x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib61x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib62x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib63x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib64x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT 
NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib65x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib66x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib67x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib68x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib69x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib70x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib71x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib72x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib73x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS 
bib74x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib75x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib76x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib77x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib78x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib79x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib80x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib81x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib82x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib83x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value 
text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib84x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib85x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(100)) /* URLs usually need a larger index for speedy lookups */ ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib86x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib87x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib88x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib89x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib90x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib91x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib92x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt 
(tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib93x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib94x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib95x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib96x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib97x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib98x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bib99x ( id mediumint(8) unsigned NOT NULL auto_increment, tag varchar(6) NOT NULL default '', value text NOT NULL, PRIMARY KEY (id), KEY kt (tag), KEY kv (value(35)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib00x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib01x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY 
id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib02x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib03x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib04x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib05x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib06x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib07x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib08x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib09x ( 
id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib10x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib11x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib12x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib13x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib14x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib15x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib16x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', 
field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib17x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib18x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib19x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib20x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib21x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib22x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib23x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) 
ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib24x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib25x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib26x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib27x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib28x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib29x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib30x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib31x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', 
id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib32x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib33x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib34x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib35x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib36x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib37x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib38x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx 
(id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib39x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib40x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib41x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib42x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib43x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib44x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib45x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib46x ( id_bibrec 
mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib47x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib48x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib49x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib50x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib51x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib52x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib53x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number 
smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib54x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib55x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib56x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib57x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib58x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib59x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib60x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE 
TABLE IF NOT EXISTS bibrec_bib61x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib62x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib63x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib64x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib65x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib66x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib67x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib68x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) 
unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib69x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib70x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib71x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib72x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib73x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib74x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib75x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY 
id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib76x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib77x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib78x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib79x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib80x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib81x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib82x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib83x ( id_bibrec mediumint(8) unsigned 
NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib84x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib85x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib86x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib87x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib88x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib89x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib90x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned 
default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib91x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib92x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib93x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib94x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib95x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib96x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bib97x ( id_bibrec mediumint(8) unsigned NOT NULL default '0', id_bibxxx mediumint(8) unsigned NOT NULL default '0', field_number smallint(5) unsigned default NULL, KEY id_bibxxx (id_bibxxx), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS 
bibrec_bib98x (
  id_bibrec mediumint(8) unsigned NOT NULL default '0',
  id_bibxxx mediumint(8) unsigned NOT NULL default '0',
  field_number smallint(5) unsigned default NULL,
  KEY id_bibxxx (id_bibxxx),
  KEY id_bibrec (id_bibrec)
) ENGINE=MyISAM;

CREATE TABLE IF NOT EXISTS bibrec_bib99x (
  id_bibrec mediumint(8) unsigned NOT NULL default '0',
  id_bibxxx mediumint(8) unsigned NOT NULL default '0',
  field_number smallint(5) unsigned default NULL,
  KEY id_bibxxx (id_bibxxx),
  KEY id_bibrec (id_bibrec)
) ENGINE=MyISAM;

-- tables for bibliographic records formatted:

CREATE TABLE IF NOT EXISTS bibfmt (
  id mediumint(8) unsigned NOT NULL auto_increment,
  id_bibrec int(8) unsigned NOT NULL default '0',
  format varchar(10) NOT NULL default '',
  last_updated datetime NOT NULL default '0000-00-00',
  value longblob,
  PRIMARY KEY (id),
  KEY id_bibrec (id_bibrec),
  KEY format (format)
) ENGINE=MyISAM;

-- tables for index files:

CREATE TABLE IF NOT EXISTS idxINDEX (
  id mediumint(9) unsigned NOT NULL,
  name varchar(50) NOT NULL default '',
  description varchar(255) NOT NULL default '',
  last_updated datetime NOT NULL default '0000-00-00 00:00:00',
  stemming_language varchar(10) NOT NULL default '',
  PRIMARY KEY (id),
  UNIQUE KEY name (name)
) ENGINE=MyISAM;

CREATE TABLE IF NOT EXISTS idxINDEXNAME (
  id_idxINDEX mediumint(9) unsigned NOT NULL,
  ln char(5) NOT NULL default '',
  type char(3) NOT NULL default 'sn',
  value varchar(255) NOT NULL,
  PRIMARY KEY (id_idxINDEX,ln,type)
) ENGINE=MyISAM;

CREATE TABLE IF NOT EXISTS idxINDEX_field (
  id_idxINDEX mediumint(9) unsigned NOT NULL,
  id_field mediumint(9) unsigned NOT NULL,
  regexp_punctuation varchar(255) NOT NULL default "[\.\,\:\;\?\!\"]",
  regexp_alphanumeric_separators varchar(255) NOT NULL default "[\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~]",
  PRIMARY KEY (id_idxINDEX,id_field)
) ENGINE=MyISAM;

-- this comment line here is just to fix the SQL display mode in Emacs '

CREATE TABLE IF NOT EXISTS idxWORD01F (
  id mediumint(9) unsigned NOT NULL
auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD01R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD02F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD02R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD03F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD03R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD04F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD04R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD05F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD05R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; 
CREATE TABLE IF NOT EXISTS idxWORD06F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD06R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD07F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD07R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD08F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD08R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD09F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD09R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD10F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD10R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT 
NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD11F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD11R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD12F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD12R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD13F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD13R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD14F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD14R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD15F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD15R ( id_bibrec mediumint(9) unsigned NOT 
NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD16F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD16R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD17F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD17R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD18F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(50) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxWORD18R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR01F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR01R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR02F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; 
CREATE TABLE IF NOT EXISTS idxPAIR02R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR03F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR03R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR04F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR04R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR05F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR05R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR06F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR06R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR07F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, 
hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR07R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR08F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR08R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR09F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR09R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR10F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR10R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR11F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR11R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR12F ( id 
mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR12R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR13F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR13R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR14F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR14R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR15F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR15R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR16F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR16R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY 
(id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR17F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR17R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR18F ( id mediumint(9) unsigned NOT NULL auto_increment, term varchar(100) default NULL, hitlist longblob, PRIMARY KEY (id), UNIQUE KEY term (term) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPAIR18R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE01F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE01R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE02F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE02R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE03F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE03R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type 
enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE04F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE04R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE05F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE05R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE06F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE06R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE07F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE07R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE08F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE08R ( id_bibrec mediumint(9) 
unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE09F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE09R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE10F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE10R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE11F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE11R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE12F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE12R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE13F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT 
EXISTS idxPHRASE13R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE14F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE14R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE15F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE15R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE16F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE16R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE17F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term (term(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE17R ( id_bibrec mediumint(9) unsigned NOT NULL, termlist longblob, type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT', PRIMARY KEY (id_bibrec,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS idxPHRASE18F ( id mediumint(9) unsigned NOT NULL auto_increment, term text default NULL, hitlist longblob, PRIMARY KEY (id), KEY term 
(term(50))
) ENGINE=MyISAM;

CREATE TABLE IF NOT EXISTS idxPHRASE18R (
  id_bibrec mediumint(9) unsigned NOT NULL,
  termlist longblob,
  type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
  PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM;

-- tables for ranking:

CREATE TABLE IF NOT EXISTS rnkMETHOD (
  id mediumint(9) unsigned NOT NULL auto_increment,
  name varchar(20) NOT NULL default '',
  last_updated datetime NOT NULL default '0000-00-00 00:00:00',
  PRIMARY KEY (id),
  UNIQUE KEY name (name)
) ENGINE=MyISAM;

CREATE TABLE IF NOT EXISTS rnkMETHODNAME (
  id_rnkMETHOD mediumint(9) unsigned NOT NULL,
  ln char(5) NOT NULL default '',
  type char(3) NOT NULL default 'sn',
  value varchar(255) NOT NULL,
  PRIMARY KEY (id_rnkMETHOD,ln,type)
) ENGINE=MyISAM;

CREATE TABLE IF NOT EXISTS rnkMETHODDATA (
  id_rnkMETHOD mediumint(9) unsigned NOT NULL,
  relevance_data longblob,
  PRIMARY KEY (id_rnkMETHOD)
) ENGINE=MyISAM;

CREATE TABLE IF NOT EXISTS collection_rnkMETHOD (
  id_collection mediumint(9) unsigned NOT NULL,
  id_rnkMETHOD mediumint(9) unsigned NOT NULL,
  score tinyint(4) unsigned NOT NULL default '0',
  PRIMARY KEY (id_collection,id_rnkMETHOD)
) ENGINE=MyISAM;

CREATE TABLE IF NOT EXISTS rnkWORD01F (
  id mediumint(9) unsigned NOT NULL auto_increment,
  term varchar(50) default NULL,
  hitlist longblob,
  PRIMARY KEY (id),
  UNIQUE KEY term (term)
) ENGINE=MyISAM;

CREATE TABLE IF NOT EXISTS rnkWORD01R (
  id_bibrec mediumint(9) unsigned NOT NULL,
  termlist longblob,
  type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
  PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM;

CREATE TABLE IF NOT EXISTS rnkAUTHORDATA (
  aterm varchar(50) default NULL,
  hitlist longblob,
  UNIQUE KEY aterm (aterm)
) ENGINE=MyISAM;

CREATE TABLE IF NOT EXISTS rnkPAGEVIEWS (
  id_bibrec mediumint(8) unsigned default NULL,
  id_user int(15) unsigned default '0',
  client_host int(10) unsigned default NULL,
  view_time datetime default '0000-00-00 00:00:00',
  KEY view_time (view_time),
  KEY id_bibrec (id_bibrec)
) ENGINE=MyISAM;
CREATE TABLE IF NOT EXISTS rnkDOWNLOADS ( id_bibrec mediumint(8) unsigned default NULL, download_time datetime default '0000-00-00 00:00:00', client_host int(10) unsigned default NULL, id_user int(15) unsigned default NULL, id_bibdoc mediumint(9) unsigned default NULL, file_version smallint(2) unsigned default NULL, file_format varchar(10) NULL default NULL, KEY download_time (download_time), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; -- a table for citations. record-cites-record CREATE TABLE IF NOT EXISTS rnkCITATIONDATA ( id mediumint(8) unsigned NOT NULL auto_increment, object_name varchar(255) NOT NULL, object_value longblob, last_updated datetime NOT NULL default '0000-00-00', PRIMARY KEY id (id), UNIQUE KEY object_name (object_name) ) ENGINE=MyISAM; -- a table for missing citations. This should be scanned by a program -- occasionally to check if some publication has been cited more than -- 50 times (or such), and alert cataloguers to create record for that -- external citation -- -- id_bibrec is the id of the record. 
extcitepubinfo is publication info -- that looks in general like hep-th/0112088 CREATE TABLE IF NOT EXISTS rnkCITATIONDATAEXT ( id_bibrec int(8) unsigned, extcitepubinfo varchar(255) NOT NULL, PRIMARY KEY (id_bibrec, extcitepubinfo), KEY extcitepubinfo (extcitepubinfo) ) ENGINE=MyISAM; -- tables for collections and collection tree: CREATE TABLE IF NOT EXISTS collection ( id mediumint(9) unsigned NOT NULL auto_increment, name varchar(255) NOT NULL, dbquery text, nbrecs int(10) unsigned default '0', reclist longblob, PRIMARY KEY (id), UNIQUE KEY name (name), KEY dbquery (dbquery(50)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS collectionname ( id_collection mediumint(9) unsigned NOT NULL, ln char(5) NOT NULL default '', type char(3) NOT NULL default 'sn', value varchar(255) NOT NULL, PRIMARY KEY (id_collection,ln,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS collection_collection ( id_dad mediumint(9) unsigned NOT NULL, id_son mediumint(9) unsigned NOT NULL, type char(1) NOT NULL default 'r', score tinyint(4) unsigned NOT NULL default '0', PRIMARY KEY (id_dad,id_son) ) ENGINE=MyISAM; -- tables for OAI sets: CREATE TABLE IF NOT EXISTS oaiREPOSITORY ( id mediumint(9) unsigned NOT NULL auto_increment, setName varchar(255) NOT NULL default '', setSpec varchar(255) NOT NULL default 'GLOBAL_SET', setCollection varchar(255) NOT NULL default '', setDescription text NOT NULL default '', setDefinition text NOT NULL default '', setRecList longblob, p1 text NOT NULL default '', f1 text NOT NULL default '', m1 text NOT NULL default '', p2 text NOT NULL default '', f2 text NOT NULL default '', m2 text NOT NULL default '', p3 text NOT NULL default '', f3 text NOT NULL default '', m3 text NOT NULL default '', PRIMARY KEY (id) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS oaiHARVEST ( id mediumint(9) unsigned NOT NULL auto_increment, baseurl varchar(255) NOT NULL default '', metadataprefix varchar(255) NOT NULL default 'oai_dc', arguments text, comment text, bibconvertcfgfile 
varchar(255), name varchar(255) NOT NULL, lastrun datetime, frequency mediumint(12) NOT NULL default '0', postprocess varchar(20) NOT NULL default 'h', bibfilterprogram varchar(255) NOT NULL default '', setspecs text NOT NULL default '', PRIMARY KEY (id) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS oaiHARVESTLOG ( id_oaiHARVEST mediumint(9) unsigned NOT NULL REFERENCES oaiHARVEST, -- source we harvest from id_bibrec mediumint(8) unsigned NOT NULL default '0', -- internal record id ( filled by bibupload ) bibupload_task_id int NOT NULL default 0, -- bib upload task number oai_id varchar(40) NOT NULL default "", -- OAI record identifier we harvested date_harvested datetime NOT NULL default '0000-00-00', -- when we harvested date_inserted datetime NOT NULL default '0000-00-00', -- when it was inserted inserted_to_db char(1) NOT NULL default 'P', -- where it was inserted (P=prod, H=holding-pen, etc) PRIMARY KEY (bibupload_task_id, oai_id, date_harvested) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibHOLDINGPEN ( changeset_id INT NOT NULL AUTO_INCREMENT, -- the identifier of the changeset stored in the holding pen changeset_date datetime NOT NULL DEFAULT '0000:00:00 00:00:00', -- when was the changeset inserted changeset_xml TEXT NOT NULL DEFAULT '', oai_id varchar(40) NOT NULL DEFAULT '', -- OAI identifier of concerned record id_bibrec mediumint(8) unsigned NOT NULL default '0', -- record ID of concerned record (filled by bibupload) PRIMARY KEY (changeset_id), KEY changeset_date (changeset_date), KEY id_bibrec (id_bibrec) ) ENGINE=MyISAM; -- tables for portal elements: CREATE TABLE IF NOT EXISTS collection_portalbox ( id_collection mediumint(9) unsigned NOT NULL, id_portalbox mediumint(9) unsigned NOT NULL, ln char(5) NOT NULL default '', position char(3) NOT NULL default 'top', score tinyint(4) unsigned NOT NULL default '0', PRIMARY KEY (id_collection,id_portalbox,ln) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS portalbox ( id mediumint(9) unsigned NOT NULL 
auto_increment, title text NOT NULL, body text NOT NULL, UNIQUE KEY id (id) ) ENGINE=MyISAM; -- tables for search examples: CREATE TABLE IF NOT EXISTS collection_example ( id_collection mediumint(9) unsigned NOT NULL, id_example mediumint(9) unsigned NOT NULL, score tinyint(4) unsigned NOT NULL default '0', PRIMARY KEY (id_collection,id_example) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS example ( id mediumint(9) unsigned NOT NULL auto_increment, type text NOT NULL default '', body text NOT NULL, PRIMARY KEY (id) ) ENGINE=MyISAM; -- tables for collection formats: CREATE TABLE IF NOT EXISTS collection_format ( id_collection mediumint(9) unsigned NOT NULL, id_format mediumint(9) unsigned NOT NULL, score tinyint(4) unsigned NOT NULL default '0', PRIMARY KEY (id_collection,id_format) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS format ( id mediumint(9) unsigned NOT NULL auto_increment, name varchar(255) NOT NULL, code varchar(6) NOT NULL, description varchar(255) default '', content_type varchar(255) default '', visibility tinyint NOT NULL default '1', PRIMARY KEY (id), UNIQUE KEY code (code) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS formatname ( id_format mediumint(9) unsigned NOT NULL, ln char(5) NOT NULL default '', type char(3) NOT NULL default 'sn', value varchar(255) NOT NULL, PRIMARY KEY (id_format,ln,type) ) ENGINE=MyISAM; -- tables for collection detailed page options CREATE TABLE IF NOT EXISTS collectiondetailedrecordpagetabs ( id_collection mediumint(9) unsigned NOT NULL, tabs varchar(255) NOT NULL default '', PRIMARY KEY (id_collection) ) ENGINE=MyISAM; -- tables for search options and MARC tags: CREATE TABLE IF NOT EXISTS collection_field_fieldvalue ( id_collection mediumint(9) unsigned NOT NULL, id_field mediumint(9) unsigned NOT NULL, id_fieldvalue mediumint(9) unsigned, type char(3) NOT NULL default 'src', score tinyint(4) unsigned NOT NULL default '0', score_fieldvalue tinyint(4) unsigned NOT NULL default '0', KEY id_collection (id_collection), KEY 
id_field (id_field), KEY id_fieldvalue (id_fieldvalue) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS field ( id mediumint(9) unsigned NOT NULL auto_increment, name varchar(255) NOT NULL, code varchar(255) NOT NULL, PRIMARY KEY (id), UNIQUE KEY code (code) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS fieldname ( id_field mediumint(9) unsigned NOT NULL, ln char(5) NOT NULL default '', type char(3) NOT NULL default 'sn', value varchar(255) NOT NULL, PRIMARY KEY (id_field,ln,type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS fieldvalue ( id mediumint(9) unsigned NOT NULL auto_increment, name varchar(255) NOT NULL, value text NOT NULL, PRIMARY KEY (id) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS field_tag ( id_field mediumint(9) unsigned NOT NULL, id_tag mediumint(9) unsigned NOT NULL, score tinyint(4) unsigned NOT NULL default '0', PRIMARY KEY (id_field,id_tag) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS tag ( id mediumint(9) unsigned NOT NULL auto_increment, name varchar(255) NOT NULL, value char(6) NOT NULL, PRIMARY KEY (id) ) ENGINE=MyISAM; -- tables for file management CREATE TABLE IF NOT EXISTS bibdoc ( id mediumint(9) unsigned NOT NULL auto_increment, status text NOT NULL default '', docname varchar(250) COLLATE utf8_bin NOT NULL default 'file', creation_date datetime NOT NULL default '0000-00-00', modification_date datetime NOT NULL default '0000-00-00', text_extraction_date datetime NOT NULL default '0000-00-00', more_info mediumblob NULL default NULL, PRIMARY KEY (id), KEY docname (docname), KEY creation_date (creation_date), KEY modification_date (modification_date) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibrec_bibdoc ( id_bibrec mediumint(9) unsigned NOT NULL default '0', id_bibdoc mediumint(9) unsigned NOT NULL default '0', type varchar(255), KEY (id_bibrec), KEY (id_bibdoc) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bibdoc_bibdoc ( id_bibdoc1 mediumint(9) unsigned NOT NULL, id_bibdoc2 mediumint(9) unsigned NOT NULL, type varchar(255), KEY (id_bibdoc1), 
KEY (id_bibdoc2) ) ENGINE=MyISAM;
+CREATE TABLE IF NOT EXISTS bibdocfsinfo (
+  id_bibdoc mediumint(9) unsigned NOT NULL,
+  version tinyint(4) unsigned NOT NULL,
+  format varchar(50) NOT NULL,
+  last_version boolean NOT NULL,
+  cd datetime NOT NULL,
+  md datetime NOT NULL,
+  checksum char(32) NOT NULL,
+  filesize bigint(15) unsigned NOT NULL,
+  mime varchar(100) NOT NULL,
+  master_format varchar(50) NULL default NULL,
+  PRIMARY KEY (id_bibdoc, version, format),
+  KEY (last_version),
+  KEY (format),
+  KEY (cd),
+  KEY (md),
+  KEY (filesize),
+  KEY (mime)
+) ENGINE=MyISAM;
+
-- tables for publication requests: CREATE TABLE IF NOT EXISTS publreq ( id int(11) NOT NULL auto_increment, host varchar(255) NOT NULL default '', date varchar(255) NOT NULL default '', name varchar(255) NOT NULL default '', email varchar(255) NOT NULL default '', address text NOT NULL, publication text NOT NULL, PRIMARY KEY (id) ) ENGINE=MyISAM; -- table for sessions and users: CREATE TABLE IF NOT EXISTS session ( session_key varchar(32) NOT NULL default '', session_expiry datetime NOT NULL default '0000-00-00 00:00:00', session_object longblob, uid int(15) unsigned NOT NULL, UNIQUE KEY session_key (session_key), KEY uid (uid), KEY session_expiry (session_expiry) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS user ( id int(15) unsigned NOT NULL auto_increment, email varchar(255) NOT NULL default '', password blob NOT NULL, note varchar(255) default NULL, settings blob default NULL, nickname varchar(255) NOT NULL default '', last_login datetime NOT NULL default '0000-00-00 00:00:00', PRIMARY KEY id (id), KEY email (email), KEY nickname (nickname) ) ENGINE=MyISAM; -- tables for usergroups CREATE TABLE IF NOT EXISTS usergroup ( id int(15) unsigned NOT NULL auto_increment, name varchar(255) NOT NULL default '', description text default '', join_policy char(2) NOT NULL default '', login_method varchar(255) NOT NULL default 'INTERNAL', PRIMARY KEY (id), UNIQUE KEY login_method_name (login_method(70),
name), KEY name (name) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS user_usergroup ( id_user int(15) unsigned NOT NULL default '0', id_usergroup int(15) unsigned NOT NULL default '0', user_status char(1) NOT NULL default '', user_status_date datetime NOT NULL default '0000-00-00 00:00:00', KEY id_user (id_user), KEY id_usergroup (id_usergroup) ) ENGINE=MyISAM; -- tables for access control engine CREATE TABLE IF NOT EXISTS accROLE ( id int(15) unsigned NOT NULL auto_increment, name varchar(32), description varchar(255), firerole_def_ser blob NULL, firerole_def_src text NULL, PRIMARY KEY (id), UNIQUE KEY name (name) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS user_accROLE ( id_user int(15) unsigned NOT NULL, id_accROLE int(15) unsigned NOT NULL, expiration datetime NOT NULL default '9999-12-31 23:59:59', PRIMARY KEY (id_user, id_accROLE) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS accMAILCOOKIE ( id int(15) unsigned NOT NULL auto_increment, data blob NOT NULL, expiration datetime NOT NULL default '9999-12-31 23:59:59', kind varchar(32) NOT NULL, onetime boolean NOT NULL default 0, status char(1) NOT NULL default 'W', PRIMARY KEY (id), KEY expiration (expiration) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS accACTION ( id int(15) unsigned NOT NULL auto_increment, name varchar(32), description varchar(255), allowedkeywords varchar(255), optional ENUM ('yes', 'no') NOT NULL default 'no', PRIMARY KEY (id), UNIQUE KEY name (name) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS accARGUMENT ( id int(15) unsigned NOT NULL auto_increment, keyword varchar (32), value varchar(255), PRIMARY KEY (id), KEY KEYVAL (keyword, value) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS accROLE_accACTION_accARGUMENT ( id_accROLE int(15), id_accACTION int(15), id_accARGUMENT int(15), argumentlistid mediumint(8), KEY id_accROLE (id_accROLE), KEY id_accACTION (id_accACTION), KEY id_accARGUMENT (id_accARGUMENT) ) ENGINE=MyISAM; -- tables for personal/collaborative features (baskets, alerts, searches, 
messages, usergroups): CREATE TABLE IF NOT EXISTS user_query ( id_user int(15) unsigned NOT NULL default '0', id_query int(15) unsigned NOT NULL default '0', hostname varchar(50) default 'unknown host', date datetime default NULL, KEY id_user (id_user,id_query) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS query ( id int(15) unsigned NOT NULL auto_increment, type char(1) NOT NULL default 'r', urlargs text NOT NULL, PRIMARY KEY (id), KEY urlargs (urlargs(100)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS user_query_basket ( id_user int(15) unsigned NOT NULL default '0', id_query int(15) unsigned NOT NULL default '0', id_basket int(15) unsigned NOT NULL default '0', frequency varchar(5) NOT NULL default '', date_creation date default NULL, date_lastrun date default '0000-00-00', alert_name varchar(30) NOT NULL default '', alert_desc text default NULL, notification char(1) NOT NULL default 'y', PRIMARY KEY (id_user,id_query,frequency,id_basket), KEY alert_name (alert_name) ) ENGINE=MyISAM; -- baskets CREATE TABLE IF NOT EXISTS bskBASKET ( id int(15) unsigned NOT NULL auto_increment, id_owner int(15) unsigned NOT NULL default '0', name varchar(50) NOT NULL default '', date_modification datetime NOT NULL default '0000-00-00 00:00:00', nb_views int(15) NOT NULL default '0', PRIMARY KEY (id), KEY id_owner (id_owner), KEY name (name) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bskREC ( id_bibrec_or_bskEXTREC int(16) NOT NULL default '0', id_bskBASKET int(15) unsigned NOT NULL default '0', id_user_who_added_item int(15) NOT NULL default '0', score int(15) NOT NULL default '0', date_added datetime NOT NULL default '0000-00-00 00:00:00', PRIMARY KEY (id_bibrec_or_bskEXTREC,id_bskBASKET), KEY id_bibrec_or_bskEXTREC (id_bibrec_or_bskEXTREC), KEY id_bskBASKET (id_bskBASKET), KEY score (score), KEY date_added (date_added) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bskEXTREC ( id int(15) unsigned NOT NULL auto_increment, external_id int(15) NOT NULL default '0', collection_id 
int(15) unsigned NOT NULL default '0', original_url text, creation_date datetime NOT NULL default '0000-00-00 00:00:00', modification_date datetime NOT NULL default '0000-00-00 00:00:00', PRIMARY KEY (id) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bskEXTFMT ( id int(15) unsigned NOT NULL auto_increment, id_bskEXTREC int(15) unsigned NOT NULL default '0', format varchar(10) NOT NULL default '', last_updated datetime NOT NULL default '0000-00-00 00:00:00', value longblob, PRIMARY KEY (id), KEY id_bskEXTREC (id_bskEXTREC), KEY format (format) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS user_bskBASKET ( id_user int(15) unsigned NOT NULL default '0', id_bskBASKET int(15) unsigned NOT NULL default '0', topic varchar(50) NOT NULL default '', PRIMARY KEY (id_user,id_bskBASKET), KEY id_user (id_user), KEY id_bskBASKET (id_bskBASKET) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS usergroup_bskBASKET ( id_usergroup int(15) unsigned NOT NULL default '0', id_bskBASKET int(15) unsigned NOT NULL default '0', topic varchar(50) NOT NULL default '', date_shared datetime NOT NULL default '0000-00-00 00:00:00', share_level char(2) NOT NULL default '', PRIMARY KEY (id_usergroup,id_bskBASKET), KEY id_usergroup (id_usergroup), KEY id_bskBASKET (id_bskBASKET) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bskRECORDCOMMENT ( id int(15) unsigned NOT NULL auto_increment, id_bibrec_or_bskEXTREC int(16) NOT NULL default '0', id_bskBASKET int(15) unsigned NOT NULL default '0', id_user int(15) unsigned NOT NULL default '0', title varchar(255) NOT NULL default '', body text NOT NULL, date_creation datetime NOT NULL default '0000-00-00 00:00:00', priority int(15) NOT NULL default '0', in_reply_to_id_bskRECORDCOMMENT int(15) unsigned NOT NULL default '0', reply_order_cached_data blob NULL default NULL, PRIMARY KEY (id), KEY id_bskBASKET (id_bskBASKET), KEY id_bibrec_or_bskEXTREC (id_bibrec_or_bskEXTREC), KEY date_creation (date_creation), KEY in_reply_to_id_bskRECORDCOMMENT 
(in_reply_to_id_bskRECORDCOMMENT), INDEX (reply_order_cached_data(40)) ) ENGINE=MyISAM; -- tables for messaging system CREATE TABLE IF NOT EXISTS msgMESSAGE ( id int(15) unsigned NOT NULL auto_increment, id_user_from int(15) unsigned NOT NULL default '0', sent_to_user_nicks text NOT NULL default '', sent_to_group_names text NOT NULL default '', subject text NOT NULL default '', body text default NULL, sent_date datetime NOT NULL default '0000-00-00 00:00:00', received_date datetime NULL default '0000-00-00 00:00:00', PRIMARY KEY id (id), KEY id_user_from (id_user_from) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS user_msgMESSAGE ( id_user_to int(15) unsigned NOT NULL default '0', id_msgMESSAGE int(15) unsigned NOT NULL default '0', status char(1) NOT NULL default 'N', PRIMARY KEY id (id_user_to, id_msgMESSAGE), KEY id_user_to (id_user_to), KEY id_msgMESSAGE (id_msgMESSAGE) ) ENGINE=MyISAM; -- tables for WebComment CREATE TABLE IF NOT EXISTS cmtRECORDCOMMENT ( id int(15) unsigned NOT NULL auto_increment, id_bibrec int(15) unsigned NOT NULL default '0', id_user int(15) unsigned NOT NULL default '0', title varchar(255) NOT NULL default '', body text NOT NULL default '', date_creation datetime NOT NULL default '0000-00-00 00:00:00', star_score tinyint(5) unsigned NOT NULL default '0', nb_votes_yes int(10) NOT NULL default '0', nb_votes_total int(10) unsigned NOT NULL default '0', nb_abuse_reports int(10) NOT NULL default '0', status char(2) NOT NULL default 'ok', round_name varchar(255) NOT NULL default '', restriction varchar(50) NOT NULL default '', in_reply_to_id_cmtRECORDCOMMENT int(15) unsigned NOT NULL default '0', reply_order_cached_data blob NULL default NULL, PRIMARY KEY (id), KEY id_bibrec (id_bibrec), KEY id_user (id_user), KEY status (status), KEY in_reply_to_id_cmtRECORDCOMMENT (in_reply_to_id_cmtRECORDCOMMENT), INDEX (reply_order_cached_data(40)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS cmtACTIONHISTORY ( id_cmtRECORDCOMMENT int(15) unsigned NULL, 
id_bibrec int(15) unsigned NULL, id_user int(15) unsigned NULL default NULL, client_host int(10) unsigned default NULL, action_time datetime NOT NULL default '0000-00-00 00:00:00', action_code char(1) NOT NULL, KEY id_cmtRECORDCOMMENT (id_cmtRECORDCOMMENT), KEY client_host (client_host), KEY id_user (id_user), KEY action_code (action_code) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS cmtSUBSCRIPTION ( id_bibrec mediumint(8) unsigned NOT NULL, id_user int(15) unsigned NOT NULL, creation_time datetime NOT NULL default '0000-00-00 00:00:00', KEY id_user (id_bibrec, id_user) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS cmtCOLLAPSED ( id_bibrec int(15) unsigned NOT NULL default '0', id_cmtRECORDCOMMENT int(15) unsigned NULL, id_user int(15) unsigned NOT NULL, PRIMARY KEY (id_user, id_bibrec, id_cmtRECORDCOMMENT) ) ENGINE=MyISAM; -- tables for BibKnowledge: CREATE TABLE IF NOT EXISTS knwKB ( id mediumint(8) unsigned NOT NULL auto_increment, name varchar(255) default '', description text default '', kbtype char default NULL, PRIMARY KEY (id), UNIQUE KEY name (name) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS knwKBRVAL ( id mediumint(8) unsigned NOT NULL auto_increment, m_key varchar(255) NOT NULL default '', m_value text NOT NULL default '', id_knwKB mediumint(8) NOT NULL default '0', PRIMARY KEY (id), KEY id_knwKB (id_knwKB), KEY m_key (m_key(30)), KEY m_value (m_value(30)) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS knwKBDDEF ( id_knwKB mediumint(8) unsigned NOT NULL, id_collection mediumint(9), output_tag text default '', search_expression text default '', PRIMARY KEY (id_knwKB) ) ENGINE=MyISAM; -- tables for WebSubmit: CREATE TABLE IF NOT EXISTS sbmACTION ( lactname text, sactname char(3) NOT NULL default '', dir text, cd date default NULL, md date default NULL, actionbutton text, statustext text, PRIMARY KEY (sactname) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmALLFUNCDESCR ( function varchar(40) NOT NULL default '', description tinytext, PRIMARY KEY (function) ) 
ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmAPPROVAL ( doctype varchar(10) NOT NULL default '', categ varchar(50) NOT NULL default '', rn varchar(50) NOT NULL default '', status varchar(10) NOT NULL default '', dFirstReq datetime NOT NULL default '0000-00-00 00:00:00', dLastReq datetime NOT NULL default '0000-00-00 00:00:00', dAction datetime NOT NULL default '0000-00-00 00:00:00', access varchar(20) NOT NULL default '0', note text NOT NULL default '', PRIMARY KEY (rn) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmCPLXAPPROVAL ( doctype varchar(10) NOT NULL default '', categ varchar(50) NOT NULL default '', rn varchar(50) NOT NULL default '', type varchar(10) NOT NULL, status varchar(10) NOT NULL, id_group int(15) unsigned NOT NULL default '0', id_bskBASKET int(15) unsigned NOT NULL default '0', id_EdBoardGroup int(15) unsigned NOT NULL default '0', dFirstReq datetime NOT NULL default '0000-00-00 00:00:00', dLastReq datetime NOT NULL default '0000-00-00 00:00:00', dEdBoardSel datetime NOT NULL default '0000-00-00 00:00:00', dRefereeSel datetime NOT NULL default '0000-00-00 00:00:00', dRefereeRecom datetime NOT NULL default '0000-00-00 00:00:00', dEdBoardRecom datetime NOT NULL default '0000-00-00 00:00:00', dPubComRecom datetime NOT NULL default '0000-00-00 00:00:00', dProjectLeaderAction datetime NOT NULL default '0000-00-00 00:00:00', PRIMARY KEY (rn, type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmCOLLECTION ( id int(11) NOT NULL auto_increment, name varchar(100) NOT NULL default '', PRIMARY KEY (id) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmCOLLECTION_sbmCOLLECTION ( id_father int(11) NOT NULL default '0', id_son int(11) NOT NULL default '0', catalogue_order int(11) NOT NULL default '0' ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmCOLLECTION_sbmDOCTYPE ( id_father int(11) NOT NULL default '0', id_son char(10) NOT NULL default '0', catalogue_order int(11) NOT NULL default '0' ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmCATEGORIES ( doctype 
varchar(10) NOT NULL default '', sname varchar(75) NOT NULL default '', lname varchar(75) NOT NULL default '', score tinyint unsigned NOT NULL default 0, PRIMARY KEY (doctype, sname), KEY doctype (doctype), KEY sname (sname) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmCHECKS ( chname varchar(15) NOT NULL default '', chdesc text, cd date default NULL, md date default NULL, chefi1 text, chefi2 text, PRIMARY KEY (chname) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmDOCTYPE ( ldocname text, sdocname varchar(10) default NULL, cd date default NULL, md date default NULL, description text ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmFIELD ( subname varchar(13) default NULL, pagenb int(11) default NULL, fieldnb int(11) default NULL, fidesc varchar(15) default NULL, fitext text, level char(1) default NULL, sdesc text, checkn text, cd date default NULL, md date default NULL, fiefi1 text, fiefi2 text ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmFIELDDESC ( name varchar(15) NOT NULL default '', alephcode varchar(50) default NULL, marccode varchar(50) NOT NULL default '', type char(1) default NULL, size int(11) default NULL, rows int(11) default NULL, cols int(11) default NULL, maxlength int(11) default NULL, val text, fidesc text, cd date default NULL, md date default NULL, modifytext text, fddfi2 text, cookie int(11) default '0', PRIMARY KEY (name) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmFORMATEXTENSION ( FILE_FORMAT text NOT NULL, FILE_EXTENSION text NOT NULL ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmFUNCTIONS ( action varchar(10) NOT NULL default '', doctype varchar(10) NOT NULL default '', function varchar(40) NOT NULL default '', score int(11) NOT NULL default '0', step tinyint(4) NOT NULL default '1' ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmFUNDESC ( function varchar(40) NOT NULL default '', param varchar(40) default NULL ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmGFILERESULT ( FORMAT text NOT NULL, RESULT text NOT NULL ) ENGINE=MyISAM; 
CREATE TABLE IF NOT EXISTS sbmIMPLEMENT ( docname varchar(10) default NULL, actname char(3) default NULL, displayed char(1) default NULL, subname varchar(13) default NULL, nbpg int(11) default NULL, cd date default NULL, md date default NULL, buttonorder int(11) default NULL, statustext text, level char(1) NOT NULL default '', score int(11) NOT NULL default '0', stpage int(11) NOT NULL default '0', endtxt varchar(100) NOT NULL default '' ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmPARAMETERS ( doctype varchar(10) NOT NULL default '', name varchar(40) NOT NULL default '', value text NOT NULL default '', PRIMARY KEY (doctype,name) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmPUBLICATION ( doctype varchar(10) NOT NULL default '', categ varchar(50) NOT NULL default '', rn varchar(50) NOT NULL default '', status varchar(10) NOT NULL default '', dFirstReq datetime NOT NULL default '0000-00-00 00:00:00', dLastReq datetime NOT NULL default '0000-00-00 00:00:00', dAction datetime NOT NULL default '0000-00-00 00:00:00', accessref varchar(20) NOT NULL default '', accessedi varchar(20) NOT NULL default '', access varchar(20) NOT NULL default '', referees varchar(50) NOT NULL default '', authoremail varchar(50) NOT NULL default '', dRefSelection datetime NOT NULL default '0000-00-00 00:00:00', dRefRec datetime NOT NULL default '0000-00-00 00:00:00', dEdiRec datetime NOT NULL default '0000-00-00 00:00:00', accessspo varchar(20) NOT NULL default '', journal varchar(100) default NULL, PRIMARY KEY (doctype,categ,rn) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmPUBLICATIONCOMM ( id int(11) NOT NULL auto_increment, id_parent int(11) default '0', rn varchar(100) NOT NULL default '', firstname varchar(100) default NULL, secondname varchar(100) default NULL, email varchar(100) default NULL, date varchar(40) NOT NULL default '', synopsis varchar(255) NOT NULL default '', commentfulltext text, PRIMARY KEY (id) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmPUBLICATIONDATA ( doctype 
varchar(10) NOT NULL default '', editoboard varchar(250) NOT NULL default '', base varchar(10) NOT NULL default '', logicalbase varchar(10) NOT NULL default '', spokesperson varchar(50) NOT NULL default '', PRIMARY KEY (doctype) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmREFEREES ( doctype varchar(10) NOT NULL default '', categ varchar(10) NOT NULL default '', name varchar(50) NOT NULL default '', address varchar(50) NOT NULL default '', rid int(11) NOT NULL auto_increment, PRIMARY KEY (rid) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmSUBMISSIONS ( email varchar(50) NOT NULL default '', doctype varchar(10) NOT NULL default '', action varchar(10) NOT NULL default '', status varchar(10) NOT NULL default '', id varchar(30) NOT NULL default '', reference varchar(40) NOT NULL default '', cd datetime NOT NULL default '0000-00-00 00:00:00', md datetime NOT NULL default '0000-00-00 00:00:00' ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS sbmCOOKIES ( id int(15) unsigned NOT NULL auto_increment, name varchar(100) NOT NULL, value text, uid int(15) NOT NULL, PRIMARY KEY (id) ) ENGINE=MyISAM; -- Scheduler tables CREATE TABLE IF NOT EXISTS schTASK ( id int(15) unsigned NOT NULL auto_increment, proc varchar(255) NOT NULL, host varchar(255) NOT NULL default '', user varchar(50) NOT NULL, runtime datetime NOT NULL, sleeptime varchar(20), arguments mediumblob, status varchar(50), progress varchar(255), priority tinyint(4) NOT NULL default 0, sequenceid int(8) NULL default NULL, PRIMARY KEY (id), KEY status (status), KEY runtime (runtime), KEY priority (priority), KEY sequenceid (sequenceid) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS hstTASK ( id int(15) unsigned NOT NULL, proc varchar(255) NOT NULL, host varchar(255) NOT NULL default '', user varchar(50) NOT NULL, runtime datetime NOT NULL, sleeptime varchar(20), arguments mediumblob, status varchar(50), progress varchar(255), priority tinyint(4) NOT NULL default 0, sequenceid int(8) NULL default NULL, PRIMARY KEY (id), KEY 
status (status), KEY runtime (runtime), KEY priority (priority), KEY sequenceid (sequenceid) ) ENGINE=MyISAM; -- Batch Upload History CREATE TABLE IF NOT EXISTS hstBATCHUPLOAD ( id int(15) unsigned NOT NULL auto_increment, user varchar(50) NOT NULL, submitdate datetime NOT NULL, filename varchar(255) NOT NULL, execdate datetime NOT NULL, id_schTASK int(15) unsigned NOT NULL, batch_mode varchar(15) NOT NULL, PRIMARY KEY (id), KEY user (user) ) ENGINE=MyISAM; -- External collections CREATE TABLE IF NOT EXISTS collection_externalcollection ( id_collection mediumint(9) unsigned NOT NULL default '0', id_externalcollection mediumint(9) unsigned NOT NULL default '0', type tinyint(4) unsigned NOT NULL default '0', PRIMARY KEY (id_collection, id_externalcollection) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS externalcollection ( id mediumint(9) unsigned NOT NULL auto_increment, name varchar(255) NOT NULL default '', PRIMARY KEY (id), UNIQUE KEY name (name) ) ENGINE=MyISAM; -- WebStat tables: CREATE TABLE IF NOT EXISTS staEVENT ( id varchar(255) NOT NULL, number smallint(2) unsigned ZEROFILL NOT NULL auto_increment, name varchar(255), creation_time TIMESTAMP DEFAULT NOW(), cols varchar(255), PRIMARY KEY (id), UNIQUE KEY number (number) ) ENGINE=MyISAM; -- BibClassify tables: CREATE TABLE IF NOT EXISTS clsMETHOD ( id mediumint(9) unsigned NOT NULL, name varchar(50) NOT NULL default '', location varchar(255) NOT NULL default '', description varchar(255) NOT NULL default '', last_updated datetime NOT NULL default '0000-00-00 00:00:00', PRIMARY KEY (id), UNIQUE KEY name (name) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS collection_clsMETHOD ( id_collection mediumint(9) unsigned NOT NULL, id_clsMETHOD mediumint(9) unsigned NOT NULL, PRIMARY KEY (id_collection, id_clsMETHOD) ) ENGINE=MyISAM; -- WebJournal tables: CREATE TABLE IF NOT EXISTS jrnJOURNAL ( id mediumint(9) unsigned NOT NULL auto_increment, name varchar(50) NOT NULL default '', PRIMARY KEY (id), UNIQUE KEY name (name) ) 
ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS jrnISSUE ( id_jrnJOURNAL mediumint(9) unsigned NOT NULL, issue_number varchar(50) NOT NULL default '', issue_display varchar(50) NOT NULL default '', date_released datetime NOT NULL default '0000-00-00 00:00:00', date_announced datetime NOT NULL default '0000-00-00 00:00:00', PRIMARY KEY (id_jrnJOURNAL,issue_number) ) ENGINE=MyISAM; -- tables recording history of record's metadata and fulltext documents: CREATE TABLE IF NOT EXISTS hstRECORD ( id_bibrec mediumint(8) unsigned NOT NULL, marcxml blob NOT NULL, job_id mediumint(15) unsigned NOT NULL, job_name varchar(255) NOT NULL, job_person varchar(255) NOT NULL, job_date datetime NOT NULL, job_details blob NOT NULL, KEY (id_bibrec), KEY (job_id), KEY (job_name), KEY (job_person), KEY (job_date) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS hstDOCUMENT ( id_bibdoc mediumint(9) unsigned NOT NULL, docname varchar(250) NOT NULL, docformat varchar(50) NOT NULL, docversion tinyint(4) unsigned NOT NULL, docsize bigint(15) unsigned NOT NULL, docchecksum char(32) NOT NULL, doctimestamp datetime NOT NULL, action varchar(50) NOT NULL, job_id mediumint(15) unsigned NULL default NULL, job_name varchar(255) NULL default NULL, job_person varchar(255) NULL default NULL, job_date datetime NULL default NULL, job_details blob NULL default NULL, KEY (action), KEY (id_bibdoc), KEY (docname), KEY (docformat), KEY (doctimestamp), KEY (job_id), KEY (job_name), KEY (job_person), KEY (job_date) ) ENGINE=MyISAM; -- BibCirculation tables: CREATE TABLE IF NOT EXISTS crcBORROWER ( id int(15) unsigned NOT NULL auto_increment, name varchar(255) NOT NULL default '', email varchar(255) NOT NULL default '', phone varchar(60) default NULL, address varchar(60) default NULL, mailbox varchar(30) default NULL, borrower_since datetime NOT NULL default '0000-00-00 00:00:00', borrower_until datetime NOT NULL default '0000-00-00 00:00:00', notes text, PRIMARY KEY (id) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS 
crcILLREQUEST ( id int(15) unsigned NOT NULL auto_increment, id_crcBORROWER int(15) unsigned NOT NULL default '0', barcode varchar(30) NOT NULL default '', period_of_interest_from datetime NOT NULL default '0000-00-00 00:00:00', period_of_interest_to datetime NOT NULL default '0000-00-00 00:00:00', id_crcLIBRARY int(15) unsigned NOT NULL default '0', request_date datetime NOT NULL default '0000-00-00 00:00:00', expected_date datetime NOT NULL default '0000-00-00 00:00:00', arrival_date datetime NOT NULL default '0000-00-00 00:00:00', due_date datetime NOT NULL default '0000-00-00 00:00:00', return_date datetime NOT NULL default '0000-00-00 00:00:00', status varchar(20) NOT NULL default '', cost varchar(30) NOT NULL default '', item_info text, request_type text, borrower_comments text, only_this_edition varchar(10) NOT NULL default '', library_notes text, PRIMARY KEY (id), KEY id_crcborrower (id_crcBORROWER), KEY id_crclibrary (id_crcLIBRARY) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS crcITEM ( barcode varchar(30) NOT NULL default '', id_bibrec int(15) unsigned NOT NULL default '0', id_crcLIBRARY int(15) unsigned NOT NULL default '0', collection varchar(60) default NULL, location varchar(60) default NULL, description varchar(60) default NULL, loan_period varchar(30) NOT NULL default '', status varchar(20) NOT NULL default '', creation_date datetime NOT NULL default '0000-00-00 00:00:00', modification_date datetime NOT NULL default '0000-00-00 00:00:00', number_of_requests int(3) unsigned NOT NULL default '0', PRIMARY KEY (barcode), KEY id_bibrec (id_bibrec), KEY id_crclibrary (id_crcLIBRARY) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS crcLIBRARY ( id int(15) unsigned NOT NULL auto_increment, name varchar(80) NOT NULL default '', address varchar(255) NOT NULL default '', email varchar(255) NOT NULL default '', phone varchar(30) NOT NULL default '', type varchar(30) default NULL, notes text, PRIMARY KEY (id) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS crcLOAN ( id 
int(15) unsigned NOT NULL auto_increment, id_crcBORROWER int(15) unsigned NOT NULL default '0', id_bibrec int(15) unsigned NOT NULL default '0', barcode varchar(30) NOT NULL default '', loaned_on datetime NOT NULL default '0000-00-00 00:00:00', returned_on date NOT NULL default '0000-00-00', due_date datetime NOT NULL default '0000-00-00 00:00:00', number_of_renewals int(3) unsigned NOT NULL default '0', overdue_letter_number int(3) unsigned NOT NULL default '0', overdue_letter_date datetime NOT NULL default '0000-00-00 00:00:00', status varchar(20) NOT NULL default '', type varchar(20) NOT NULL default '', notes text, PRIMARY KEY (id), KEY id_crcborrower (id_crcBORROWER), KEY id_bibrec (id_bibrec), KEY barcode (barcode) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS crcLOANREQUEST ( id int(15) unsigned NOT NULL auto_increment, id_crcBORROWER int(15) unsigned NOT NULL default '0', id_bibrec int(15) unsigned NOT NULL default '0', barcode varchar(30) NOT NULL default '', period_of_interest_from datetime NOT NULL default '0000-00-00 00:00:00', period_of_interest_to datetime NOT NULL default '0000-00-00 00:00:00', status varchar(20) NOT NULL default '', notes text, request_date datetime NOT NULL default '0000-00-00 00:00:00', PRIMARY KEY (id), KEY id_crcborrower (id_crcBORROWER), KEY id_bibrec (id_bibrec), KEY barcode (barcode) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS crcPURCHASE ( id int(15) unsigned NOT NULL auto_increment, id_bibrec int(15) unsigned NOT NULL default '0', id_crcVENDOR int(15) unsigned NOT NULL default '0', ordered_date datetime NOT NULL default '0000-00-00 00:00:00', expected_date datetime NOT NULL default '0000-00-00 00:00:00', price varchar(20) NOT NULL default '0', status varchar(20) NOT NULL default '', notes text, PRIMARY KEY (id), KEY id_bibrec (id_bibrec), KEY id_crcVENDOR (id_crcVENDOR) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS crcVENDOR ( id int(15) unsigned NOT NULL auto_increment, name varchar(80) NOT NULL default '', address 
varchar(255) NOT NULL default '', email varchar(255) NOT NULL default '', phone varchar(30) NOT NULL default '', notes text, PRIMARY KEY (id) ) ENGINE=MyISAM; -- BibExport tables: CREATE TABLE IF NOT EXISTS expJOB ( id int(15) unsigned NOT NULL auto_increment, jobname varchar(50) NOT NULL default '', jobfreq mediumint(12) NOT NULL default '0', output_format mediumint(12) NOT NULL default '0', deleted mediumint(12) NOT NULL default '0', lastrun datetime NOT NULL default '0000-00-00 00:00:00', output_directory text, PRIMARY KEY (id), UNIQUE KEY jobname (jobname) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS expQUERY ( id int(15) unsigned NOT NULL auto_increment, name varchar(255) NOT NULL, search_criteria text NOT NULL, output_fields text NOT NULL, notes text, deleted mediumint(12) NOT NULL default '0', PRIMARY KEY (id) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS expJOB_expQUERY ( id_expJOB int(15) NOT NULL, id_expQUERY int(15) NOT NULL, PRIMARY KEY (id_expJOB,id_expQUERY), KEY id_expJOB (id_expJOB), KEY id_expQUERY (id_expQUERY) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS expQUERYRESULT ( id int(15) unsigned NOT NULL auto_increment, id_expQUERY int(15) NOT NULL, result text NOT NULL, status mediumint(12) NOT NULL default '0', status_message text NOT NULL, PRIMARY KEY (id) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS expJOBRESULT ( id int(15) unsigned NOT NULL auto_increment, id_expJOB int(15) NOT NULL, execution_time datetime NOT NULL default '0000-00-00 00:00:00', status mediumint(12) NOT NULL default '0', status_message text NOT NULL, PRIMARY KEY (id) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS expJOBRESULT_expQUERYRESULT ( id_expJOBRESULT int(15) NOT NULL, id_expQUERYRESULT int(15) NOT NULL, PRIMARY KEY (id_expJOBRESULT, id_expQUERYRESULT), KEY id_expJOBRESULT (id_expJOBRESULT), KEY id_expQUERYRESULT (id_expQUERYRESULT) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS user_expJOB ( id_user int(15) NOT NULL, id_expJOB int(15) NOT NULL, PRIMARY KEY (id_user, id_expJOB), 
KEY id_user (id_user), KEY id_expJOB (id_expJOB) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS swrREMOTESERVER ( id int(15) unsigned NOT NULL auto_increment, name varchar(50) unique NOT NULL, host varchar(50) NOT NULL, username varchar(50) NOT NULL, password varchar(50) NOT NULL, email varchar(50) NOT NULL, realm varchar(50) NOT NULL, url_base_record varchar(50) NOT NULL, url_servicedocument varchar(80) NOT NULL, xml_servicedocument longblob, last_update int(15) unsigned NOT NULL, PRIMARY KEY (id) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS swrCLIENTDATA ( id int(15) unsigned NOT NULL auto_increment, id_swrREMOTESERVER int(15) NOT NULL, id_record int(15) NOT NULL, report_no varchar(50) NOT NULL, id_remote varchar(50) NOT NULL, id_user int(15) NOT NULL, user_name varchar(100) NOT NULL, user_email varchar(100) NOT NULL, xml_media_deposit longblob NOT NULL, xml_metadata_submit longblob NOT NULL, submission_date datetime NOT NULL default '0000-00-00 00:00:00', publication_date datetime NOT NULL default '0000-00-00 00:00:00', removal_date datetime NOT NULL default '0000-00-00 00:00:00', link_medias varchar(150) NOT NULL, link_metadata varchar(150) NOT NULL, link_status varchar(150) NOT NULL, status varchar(150) NOT NULL default 'submitted', last_update datetime NOT NULL, PRIMARY KEY (id) ) ENGINE=MyISAM; -- tables for exception management -- This table is used to log exceptions -- to discover the full details of an exception either check the email -- that are sent to CFG_SITE_ADMIN_EMAIL or look into invenio.err CREATE TABLE IF NOT EXISTS hstEXCEPTION ( id int(15) unsigned NOT NULL auto_increment, name varchar(50) NOT NULL, -- name of the exception filename varchar(255) NULL, -- file where the exception was raised line int(9) NULL, -- line at which the exception was raised last_seen datetime NOT NULL default '0000-00-00 00:00:00', -- last time this exception has been seen last_notified datetime NOT NULL default '0000-00-00 00:00:00', -- last time this exception has been 
notified counter int(15) NOT NULL default 0, -- internal counter to decide when to notify this exception total int(15) NOT NULL default 0, -- total number of times this exception has been seen PRIMARY KEY (id), KEY (last_seen), KEY (last_notified), KEY (total), UNIQUE KEY (name(50), filename(255), line) ) ENGINE=MyISAM; -- tables for BibAuthorID module: CREATE TABLE IF NOT EXISTS `aidPERSONIDPAPERS` ( `personid` BIGINT( 16 ) UNSIGNED NOT NULL , `bibref_table` ENUM( '100', '700' ) NOT NULL , `bibref_value` MEDIUMINT( 8 ) UNSIGNED NOT NULL , `bibrec` MEDIUMINT( 8 ) UNSIGNED NOT NULL , `name` VARCHAR( 256 ) NOT NULL , `flag` SMALLINT( 2 ) NOT NULL DEFAULT '0' , `lcul` SMALLINT( 2 ) NOT NULL DEFAULT '0' , `last_updated` TIMESTAMP ON UPDATE CURRENT_TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP , INDEX `personid-b` (`personid`) , INDEX `reftable-b` (`bibref_table`) , INDEX `refvalue-b` (`bibref_value`) , INDEX `rec-b` (`bibrec`) , INDEX `name-b` (`name`) , INDEX `pn-b` (`personid`, `name`) , INDEX `timestamp-b` (`last_updated`) , INDEX `flag-b` (`flag`) , INDEX `ptvrf-b` (`personid`, `bibref_table`, `bibref_value`, `bibrec`, `flag`) ) ENGINE=MYISAM; CREATE TABLE IF NOT EXISTS `aidRESULTS` ( `personid` VARCHAR( 256 ) NOT NULL , `bibref_table` ENUM( '100', '700' ) NOT NULL , `bibref_value` MEDIUMINT( 8 ) UNSIGNED NOT NULL , `bibrec` MEDIUMINT( 8 ) UNSIGNED NOT NULL , INDEX `personid-b` (`personid`) , INDEX `reftable-b` (`bibref_table`) , INDEX `refvalue-b` (`bibref_value`) , INDEX `rec-b` (`bibrec`) ) ENGINE=MYISAM; CREATE TABLE IF NOT EXISTS `aidPERSONIDDATA` ( `personid` BIGINT( 16 ) UNSIGNED NOT NULL , `tag` VARCHAR( 64 ) NOT NULL , `data` VARCHAR( 256 ) NOT NULL , `opt1` MEDIUMINT( 8 ) NULL DEFAULT NULL , `opt2` MEDIUMINT( 8 ) NULL DEFAULT NULL , `opt3` VARCHAR( 256 ) NULL DEFAULT NULL , INDEX `personid-b` (`personid`) , INDEX `tag-b` (`tag`) , INDEX `data-b` (`data`) , INDEX `opt1` (`opt1`) ) ENGINE=MYISAM; CREATE TABLE IF NOT EXISTS `aidUSERINPUTLOG` ( `id` bigint(15) 
NOT NULL AUTO_INCREMENT, `transactionid` bigint(15) NOT NULL, `timestamp` datetime NOT NULL, `userinfo` varchar(255) NOT NULL, `personid` bigint(15) NOT NULL, `action` varchar(50) NOT NULL, `tag` varchar(50) NOT NULL, `value` varchar(200) NOT NULL, `comment` text, PRIMARY KEY (`id`), INDEX `transactionid-b` (`transactionid`), INDEX `timestamp-b` (`timestamp`), INDEX `userinfo-b` (`userinfo`), INDEX `personid-b` (`personid`), INDEX `action-b` (`action`), INDEX `tag-b` (`tag`), INDEX `value-b` (`value`) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS `aidCACHE` ( `id` int(15) NOT NULL auto_increment, `object_name` varchar(120) NOT NULL, `object_key` varchar(120) NOT NULL, `object_value` text, `last_updated` datetime NOT NULL, PRIMARY KEY (`id`), INDEX `name-b` (`object_name`), INDEX `key-b` (`object_key`), INDEX `last_updated-b` (`last_updated`) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS `aidPROBCACHE` ( `cluster` VARCHAR( 256 ) NOT NULL , `bibmap` MEDIUMBLOB NOT NULL , `matrix` LONGBLOB NOT NULL , PRIMARY KEY ( `cluster` ) ) ENGINE = MYISAM ; -- refextract tables: CREATE TABLE IF NOT EXISTS `xtrJOB` ( `id` tinyint(4) NOT NULL AUTO_INCREMENT, `name` varchar(30) NOT NULL, `last_updated` datetime NOT NULL, PRIMARY KEY (`id`) ) ENGINE=MyISAM; -- tables for bibsort module CREATE TABLE IF NOT EXISTS bsrMETHOD ( id mediumint(8) unsigned NOT NULL auto_increment, name varchar(20) NOT NULL, definition varchar(255) NOT NULL, washer varchar(255) NOT NULL, PRIMARY KEY (id), UNIQUE KEY (name) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bsrMETHODNAME ( id_bsrMETHOD mediumint(8) unsigned NOT NULL, ln char(5) NOT NULL default '', type char(3) NOT NULL default 'sn', value varchar(255) NOT NULL, PRIMARY KEY (id_bsrMETHOD, ln, type) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS bsrMETHODDATA ( id_bsrMETHOD mediumint(8) unsigned NOT NULL, data_dict longblob, data_dict_ordered longblob, data_list_sorted longblob, last_updated datetime, PRIMARY KEY (id_bsrMETHOD) ) ENGINE=MyISAM; CREATE TABLE 
IF NOT EXISTS bsrMETHODDATABUCKET ( id_bsrMETHOD mediumint(8) unsigned NOT NULL, bucket_no tinyint(2) NOT NULL, bucket_data longblob, bucket_last_value varchar(255), last_updated datetime, PRIMARY KEY (id_bsrMETHOD, bucket_no) ) ENGINE=MyISAM; CREATE TABLE IF NOT EXISTS collection_bsrMETHOD ( id_collection mediumint(9) unsigned NOT NULL, id_bsrMETHOD mediumint(9) unsigned NOT NULL, score tinyint(4) unsigned NOT NULL default '0', PRIMARY KEY (id_collection, id_bsrMETHOD) ) ENGINE=MyISAM; -- tables for sequence storage CREATE TABLE IF NOT EXISTS seqSTORE ( id int(15) NOT NULL auto_increment, seq_name varchar(15), seq_value varchar(20), PRIMARY KEY (id), UNIQUE KEY seq_name_value (seq_name, seq_value) ) ENGINE=MyISAM; -- end of file diff --git a/modules/miscutil/sql/tabdrop.sql b/modules/miscutil/sql/tabdrop.sql index 1dccb9dba..9eb46ade8 100644 --- a/modules/miscutil/sql/tabdrop.sql +++ b/modules/miscutil/sql/tabdrop.sql @@ -1,468 +1,469 @@ -- $Id$ -- This file is part of Invenio. -- Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 CERN. -- -- Invenio is free software; you can redistribute it and/or -- modify it under the terms of the GNU General Public License as -- published by the Free Software Foundation; either version 2 of the -- License, or (at your option) any later version. -- -- Invenio is distributed in the hope that it will be useful, but -- WITHOUT ANY WARRANTY; without even the implied warranty of -- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -- General Public License for more details. -- -- You should have received a copy of the GNU General Public License -- along with Invenio; if not, write to the Free Software Foundation, Inc., -- 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
DROP TABLE IF EXISTS bibrec; DROP TABLE IF EXISTS bib00x; DROP TABLE IF EXISTS bib01x; DROP TABLE IF EXISTS bib02x; DROP TABLE IF EXISTS bib03x; DROP TABLE IF EXISTS bib04x; DROP TABLE IF EXISTS bib05x; DROP TABLE IF EXISTS bib06x; DROP TABLE IF EXISTS bib07x; DROP TABLE IF EXISTS bib08x; DROP TABLE IF EXISTS bib09x; DROP TABLE IF EXISTS bib10x; DROP TABLE IF EXISTS bib11x; DROP TABLE IF EXISTS bib12x; DROP TABLE IF EXISTS bib13x; DROP TABLE IF EXISTS bib14x; DROP TABLE IF EXISTS bib15x; DROP TABLE IF EXISTS bib16x; DROP TABLE IF EXISTS bib17x; DROP TABLE IF EXISTS bib18x; DROP TABLE IF EXISTS bib19x; DROP TABLE IF EXISTS bib20x; DROP TABLE IF EXISTS bib21x; DROP TABLE IF EXISTS bib22x; DROP TABLE IF EXISTS bib23x; DROP TABLE IF EXISTS bib24x; DROP TABLE IF EXISTS bib25x; DROP TABLE IF EXISTS bib26x; DROP TABLE IF EXISTS bib27x; DROP TABLE IF EXISTS bib28x; DROP TABLE IF EXISTS bib29x; DROP TABLE IF EXISTS bib30x; DROP TABLE IF EXISTS bib31x; DROP TABLE IF EXISTS bib32x; DROP TABLE IF EXISTS bib33x; DROP TABLE IF EXISTS bib34x; DROP TABLE IF EXISTS bib35x; DROP TABLE IF EXISTS bib36x; DROP TABLE IF EXISTS bib37x; DROP TABLE IF EXISTS bib38x; DROP TABLE IF EXISTS bib39x; DROP TABLE IF EXISTS bib40x; DROP TABLE IF EXISTS bib41x; DROP TABLE IF EXISTS bib42x; DROP TABLE IF EXISTS bib43x; DROP TABLE IF EXISTS bib44x; DROP TABLE IF EXISTS bib45x; DROP TABLE IF EXISTS bib46x; DROP TABLE IF EXISTS bib47x; DROP TABLE IF EXISTS bib48x; DROP TABLE IF EXISTS bib49x; DROP TABLE IF EXISTS bib50x; DROP TABLE IF EXISTS bib51x; DROP TABLE IF EXISTS bib52x; DROP TABLE IF EXISTS bib53x; DROP TABLE IF EXISTS bib54x; DROP TABLE IF EXISTS bib55x; DROP TABLE IF EXISTS bib56x; DROP TABLE IF EXISTS bib57x; DROP TABLE IF EXISTS bib58x; DROP TABLE IF EXISTS bib59x; DROP TABLE IF EXISTS bib60x; DROP TABLE IF EXISTS bib61x; DROP TABLE IF EXISTS bib62x; DROP TABLE IF EXISTS bib63x; DROP TABLE IF EXISTS bib64x; DROP TABLE IF EXISTS bib65x; DROP TABLE IF EXISTS bib66x; DROP TABLE IF EXISTS 
bib67x; DROP TABLE IF EXISTS bib68x; DROP TABLE IF EXISTS bib69x; DROP TABLE IF EXISTS bib70x; DROP TABLE IF EXISTS bib71x; DROP TABLE IF EXISTS bib72x; DROP TABLE IF EXISTS bib73x; DROP TABLE IF EXISTS bib74x; DROP TABLE IF EXISTS bib75x; DROP TABLE IF EXISTS bib76x; DROP TABLE IF EXISTS bib77x; DROP TABLE IF EXISTS bib78x; DROP TABLE IF EXISTS bib79x; DROP TABLE IF EXISTS bib80x; DROP TABLE IF EXISTS bib81x; DROP TABLE IF EXISTS bib82x; DROP TABLE IF EXISTS bib83x; DROP TABLE IF EXISTS bib84x; DROP TABLE IF EXISTS bib85x; DROP TABLE IF EXISTS bib86x; DROP TABLE IF EXISTS bib87x; DROP TABLE IF EXISTS bib88x; DROP TABLE IF EXISTS bib89x; DROP TABLE IF EXISTS bib90x; DROP TABLE IF EXISTS bib91x; DROP TABLE IF EXISTS bib92x; DROP TABLE IF EXISTS bib93x; DROP TABLE IF EXISTS bib94x; DROP TABLE IF EXISTS bib95x; DROP TABLE IF EXISTS bib96x; DROP TABLE IF EXISTS bib97x; DROP TABLE IF EXISTS bib98x; DROP TABLE IF EXISTS bib99x; DROP TABLE IF EXISTS bibrec_bib00x; DROP TABLE IF EXISTS bibrec_bib01x; DROP TABLE IF EXISTS bibrec_bib02x; DROP TABLE IF EXISTS bibrec_bib03x; DROP TABLE IF EXISTS bibrec_bib04x; DROP TABLE IF EXISTS bibrec_bib05x; DROP TABLE IF EXISTS bibrec_bib06x; DROP TABLE IF EXISTS bibrec_bib07x; DROP TABLE IF EXISTS bibrec_bib08x; DROP TABLE IF EXISTS bibrec_bib09x; DROP TABLE IF EXISTS bibrec_bib10x; DROP TABLE IF EXISTS bibrec_bib11x; DROP TABLE IF EXISTS bibrec_bib12x; DROP TABLE IF EXISTS bibrec_bib13x; DROP TABLE IF EXISTS bibrec_bib14x; DROP TABLE IF EXISTS bibrec_bib15x; DROP TABLE IF EXISTS bibrec_bib16x; DROP TABLE IF EXISTS bibrec_bib17x; DROP TABLE IF EXISTS bibrec_bib18x; DROP TABLE IF EXISTS bibrec_bib19x; DROP TABLE IF EXISTS bibrec_bib20x; DROP TABLE IF EXISTS bibrec_bib21x; DROP TABLE IF EXISTS bibrec_bib22x; DROP TABLE IF EXISTS bibrec_bib23x; DROP TABLE IF EXISTS bibrec_bib24x; DROP TABLE IF EXISTS bibrec_bib25x; DROP TABLE IF EXISTS bibrec_bib26x; DROP TABLE IF EXISTS bibrec_bib27x; DROP TABLE IF EXISTS bibrec_bib28x; DROP TABLE IF 
EXISTS bibrec_bib29x; DROP TABLE IF EXISTS bibrec_bib30x; DROP TABLE IF EXISTS bibrec_bib31x; DROP TABLE IF EXISTS bibrec_bib32x; DROP TABLE IF EXISTS bibrec_bib33x; DROP TABLE IF EXISTS bibrec_bib34x; DROP TABLE IF EXISTS bibrec_bib35x; DROP TABLE IF EXISTS bibrec_bib36x; DROP TABLE IF EXISTS bibrec_bib37x; DROP TABLE IF EXISTS bibrec_bib38x; DROP TABLE IF EXISTS bibrec_bib39x; DROP TABLE IF EXISTS bibrec_bib40x; DROP TABLE IF EXISTS bibrec_bib41x; DROP TABLE IF EXISTS bibrec_bib42x; DROP TABLE IF EXISTS bibrec_bib43x; DROP TABLE IF EXISTS bibrec_bib44x; DROP TABLE IF EXISTS bibrec_bib45x; DROP TABLE IF EXISTS bibrec_bib46x; DROP TABLE IF EXISTS bibrec_bib47x; DROP TABLE IF EXISTS bibrec_bib48x; DROP TABLE IF EXISTS bibrec_bib49x; DROP TABLE IF EXISTS bibrec_bib50x; DROP TABLE IF EXISTS bibrec_bib51x; DROP TABLE IF EXISTS bibrec_bib52x; DROP TABLE IF EXISTS bibrec_bib53x; DROP TABLE IF EXISTS bibrec_bib54x; DROP TABLE IF EXISTS bibrec_bib55x; DROP TABLE IF EXISTS bibrec_bib56x; DROP TABLE IF EXISTS bibrec_bib57x; DROP TABLE IF EXISTS bibrec_bib58x; DROP TABLE IF EXISTS bibrec_bib59x; DROP TABLE IF EXISTS bibrec_bib60x; DROP TABLE IF EXISTS bibrec_bib61x; DROP TABLE IF EXISTS bibrec_bib62x; DROP TABLE IF EXISTS bibrec_bib63x; DROP TABLE IF EXISTS bibrec_bib64x; DROP TABLE IF EXISTS bibrec_bib65x; DROP TABLE IF EXISTS bibrec_bib66x; DROP TABLE IF EXISTS bibrec_bib67x; DROP TABLE IF EXISTS bibrec_bib68x; DROP TABLE IF EXISTS bibrec_bib69x; DROP TABLE IF EXISTS bibrec_bib70x; DROP TABLE IF EXISTS bibrec_bib71x; DROP TABLE IF EXISTS bibrec_bib72x; DROP TABLE IF EXISTS bibrec_bib73x; DROP TABLE IF EXISTS bibrec_bib74x; DROP TABLE IF EXISTS bibrec_bib75x; DROP TABLE IF EXISTS bibrec_bib76x; DROP TABLE IF EXISTS bibrec_bib77x; DROP TABLE IF EXISTS bibrec_bib78x; DROP TABLE IF EXISTS bibrec_bib79x; DROP TABLE IF EXISTS bibrec_bib80x; DROP TABLE IF EXISTS bibrec_bib81x; DROP TABLE IF EXISTS bibrec_bib82x; DROP TABLE IF EXISTS bibrec_bib83x; DROP TABLE IF EXISTS 
bibrec_bib84x; DROP TABLE IF EXISTS bibrec_bib85x; DROP TABLE IF EXISTS bibrec_bib86x; DROP TABLE IF EXISTS bibrec_bib87x; DROP TABLE IF EXISTS bibrec_bib88x; DROP TABLE IF EXISTS bibrec_bib89x; DROP TABLE IF EXISTS bibrec_bib90x; DROP TABLE IF EXISTS bibrec_bib91x; DROP TABLE IF EXISTS bibrec_bib92x; DROP TABLE IF EXISTS bibrec_bib93x; DROP TABLE IF EXISTS bibrec_bib94x; DROP TABLE IF EXISTS bibrec_bib95x; DROP TABLE IF EXISTS bibrec_bib96x; DROP TABLE IF EXISTS bibrec_bib97x; DROP TABLE IF EXISTS bibrec_bib98x; DROP TABLE IF EXISTS bibrec_bib99x; DROP TABLE IF EXISTS bibfmt; DROP TABLE IF EXISTS idxINDEX; DROP TABLE IF EXISTS idxINDEXNAME; DROP TABLE IF EXISTS idxINDEX_field; DROP TABLE IF EXISTS idxWORD01F; DROP TABLE IF EXISTS idxWORD02F; DROP TABLE IF EXISTS idxWORD03F; DROP TABLE IF EXISTS idxWORD04F; DROP TABLE IF EXISTS idxWORD05F; DROP TABLE IF EXISTS idxWORD06F; DROP TABLE IF EXISTS idxWORD07F; DROP TABLE IF EXISTS idxWORD08F; DROP TABLE IF EXISTS idxWORD09F; DROP TABLE IF EXISTS idxWORD10F; DROP TABLE IF EXISTS idxWORD11F; DROP TABLE IF EXISTS idxWORD12F; DROP TABLE IF EXISTS idxWORD13F; DROP TABLE IF EXISTS idxWORD14F; DROP TABLE IF EXISTS idxWORD15F; DROP TABLE IF EXISTS idxWORD16F; DROP TABLE IF EXISTS idxWORD17F; DROP TABLE IF EXISTS idxWORD18F; DROP TABLE IF EXISTS idxWORD01R; DROP TABLE IF EXISTS idxWORD02R; DROP TABLE IF EXISTS idxWORD03R; DROP TABLE IF EXISTS idxWORD04R; DROP TABLE IF EXISTS idxWORD05R; DROP TABLE IF EXISTS idxWORD06R; DROP TABLE IF EXISTS idxWORD07R; DROP TABLE IF EXISTS idxWORD08R; DROP TABLE IF EXISTS idxWORD09R; DROP TABLE IF EXISTS idxWORD10R; DROP TABLE IF EXISTS idxWORD11R; DROP TABLE IF EXISTS idxWORD12R; DROP TABLE IF EXISTS idxWORD13R; DROP TABLE IF EXISTS idxWORD14R; DROP TABLE IF EXISTS idxWORD15R; DROP TABLE IF EXISTS idxWORD16R; DROP TABLE IF EXISTS idxWORD17R; DROP TABLE IF EXISTS idxWORD18R; DROP TABLE IF EXISTS idxPAIR01F; DROP TABLE IF EXISTS idxPAIR02F; DROP TABLE IF EXISTS idxPAIR03F; DROP TABLE IF EXISTS 
idxPAIR04F; DROP TABLE IF EXISTS idxPAIR05F; DROP TABLE IF EXISTS idxPAIR06F; DROP TABLE IF EXISTS idxPAIR07F; DROP TABLE IF EXISTS idxPAIR08F; DROP TABLE IF EXISTS idxPAIR09F; DROP TABLE IF EXISTS idxPAIR10F; DROP TABLE IF EXISTS idxPAIR11F; DROP TABLE IF EXISTS idxPAIR12F; DROP TABLE IF EXISTS idxPAIR13F; DROP TABLE IF EXISTS idxPAIR14F; DROP TABLE IF EXISTS idxPAIR15F; DROP TABLE IF EXISTS idxPAIR16F; DROP TABLE IF EXISTS idxPAIR17F; DROP TABLE IF EXISTS idxPAIR18F; DROP TABLE IF EXISTS idxPAIR01R; DROP TABLE IF EXISTS idxPAIR02R; DROP TABLE IF EXISTS idxPAIR03R; DROP TABLE IF EXISTS idxPAIR04R; DROP TABLE IF EXISTS idxPAIR05R; DROP TABLE IF EXISTS idxPAIR06R; DROP TABLE IF EXISTS idxPAIR07R; DROP TABLE IF EXISTS idxPAIR08R; DROP TABLE IF EXISTS idxPAIR09R; DROP TABLE IF EXISTS idxPAIR10R; DROP TABLE IF EXISTS idxPAIR11R; DROP TABLE IF EXISTS idxPAIR12R; DROP TABLE IF EXISTS idxPAIR13R; DROP TABLE IF EXISTS idxPAIR14R; DROP TABLE IF EXISTS idxPAIR15R; DROP TABLE IF EXISTS idxPAIR16R; DROP TABLE IF EXISTS idxPAIR17R; DROP TABLE IF EXISTS idxPAIR18R; DROP TABLE IF EXISTS idxPHRASE01F; DROP TABLE IF EXISTS idxPHRASE02F; DROP TABLE IF EXISTS idxPHRASE03F; DROP TABLE IF EXISTS idxPHRASE04F; DROP TABLE IF EXISTS idxPHRASE05F; DROP TABLE IF EXISTS idxPHRASE06F; DROP TABLE IF EXISTS idxPHRASE07F; DROP TABLE IF EXISTS idxPHRASE08F; DROP TABLE IF EXISTS idxPHRASE09F; DROP TABLE IF EXISTS idxPHRASE10F; DROP TABLE IF EXISTS idxPHRASE11F; DROP TABLE IF EXISTS idxPHRASE12F; DROP TABLE IF EXISTS idxPHRASE13F; DROP TABLE IF EXISTS idxPHRASE14F; DROP TABLE IF EXISTS idxPHRASE15F; DROP TABLE IF EXISTS idxPHRASE16F; DROP TABLE IF EXISTS idxPHRASE17F; DROP TABLE IF EXISTS idxPHRASE18F; DROP TABLE IF EXISTS idxPHRASE01R; DROP TABLE IF EXISTS idxPHRASE02R; DROP TABLE IF EXISTS idxPHRASE03R; DROP TABLE IF EXISTS idxPHRASE04R; DROP TABLE IF EXISTS idxPHRASE05R; DROP TABLE IF EXISTS idxPHRASE06R; DROP TABLE IF EXISTS idxPHRASE07R; DROP TABLE IF EXISTS idxPHRASE08R; DROP TABLE IF EXISTS 
idxPHRASE09R; DROP TABLE IF EXISTS idxPHRASE10R; DROP TABLE IF EXISTS idxPHRASE11R; DROP TABLE IF EXISTS idxPHRASE12R; DROP TABLE IF EXISTS idxPHRASE13R; DROP TABLE IF EXISTS idxPHRASE14R; DROP TABLE IF EXISTS idxPHRASE15R; DROP TABLE IF EXISTS idxPHRASE16R; DROP TABLE IF EXISTS idxPHRASE17R; DROP TABLE IF EXISTS idxPHRASE18R; DROP TABLE IF EXISTS rnkMETHOD; DROP TABLE IF EXISTS rnkMETHODNAME; DROP TABLE IF EXISTS rnkMETHODDATA; DROP TABLE IF EXISTS rnkWORD01F; DROP TABLE IF EXISTS rnkWORD01R; DROP TABLE IF EXISTS rnkPAGEVIEWS; DROP TABLE IF EXISTS rnkDOWNLOADS; DROP TABLE IF EXISTS rnkCITATIONDATA; DROP TABLE IF EXISTS rnkCITATIONDATAEXT; DROP TABLE IF EXISTS rnkAUTHORDATA; DROP TABLE IF EXISTS collection_rnkMETHOD; DROP TABLE IF EXISTS collection; DROP TABLE IF EXISTS collectionname; DROP TABLE IF EXISTS oaiREPOSITORY; DROP TABLE IF EXISTS oaiHARVEST; DROP TABLE IF EXISTS oaiHARVESTLOG; DROP TABLE IF EXISTS bibHOLDINGPEN; DROP TABLE IF EXISTS collection_collection; DROP TABLE IF EXISTS collection_portalbox; DROP TABLE IF EXISTS portalbox; DROP TABLE IF EXISTS collection_example; DROP TABLE IF EXISTS example; DROP TABLE IF EXISTS collection_format; DROP TABLE IF EXISTS format; DROP TABLE IF EXISTS formatname; DROP TABLE IF EXISTS collection_field_fieldvalue; DROP TABLE IF EXISTS field; DROP TABLE IF EXISTS fieldname; DROP TABLE IF EXISTS fieldvalue; DROP TABLE IF EXISTS field_tag; DROP TABLE IF EXISTS tag; DROP TABLE IF EXISTS publreq; DROP TABLE IF EXISTS session; DROP TABLE IF EXISTS user; DROP TABLE IF EXISTS accROLE; DROP TABLE IF EXISTS accMAILCOOKIE; DROP TABLE IF EXISTS user_accROLE; DROP TABLE IF EXISTS accACTION; DROP TABLE IF EXISTS accARGUMENT; DROP TABLE IF EXISTS accROLE_accACTION_accARGUMENT; DROP TABLE IF EXISTS user_query; DROP TABLE IF EXISTS query; DROP TABLE IF EXISTS user_basket; DROP TABLE IF EXISTS basket; DROP TABLE IF EXISTS basket_record; DROP TABLE IF EXISTS record; DROP TABLE IF EXISTS user_query_basket; DROP TABLE IF EXISTS 
cmtRECORDCOMMENT; DROP TABLE IF EXISTS knwKB; DROP TABLE IF EXISTS knwKBRVAL; DROP TABLE IF EXISTS knwKBDDEF; DROP TABLE IF EXISTS sbmACTION; DROP TABLE IF EXISTS sbmALLFUNCDESCR; DROP TABLE IF EXISTS sbmAPPROVAL; DROP TABLE IF EXISTS sbmCPLXAPPROVAL; DROP TABLE IF EXISTS sbmCOLLECTION; DROP TABLE IF EXISTS sbmCOLLECTION_sbmCOLLECTION; DROP TABLE IF EXISTS sbmCOLLECTION_sbmDOCTYPE; DROP TABLE IF EXISTS sbmCATEGORIES; DROP TABLE IF EXISTS sbmCHECKS; DROP TABLE IF EXISTS sbmCOOKIES; DROP TABLE IF EXISTS sbmDOCTYPE; DROP TABLE IF EXISTS sbmFIELD; DROP TABLE IF EXISTS sbmFIELDDESC; DROP TABLE IF EXISTS sbmFORMATEXTENSION; DROP TABLE IF EXISTS sbmFUNCTIONS; DROP TABLE IF EXISTS sbmFUNDESC; DROP TABLE IF EXISTS sbmGFILERESULT; DROP TABLE IF EXISTS sbmIMPLEMENT; DROP TABLE IF EXISTS sbmPARAMETERS; DROP TABLE IF EXISTS sbmPUBLICATION; DROP TABLE IF EXISTS sbmPUBLICATIONCOMM; DROP TABLE IF EXISTS sbmPUBLICATIONDATA; DROP TABLE IF EXISTS sbmREFEREES; DROP TABLE IF EXISTS sbmSUBMISSIONS; DROP TABLE IF EXISTS schTASK; DROP TABLE IF EXISTS bibdoc; DROP TABLE IF EXISTS bibdoc_bibdoc; DROP TABLE IF EXISTS bibrec_bibdoc; +DROP TABLE IF EXISTS bibdocfsinfo; DROP TABLE IF EXISTS usergroup; DROP TABLE IF EXISTS user_usergroup; DROP TABLE IF EXISTS user_basket; DROP TABLE IF EXISTS msgMESSAGE; DROP TABLE IF EXISTS user_msgMESSAGE; DROP TABLE IF EXISTS bskBASKET; DROP TABLE IF EXISTS bskEXTREC; DROP TABLE IF EXISTS bskEXTFMT; DROP TABLE IF EXISTS bskREC; DROP TABLE IF EXISTS bskRECORDCOMMENT; DROP TABLE IF EXISTS cmtACTIONHISTORY; DROP TABLE IF EXISTS cmtSUBSCRIPTION; DROP TABLE IF EXISTS user_bskBASKET; DROP TABLE IF EXISTS usergroup_bskBASKET; DROP TABLE IF EXISTS collection_externalcollection; DROP TABLE IF EXISTS externalcollection; DROP TABLE IF EXISTS collectiondetailedrecordpagetabs; DROP TABLE IF EXISTS staEVENT; DROP TABLE IF EXISTS clsMETHOD; DROP TABLE IF EXISTS collection_clsMETHOD; DROP TABLE IF EXISTS jrnJOURNAL; DROP TABLE IF EXISTS jrnISSUE; DROP TABLE IF EXISTS 
hstRECORD; DROP TABLE IF EXISTS hstDOCUMENT; DROP TABLE IF EXISTS hstTASK; DROP TABLE IF EXISTS hstBATCHUPLOAD; DROP TABLE IF EXISTS crcBORROWER; DROP TABLE IF EXISTS crcILLREQUEST; DROP TABLE IF EXISTS crcITEM; DROP TABLE IF EXISTS crcLIBRARY; DROP TABLE IF EXISTS crcLOAN; DROP TABLE IF EXISTS crcLOANREQUEST; DROP TABLE IF EXISTS crcPURCHASE; DROP TABLE IF EXISTS crcVENDOR; DROP TABLE IF EXISTS expJOB; DROP TABLE IF EXISTS expQUERY; DROP TABLE IF EXISTS expJOB_expQUERY; DROP TABLE IF EXISTS expQUERYRESULT; DROP TABLE IF EXISTS expJOBRESULT; DROP TABLE IF EXISTS expJOBRESULT_expQUERYRESULT; DROP TABLE IF EXISTS user_expJOB; DROP TABLE IF EXISTS swrREMOTESERVER; DROP TABLE IF EXISTS swrCLIENTDATA; DROP TABLE IF EXISTS hstEXCEPTION; DROP TABLE IF EXISTS aidUSERINPUTLOG; DROP TABLE IF EXISTS aidCACHE; DROP TABLE IF EXISTS aidPERSONIDDATA; DROP TABLE IF EXISTS aidPERSONIDPAPERS; DROP TABLE IF EXISTS aidRESULTS; DROP TABLE IF EXISTS aidPROBCACHE; DROP TABLE IF EXISTS xtrJOB; DROP TABLE IF EXISTS bsrMETHOD; DROP TABLE IF EXISTS bsrMETHODNAME; DROP TABLE IF EXISTS bsrMETHODDATA; DROP TABLE IF EXISTS bsrMETHODDATABUCKET; DROP TABLE IF EXISTS collection_bsrMETHOD; -- end of file diff --git a/modules/websubmit/lib/bibdocfile.py b/modules/websubmit/lib/bibdocfile.py index 7d28dbdac..562ac0609 100644 --- a/modules/websubmit/lib/bibdocfile.py +++ b/modules/websubmit/lib/bibdocfile.py @@ -1,3991 +1,4007 @@ ## This file is part of Invenio. ## Copyright (C) 2007, 2008, 2009, 2010, 2011, 2012 CERN. ## ## Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. 
## ## You should have received a copy of the GNU General Public License ## along with Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. """ This module implements the low-level API for dealing with fulltext files. - All the files associated with a I{record} (identified by a I{recid}) can be managed via an instance of the C{BibRecDocs} class. - A C{BibRecDocs} is a wrapper of the list of I{documents} attached to the record. - Each document is represented by an instance of the C{BibDoc} class. - A document is identified by a C{docid} and name (C{docname}). The docname must be unique within the record. A document is the set of all the formats and revisions of a piece of information. - A document has a type called C{doctype} and can have a restriction. - Each physical file, i.e. the concretization of a document into a particular I{version} and I{format} is represented by an instance of the C{BibDocFile} class. - The format is in fact the extension of the physical file. - A comment and a description and other information can be associated with a BibDocFile. - A C{bibdoc} is a synonym for a document, while a C{bibdocfile} is a synonym for a physical file. 
@group Main classes: BibRecDocs,BibDoc,BibDocFile @group Other classes: BibDocMoreInfo,Md5Folder,InvenioWebSubmitFileError @group Main functions: decompose_file,stream_file,bibdocfile_*,download_url @group Configuration Variables: CFG_* """ __revision__ = "$Id$" import os import re import shutil import filecmp import time import random import socket import urllib2 import urllib import tempfile import cPickle import base64 import binascii import cgi import sys from warnings import warn if sys.hexversion < 0x2060000: from md5 import md5 else: from hashlib import md5 try: import magic if not hasattr(magic, "open"): raise ImportError CFG_HAS_MAGIC = True except ImportError: CFG_HAS_MAGIC = False ## The flag below controls whether HTTP range requests are supported or not ## when serving static files via Python. This is disabled by default as ## it currently breaks support for opening PDF files on Windows platforms ## using the Acrobat Reader browser plugin. CFG_ENABLE_HTTP_RANGE_REQUESTS = False from datetime import datetime from mimetypes import MimeTypes from thread import get_ident from invenio import webinterface_handler_config as apache ## Let's set a reasonable timeout for URL requests (e.g. 
FFT) socket.setdefaulttimeout(40) if sys.hexversion < 0x2040000: # pylint: disable=W0622 from sets import Set as set # pylint: enable=W0622 from invenio.shellutils import escape_shell_arg from invenio.dbquery import run_sql, DatabaseError, blob_to_string from invenio.errorlib import register_exception from invenio.bibrecord import record_get_field_instances, \ field_get_subfield_values, field_get_subfield_instances, \ encode_for_xml from invenio.urlutils import create_url from invenio.textutils import nice_size from invenio.access_control_engine import acc_authorize_action from invenio.webuser import collect_user_info from invenio.access_control_admin import acc_is_user_in_role, acc_get_role_id from invenio.access_control_firerole import compile_role_definition, acc_firerole_check_user from invenio.access_control_config import SUPERADMINROLE, CFG_WEBACCESS_WARNING_MSGS from invenio.config import CFG_SITE_LANG, CFG_SITE_URL, \ CFG_WEBDIR, CFG_WEBSUBMIT_FILEDIR,\ CFG_WEBSUBMIT_ADDITIONAL_KNOWN_FILE_EXTENSIONS, \ CFG_WEBSUBMIT_FILESYSTEM_BIBDOC_GROUP_LIMIT, CFG_SITE_SECURE_URL, \ CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS, \ CFG_TMPDIR, CFG_TMPSHAREDDIR, CFG_PATH_MD5SUM, \ CFG_WEBSUBMIT_STORAGEDIR, \ CFG_BIBDOCFILE_USE_XSENDFILE, \ CFG_BIBDOCFILE_MD5_CHECK_PROBABILITY, \ CFG_SITE_RECORD, \ - CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS + CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS, \ + CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE from invenio.websubmit_config import CFG_WEBSUBMIT_ICON_SUBFORMAT_RE, \ CFG_WEBSUBMIT_DEFAULT_ICON_SUBFORMAT import invenio.template websubmit_templates = invenio.template.load('websubmit') websearch_templates = invenio.template.load('websearch') #: block size when performing I/O. CFG_BIBDOCFILE_BLOCK_SIZE = 1024 * 8 #: threshold used to decide when to use the Python MD5 or the CLI MD5 algorithm. CFG_BIBDOCFILE_MD5_THRESHOLD = 256 * 1024 #: chunks loaded by the Python MD5 algorithm. CFG_BIBDOCFILE_MD5_BUFFER = 1024 * 1024 #: whether to normalize e.g.
".JPEG" and ".jpg" into .jpeg. CFG_BIBDOCFILE_STRONG_FORMAT_NORMALIZATION = False #: flags that can be associated to files. CFG_BIBDOCFILE_AVAILABLE_FLAGS = ( 'PDF/A', 'STAMPED', 'PDFOPT', 'HIDDEN', 'CONVERTED', 'PERFORM_HIDE_PREVIOUS', 'OCRED' ) #: constant used if FFT correct with the obvious meaning. KEEP_OLD_VALUE = 'KEEP-OLD-VALUE' _CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS = [(re.compile(_regex), _headers) for _regex, _headers in CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS] _mimes = MimeTypes(strict=False) _mimes.suffix_map.update({'.tbz2' : '.tar.bz2'}) _mimes.encodings_map.update({'.bz2' : 'bzip2'}) _magic_cookies = {} def _get_magic_cookies(): """ @return: a tuple of magic object. @rtype: (MAGIC_NONE, MAGIC_COMPRESS, MAGIC_MIME, MAGIC_COMPRESS + MAGIC_MIME) @note: ... not real magic. Just see: man file(1) """ thread_id = get_ident() if thread_id not in _magic_cookies: _magic_cookies[thread_id] = { magic.MAGIC_NONE : magic.open(magic.MAGIC_NONE), magic.MAGIC_COMPRESS : magic.open(magic.MAGIC_COMPRESS), magic.MAGIC_MIME : magic.open(magic.MAGIC_MIME), magic.MAGIC_COMPRESS + magic.MAGIC_MIME : magic.open(magic.MAGIC_COMPRESS + magic.MAGIC_MIME) } for key in _magic_cookies[thread_id].keys(): _magic_cookies[thread_id][key].load() return _magic_cookies[thread_id] def _generate_extensions(): """ Generate the regular expression to match all the known extensions. @return: the regular expression. @rtype: regular expression object """ _tmp_extensions = _mimes.encodings_map.keys() + \ _mimes.suffix_map.keys() + \ _mimes.types_map[1].keys() + \ CFG_WEBSUBMIT_ADDITIONAL_KNOWN_FILE_EXTENSIONS extensions = [] for ext in _tmp_extensions: if ext.startswith('.'): extensions.append(ext) else: extensions.append('.' 
+ ext) extensions.sort() extensions.reverse() extensions = set([ext.lower() for ext in extensions]) extensions = '\\' + '$|\\'.join(extensions) + '$' extensions = extensions.replace('+', '\\+') return re.compile(extensions, re.I) #: Regular expression to recognized extensions. _extensions = _generate_extensions() class InvenioWebSubmitFileError(Exception): """ Exception raised in case of errors related to fulltext files. """ pass class InvenioBibdocfileUnauthorizedURL(Exception): """ Exception raised when one tries to download an unauthorized external URL. """ pass def file_strip_ext(afile, skip_version=False, only_known_extensions=False, allow_subformat=True): """ Strip in the best way the extension from a filename. >>> file_strip_ext("foo.tar.gz") 'foo' >>> file_strip_ext("foo.buz.gz") 'foo.buz' >>> file_strip_ext("foo.buz") 'foo' >>> file_strip_ext("foo.buz", only_known_extensions=True) 'foo.buz' >>> file_strip_ext("foo.buz;1", skip_version=False, ... only_known_extensions=True) 'foo.buz;1' >>> file_strip_ext("foo.gif;icon") 'foo' >>> file_strip_ext("foo.gif:icon", allow_subformat=False) 'foo.gif:icon' @param afile: the path/name of a file. @type afile: string @param skip_version: whether to skip a trailing ";version". @type skip_version: bool @param only_known_extensions: whether to strip out only known extensions or to consider as extension anything that follows a dot. @type only_known_extensions: bool @param allow_subformat: whether to consider also subformats as part of the extension. @type allow_subformat: bool @return: the name/path without the extension (and version). @rtype: string """ if skip_version or allow_subformat: afile = afile.split(';')[0] nextfile = _extensions.sub('', afile) if nextfile == afile and not only_known_extensions: nextfile = os.path.splitext(afile)[0] while nextfile != afile: afile = nextfile nextfile = _extensions.sub('', afile) return nextfile def normalize_format(format, allow_subformat=True): """ Normalize the format, e.g. 
by adding a dot in front. @param format: the format/extension to be normalized. @type format: string @param allow_subformat: whether to consider also subformats as part of the extension. @type allow_subformat: bool @return: the normalized format. @rtype: string """ if allow_subformat: subformat = format[format.rfind(';'):] format = format[:format.rfind(';')] else: subformat = '' if format and format[0] != '.': format = '.' + format if CFG_BIBDOCFILE_STRONG_FORMAT_NORMALIZATION: if format not in ('.Z', '.H', '.C', '.CC'): format = format.lower() format = { '.jpg' : '.jpeg', '.htm' : '.html', '.tif' : '.tiff' }.get(format, format) return format + subformat def guess_format_from_url(url): """ Given a URL, try to guess its extension. Different methods will be used, including an HTTP HEAD query, and downloading the resource and inspecting its MIME type. @param url: the URL for which the extension should be guessed. @type url: string @return: the recognized extension or empty string if it's impossible to recognize it.
@rtype: string """ ## Let's try to guess the extension by considering the URL as a filename ext = decompose_file(url, skip_version=True, only_known_extensions=True)[2] if ext.startswith('.'): return ext if is_url_a_local_file(url) and CFG_HAS_MAGIC: ## if the URL corresponds to a local file, let's try to use ## the Python magic library to guess it try: magic_cookie = _get_magic_cookies()[magic.MAGIC_MIME] mimetype = magic_cookie.file(url) ext = _mimes.guess_extension(mimetype) if ext: return normalize_format(ext) except Exception: pass else: ## Since the URL is remote, let's try to perform a HEAD request ## and see the corresponding headers try: response = open_url(url, head_request=True) except (InvenioBibdocfileUnauthorizedURL, urllib2.URLError): return "" format = get_format_from_http_response(response) if format: return format if CFG_HAS_MAGIC: ## Last solution: let's download the remote resource ## and use the Python magic library to guess the extension try: filename = download_url(url, format='') magic_cookie = _get_magic_cookies()[magic.MAGIC_MIME] mimetype = magic_cookie.file(filename) os.remove(filename) ext = _mimes.guess_extension(mimetype) if ext: return normalize_format(ext) except Exception: pass return "" _docname_re = re.compile(r'[^-\w.]*') def normalize_docname(docname): """ Normalize the docname. At the moment the normalization is just returning the same string. @param docname: the docname to be normalized. @type docname: string @return: the normalized docname. @rtype: string """ #return _docname_re.sub('', docname) return docname def normalize_version(version): """ Normalize the version. The version can be either an integer or the keyword 'all'. Any other value will be transformed into the empty string. @param version: the version (either a number or 'all'). @type version: integer or string @return: the normalized version. 
@rtype: string """ try: int(version) except ValueError: if version.lower().strip() == 'all': return 'all' else: return '' return str(version) def compose_file(dirname, docname, extension, subformat=None, version=None): """ Construct back a fullpath given the separate components. """ if version: version = ";%i" % int(version) else: version = "" if subformat: if not subformat.startswith(";"): subformat = ";%s" % subformat else: subformat = "" if extension and not extension.startswith("."): extension = ".%s" % extension return os.path.join(dirname, docname + extension + subformat + version) def compose_format(extension, subformat=None): """ Construct the format string """ if not extension.startswith("."): extension = ".%s" % extension if subformat: if not subformat.startswith(";"): subformat = ";%s" % subformat else: subformat = "" return extension + subformat def decompose_file(afile, skip_version=False, only_known_extensions=False, allow_subformat=True): """ Decompose a file/path into its components dirname, basename and extension. >>> decompose_file('/tmp/foo.tar.gz') ('/tmp', 'foo', '.tar.gz') >>> decompose_file('/tmp/foo.tar.gz;1', skip_version=True) ('/tmp', 'foo', '.tar.gz') >>> decompose_file('http://www.google.com/index.html') ('http://www.google.com', 'index', '.html') @param afile: the path/name of a file. @type afile: string @param skip_version: whether to skip a trailing ";version". @type skip_version: bool @param only_known_extensions: whether to strip out only known extensions or to consider as extension anything that follows a dot. @type only_known_extensions: bool @param allow_subformat: whether to consider also subformats as part of the extension. @type allow_subformat: bool @return: a tuple with the directory name, the docname and extension. @rtype: (dirname, docname, extension) @note: if a URL is provided, the scheme will be part of the dirname. @see: L{file_strip_ext} for the algorithm used to retrieve the extension. 
""" if skip_version: version = afile.split(';')[-1] try: int(version) afile = afile[:-len(version)-1] except ValueError: pass basename = os.path.basename(afile) dirname = afile[:-len(basename)-1] base = file_strip_ext( basename, only_known_extensions=only_known_extensions, allow_subformat=allow_subformat) extension = basename[len(base) + 1:] if extension: extension = '.' + extension return (dirname, base, extension) def decompose_file_with_version(afile): """ Decompose a file into dirname, basename, extension and version. >>> decompose_file_with_version('/tmp/foo.tar.gz;1') ('/tmp', 'foo', '.tar.gz', 1) @param afile: the path/name of a file. @type afile: string @return: a tuple with the directory name, the docname, extension and version. @rtype: (dirname, docname, extension, version) @raise ValueError: in case version does not exist it will. @note: if a URL is provided, the scheme will be part of the dirname. """ version_str = afile.split(';')[-1] version = int(version_str) afile = afile[:-len(version_str)-1] basename = os.path.basename(afile) dirname = afile[:-len(basename)-1] base = file_strip_ext(basename) extension = basename[len(base) + 1:] if extension: extension = '.' + extension return (dirname, base, extension, version) def get_subformat_from_format(format): """ @return the subformat if any. @rtype: string >>> get_superformat_from_format('foo;bar') 'bar' >>> get_superformat_from_format('foo') '' """ try: return format[format.rindex(';') + 1:] except ValueError: return '' def get_superformat_from_format(format): """ @return the superformat if any. @rtype: string >>> get_superformat_from_format('foo;bar') 'foo' >>> get_superformat_from_format('foo') 'foo' """ try: return format[:format.rindex(';')] except ValueError: return format def propose_next_docname(docname): """ Given a I{docname}, suggest a new I{docname} (useful when trying to generate a unique I{docname}). 
>>> propose_next_docname('foo') 'foo_1' >>> propose_next_docname('foo_1') 'foo_2' >>> propose_next_docname('foo_10') 'foo_11' @param docname: the base docname. @type docname: string @return: the next possible docname based on the given one. @rtype: string """ if '_' in docname: split_docname = docname.split('_') try: split_docname[-1] = str(int(split_docname[-1]) + 1) docname = '_'.join(split_docname) except ValueError: docname += '_1' else: docname += '_1' return docname class BibRecDocs: """ This class represents all the files attached to one record. @param recid: the record identifier. @type recid: integer @param deleted_too: whether to consider deleted documents as normal documents (useful when trying to recover deleted information). @type deleted_too: bool @param human_readable: whether numbers should be printed in human readable format (e.g. 2048 bytes -> 2Kb) @ivar id: the record identifier as passed to the constructor. @type id: integer @ivar human_readable: the human_readable flag as passed to the constructor. @type human_readable: bool @ivar deleted_too: the deleted_too flag as passed to the constructor. @type deleted_too: bool @ivar bibdocs: the list of documents attached to the record. @type bibdocs: list of BibDoc """ def __init__(self, recid, deleted_too=False, human_readable=False): try: self.id = int(recid) except ValueError: raise ValueError("BibRecDocs: recid is %s but must be an integer." % repr(recid)) self.human_readable = human_readable self.deleted_too = deleted_too self.bibdocs = [] self.build_bibdoc_list() def __repr__(self): """ @return: the canonical string representation of the C{BibRecDocs}. @rtype: string """ return 'BibRecDocs(%s%s%s)' % (self.id, self.deleted_too and ', True' or '', self.human_readable and ', True' or '' ) def __str__(self): """ @return: an easy to be I{grepped} string representation of the whole C{BibRecDocs} content. 
@rtype: string """ out = '%i::::total bibdocs attached=%i\n' % (self.id, len(self.bibdocs)) out += '%i::::total size latest version=%s\n' % (self.id, nice_size(self.get_total_size_latest_version())) out += '%i::::total size all files=%s\n' % (self.id, nice_size(self.get_total_size())) for bibdoc in self.bibdocs: out += str(bibdoc) return out def empty_p(self): """ @return: True when the record has no attached documents. @rtype: bool """ return len(self.bibdocs) == 0 def deleted_p(self): """ @return: True if the corresponding record has been deleted. @rtype: bool """ from invenio.search_engine import record_exists return record_exists(self.id) == -1 def get_xml_8564(self): """ Return a snippet of I{MARCXML} representing the I{8564} fields corresponding to the current state. @return: the MARCXML representation. @rtype: string """ from invenio.search_engine import get_record out = '' record = get_record(self.id) fields = record_get_field_instances(record, '856', '4', ' ') for field in fields: urls = field_get_subfield_values(field, 'u') if urls and not bibdocfile_url_p(urls[0]): out += '\t<datafield tag="856" ind1="4" ind2=" ">\n' for subfield, value in field_get_subfield_instances(field): out += '\t\t<subfield code="%s">%s</subfield>\n' % (subfield, encode_for_xml(value)) out += '\t</datafield>\n' for afile in self.list_latest_files(list_hidden=False): out += '\t<datafield tag="856" ind1="4" ind2=" ">\n' url = afile.get_url() description = afile.get_description() comment = afile.get_comment() if url: out += '\t\t<subfield code="u">%s</subfield>\n' % encode_for_xml(url) if description: out += '\t\t<subfield code="y">%s</subfield>\n' % encode_for_xml(description) if comment: out += '\t\t<subfield code="z">%s</subfield>\n' % encode_for_xml(comment) out += '\t</datafield>\n' return out def get_total_size_latest_version(self): """ Returns the total size used on disk by all the files belonging to this record and corresponding to the latest version. @return: the total size.
@rtype: integer """ size = 0 for bibdoc in self.bibdocs: size += bibdoc.get_total_size_latest_version() return size def get_total_size(self): """ Return the total size used on disk of all the files belonging to this record of any version (not only the last as in L{get_total_size_latest_version}). @return: the total size. @rtype: integer """ size = 0 for bibdoc in self.bibdocs: size += bibdoc.get_total_size() return size def build_bibdoc_list(self): """ This method must be called everytime a I{bibdoc} is added, removed or modified. """ self.bibdocs = [] if self.deleted_too: res = run_sql("""SELECT id_bibdoc, type FROM bibrec_bibdoc JOIN bibdoc ON id=id_bibdoc WHERE id_bibrec=%s ORDER BY docname ASC""", (self.id,)) else: res = run_sql("""SELECT id_bibdoc, type FROM bibrec_bibdoc JOIN bibdoc ON id=id_bibdoc WHERE id_bibrec=%s AND status<>'DELETED' ORDER BY docname ASC""", (self.id,)) for row in res: cur_doc = BibDoc(docid=row[0], recid=self.id, doctype=row[1], human_readable=self.human_readable) self.bibdocs.append(cur_doc) def list_bibdocs(self, doctype=''): """ Returns the list all bibdocs object belonging to a recid. If C{doctype} is set, it returns just the bibdocs of that doctype. @param doctype: the optional doctype. @type doctype: string @return: the list of bibdocs. @rtype: list of BibDoc """ if not doctype: return self.bibdocs else: return [bibdoc for bibdoc in self.bibdocs if doctype == bibdoc.doctype] def get_bibdoc_names(self, doctype=''): """ Returns all the names of the documents associated with the bibdoc. If C{doctype} is set, restrict the result to all the matching doctype. @param doctype: the optional doctype. @type doctype: string @return: the list of document names. @rtype: list of string """ return [bibdoc.docname for bibdoc in self.list_bibdocs(doctype)] def propose_unique_docname(self, docname): """ Given C{docname}, return a new docname that is not already attached to the record. @param docname: the reference docname. 
@type docname: string @return: a docname not already attached. @rtype: string """ docname = normalize_docname(docname) goodname = docname i = 1 while goodname in self.get_bibdoc_names(): i += 1 goodname = "%s_%s" % (docname, i) return goodname def merge_bibdocs(self, docname1, docname2): """ This method merge C{docname2} into C{docname1}. 1. Given all the formats of the latest version of the files attached to C{docname2}, these files are added as new formats into C{docname1}. 2. C{docname2} is marked as deleted. @raise InvenioWebSubmitFileError: if at least one format in C{docname2} already exists in C{docname1}. (In this case the two bibdocs are preserved) @note: comments and descriptions are also copied. @note: if C{docname2} has a I{restriction}(i.e. if the I{status} is set) and C{docname1} doesn't, the restriction is imported. """ bibdoc1 = self.get_bibdoc(docname1) bibdoc2 = self.get_bibdoc(docname2) ## Check for possibility for bibdocfile in bibdoc2.list_latest_files(): format = bibdocfile.get_format() if bibdoc1.format_already_exists_p(format): raise InvenioWebSubmitFileError('Format %s already exists in bibdoc %s of record %s. It\'s impossible to merge bibdoc %s into it.' % (format, docname1, self.id, docname2)) ## Importing restriction if needed. restriction1 = bibdoc1.get_status() restriction2 = bibdoc2.get_status() if restriction2 and not restriction1: bibdoc1.set_status(restriction2) ## Importing formats for bibdocfile in bibdoc2.list_latest_files(): format = bibdocfile.get_format() comment = bibdocfile.get_comment() description = bibdocfile.get_description() bibdoc1.add_file_new_format(bibdocfile.get_full_path(), description=description, comment=comment, format=format) ## Finally deleting old bibdoc2 bibdoc2.delete() self.build_bibdoc_list() def get_docid(self, docname): """ @param docname: the document name. @type docname: string @return: the identifier corresponding to the given C{docname}. 
@rtype: integer @raise InvenioWebSubmitFileError: if the C{docname} does not corresponds to a document attached to this record. """ for bibdoc in self.bibdocs: if bibdoc.docname == docname: return bibdoc.id raise InvenioWebSubmitFileError, "Recid '%s' is not connected with a " \ "docname '%s'" % (self.id, docname) def get_docname(self, docid): """ @param docid: the document identifier. @type docid: integer @return: the name of the document corresponding to the given document identifier. @rtype: string @raise InvenioWebSubmitFileError: if the C{docid} does not corresponds to a document attached to this record. """ for bibdoc in self.bibdocs: if bibdoc.id == docid: return bibdoc.docname raise InvenioWebSubmitFileError, "Recid '%s' is not connected with a " \ "docid '%s'" % (self.id, docid) def has_docname_p(self, docname): """ @param docname: the document name, @type docname: string @return: True if a document with the given name is attached to this record. @rtype: bool """ for bibdoc in self.bibdocs: if bibdoc.docname == docname: return True return False def get_bibdoc(self, docname): """ @return: the bibdoc with a particular docname associated with this recid""" for bibdoc in self.bibdocs: if bibdoc.docname == docname: return bibdoc raise InvenioWebSubmitFileError, "Recid '%s' is not connected with " \ " docname '%s'" % (self.id, docname) def delete_bibdoc(self, docname): """ Deletes the document with the specified I{docname}. @param docname: the document name. @type docname: string """ for bibdoc in self.bibdocs: if bibdoc.docname == docname: bibdoc.delete() self.build_bibdoc_list() def add_bibdoc(self, doctype="Main", docname='file', never_fail=False): """ Add a new empty document object (a I{bibdoc}) to the list of documents of this record. @param doctype: the document type. @type doctype: string @param docname: the document name. 
@type docname: string @param never_fail: if True, this procedure will not fail, even if a document with the given name is already attached to this record. In this case a new name will be generated (see L{propose_unique_docname}). @type never_fail: bool @return: the newly created document object. @rtype: BibDoc @raise InvenioWebSubmitFileError: in case of any error. """ try: docname = normalize_docname(docname) if never_fail: docname = self.propose_unique_docname(docname) if docname in self.get_bibdoc_names(): raise InvenioWebSubmitFileError, "%s has already a bibdoc with docname %s" % (self.id, docname) else: bibdoc = BibDoc(recid=self.id, doctype=doctype, docname=docname, human_readable=self.human_readable) self.build_bibdoc_list() return bibdoc except Exception, e: register_exception() raise InvenioWebSubmitFileError(str(e)) def add_new_file(self, fullpath, doctype="Main", docname=None, never_fail=False, description=None, comment=None, format=None, flags=None): """ Directly add a new file to this record. Adds a new file with the following policy: - If the C{docname} is not set, it is derived from the name of the file. - If a bibdoc with the given docname doesn't already exist, it is created and the file is added to it. - If it exists but doesn't contain the format that is being added, the new format is added. - If the format already exists and C{never_fail} is True, a new bibdoc is created with a similar name but with a progressive number as a suffix and the file is added to it (see L{propose_unique_docname}). @param fullpath: the filesystem path of the document to be added. @type fullpath: string @param doctype: the type of the document. @type doctype: string @param docname: the document name. @type docname: string @param never_fail: if True, this procedure will not fail, even if a document with the given name is already attached to this record. In this case a new name will be generated (see L{propose_unique_docname}).
@type never_fail: bool @param description: an optional description of the file. @type description: string @param comment: an optional comment to the file. @type comment: string @param format: the extension of the file. If not specified it will be guessed (see L{guess_format_from_url}). @type format: string @param flags: a set of flags to be associated with the file (see L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}) @type flags: list of string @return: the elaborated document object. @rtype: BibDoc @raise InvenioWebSubmitFileError: in case of error. """ if docname is None: docname = decompose_file(fullpath)[1] if format is None: format = decompose_file(fullpath)[2] docname = normalize_docname(docname) try: bibdoc = self.get_bibdoc(docname) except InvenioWebSubmitFileError: # bibdoc doesn't already exists! bibdoc = self.add_bibdoc(doctype, docname, False) bibdoc.add_file_new_version(fullpath, description=description, comment=comment, format=format, flags=flags) self.build_bibdoc_list() else: try: bibdoc.add_file_new_format(fullpath, description=description, comment=comment, format=format, flags=flags) self.build_bibdoc_list() except InvenioWebSubmitFileError, e: # Format already exist! if never_fail: bibdoc = self.add_bibdoc(doctype, docname, True) bibdoc.add_file_new_version(fullpath, description=description, comment=comment, format=format, flags=flags) self.build_bibdoc_list() else: raise return bibdoc def add_new_version(self, fullpath, docname=None, description=None, comment=None, format=None, flags=None): """ Adds a new file to an already existent document object as a new version. @param fullpath: the filesystem path of the file to be added. @type fullpath: string @param docname: the document name. If not specified it will be extracted from C{fullpath} (see L{decompose_file}). @type docname: string @param description: an optional description for the file. @type description: string @param comment: an optional comment to the file. 
@type comment: string @param format: the extension of the file. If not specified it will be guessed (see L{guess_format_from_url}). @type format: string @param flags: a set of flags to be associated with the file (see L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}) @type flags: list of string @return: the elaborated document object. @rtype: BibDoc @raise InvenioWebSubmitFileError: in case of error. @note: previous files associated with the same document will be considered obsolete. """ if docname is None: docname = decompose_file(fullpath)[1] if format is None: format = decompose_file(fullpath)[2] if flags is None: flags = [] if 'pdfa' in get_subformat_from_format(format).split(';') and not 'PDF/A' in flags: flags.append('PDF/A') bibdoc = self.get_bibdoc(docname=docname) bibdoc.add_file_new_version(fullpath, description=description, comment=comment, format=format, flags=flags) self.build_bibdoc_list() return bibdoc def add_new_format(self, fullpath, docname=None, description=None, comment=None, format=None, flags=None): """ Adds a new file to an already existent document object as a new format. @param fullpath: the filesystem path of the file to be added. @type fullpath: string @param docname: the document name. If not specified it will be extracted from C{fullpath} (see L{decompose_file}). @type docname: string @param description: an optional description for the file. @type description: string @param comment: an optional comment to the file. @type comment: string @param format: the extension of the file. If not specified it will be guessed (see L{guess_format_from_url}). @type format: string @param flags: a set of flags to be associated with the file (see L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}) @type flags: list of string @return: the elaborated document object. @rtype: BibDoc @raise InvenioWebSubmitFileError: in case the same format already exists. 
""" if docname is None: docname = decompose_file(fullpath)[1] if format is None: format = decompose_file(fullpath)[2] if flags is None: flags = [] if 'pdfa' in get_subformat_from_format(format).split(';') and not 'PDF/A' in flags: flags.append('PDF/A') bibdoc = self.get_bibdoc(docname=docname) bibdoc.add_file_new_format(fullpath, description=description, comment=comment, format=format, flags=flags) self.build_bibdoc_list() return bibdoc def list_latest_files(self, doctype='', list_hidden=True): """ Returns a list of the latest files. @param doctype: if set, only document of the given type will be listed. @type doctype: string @param list_hidden: if True, will list also files with the C{HIDDEN} flag being set. @type list_hidden: bool @return: the list of latest files. @rtype: list of BibDocFile """ docfiles = [] for bibdoc in self.list_bibdocs(doctype): docfiles += bibdoc.list_latest_files(list_hidden=list_hidden) return docfiles def display(self, docname="", version="", doctype="", ln=CFG_SITE_LANG, verbose=0, display_hidden=True): """ Returns an HTML representation of the the attached documents. @param docname: if set, include only the requested document. @type docname: string @param version: if not set, only the last version will be displayed. If 'all', all versions will be displayed. @type version: string (integer or 'all') @param doctype: is set, include only documents of the requested type. @type doctype: string @param ln: the language code. @type ln: string @param verbose: if greater than 0, includes debug information. @type verbose: integer @param display_hidden: whether to include hidden files as well. @type display_hidden: bool @return: the formatted representation. 
@rtype: HTML string """ t = "" if docname: try: bibdocs = [self.get_bibdoc(docname)] except InvenioWebSubmitFileError: bibdocs = self.list_bibdocs(doctype) else: bibdocs = self.list_bibdocs(doctype) if bibdocs: types = list_types_from_array(bibdocs) fulltypes = [] for mytype in types: if mytype in ('Plot', 'PlotMisc'): # FIXME: quick hack to ignore plot-like doctypes # on Files tab continue fulltype = { 'name' : mytype, 'content' : [], } for bibdoc in bibdocs: if mytype == bibdoc.get_type(): fulltype['content'].append(bibdoc.display(version, ln=ln, display_hidden=display_hidden)) fulltypes.append(fulltype) if verbose >= 9: verbose_files = str(self) else: verbose_files = '' t = websubmit_templates.tmpl_bibrecdoc_filelist( ln=ln, types = fulltypes, verbose_files=verbose_files ) return t def fix(self, docname): """ Algorithm that transforms a broken/old bibdoc into a coherent one. Think of it as being the fsck of BibDocs. - All the files in the bibdoc directory will be renamed according to the document name. Proper .recid, .type, .md5 files will be created/updated. - In case of more than one file with the same format and version, a new bibdoc will be created to hold those files. @param docname: the name of the document that needs to be fixed. @type docname: string @return: the list of newly created bibdocs if any. @rtype: list of BibDoc @raise InvenioWebSubmitFileError: in case of issues that cannot be fixed automatically. """ bibdoc = self.get_bibdoc(docname) versions = {} res = [] new_bibdocs = [] # List of files with the same version/format of # existing file which need new bibdoc. counter = 0 zero_version_bug = False if os.path.exists(bibdoc.basedir): for filename in os.listdir(bibdoc.basedir): if filename[0] != '.' and ';' in filename: name, version = filename.split(';') try: version = int(version) except ValueError: # Strange name register_exception() raise InvenioWebSubmitFileError, "A file called %s exists under %s. This is not a valid name.
After the ';' there must be an integer representing the file version. Please, manually fix this file either by renaming or by deleting it." % (filename, bibdoc.basedir) if version == 0: zero_version_bug = True format = name[len(file_strip_ext(name)):] format = normalize_format(format) if not versions.has_key(version): versions[version] = {} new_name = 'FIXING-%s-%s' % (str(counter), name) try: shutil.move('%s/%s' % (bibdoc.basedir, filename), '%s/%s' % (bibdoc.basedir, new_name)) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Error in renaming '%s' to '%s': '%s'" % ('%s/%s' % (bibdoc.basedir, filename), '%s/%s' % (bibdoc.basedir, new_name), e) if versions[version].has_key(format): new_bibdocs.append((new_name, version)) else: versions[version][format] = new_name counter += 1 elif filename[0] != '.': # Strange name register_exception() raise InvenioWebSubmitFileError, "A file called %s exists under %s. This is not a valid name. There should be a ';' followed by an integer representing the file version. Please, manually fix this file either by renaming or by deleting it." 
% (filename, bibdoc.basedir) else: # we create the corresponding storage directory old_umask = os.umask(022) os.makedirs(bibdoc.basedir) # and save the father record id if it exists try: if self.id != "": recid_fd = open("%s/.recid" % bibdoc.basedir, "w") recid_fd.write(str(self.id)) recid_fd.close() if bibdoc.doctype != "": type_fd = open("%s/.type" % bibdoc.basedir, "w") type_fd.write(str(bibdoc.doctype)) type_fd.close() except Exception, e: register_exception() raise InvenioWebSubmitFileError, e os.umask(old_umask) if not versions: bibdoc.delete() else: for version, formats in versions.iteritems(): if zero_version_bug: version += 1 for format, filename in formats.iteritems(): destination = '%s%s;%i' % (docname, format, version) try: shutil.move('%s/%s' % (bibdoc.basedir, filename), '%s/%s' % (bibdoc.basedir, destination)) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Error in renaming '%s' to '%s': '%s'" % ('%s/%s' % (bibdoc.basedir, filename), '%s/%s' % (bibdoc.basedir, destination), e) try: recid_fd = open("%s/.recid" % bibdoc.basedir, "w") recid_fd.write(str(self.id)) recid_fd.close() type_fd = open("%s/.type" % bibdoc.basedir, "w") type_fd.write(str(bibdoc.doctype)) type_fd.close() except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Error in creating .recid and .type file for '%s' folder: '%s'" % (bibdoc.basedir, e) self.build_bibdoc_list() res = [] for (filename, version) in new_bibdocs: if zero_version_bug: version += 1 new_bibdoc = self.add_bibdoc(doctype=bibdoc.doctype, docname=docname, never_fail=True) new_bibdoc.add_file_new_format('%s/%s' % (bibdoc.basedir, filename), version) res.append(new_bibdoc) try: os.remove('%s/%s' % (bibdoc.basedir, filename)) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Error in removing '%s': '%s'" % ('%s/%s' % (bibdoc.basedir, filename), e) Md5Folder(bibdoc.basedir).update(only_new=False) bibdoc._build_file_list() self.build_bibdoc_list() 
for bibdoc in self.bibdocs: if not run_sql('SELECT more_info FROM bibdoc WHERE id=%s', (bibdoc.id,)): ## Import from MARC only if the bibdoc has never had ## its more_info initialized. try: bibdoc.import_descriptions_and_comments_from_marc() except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Error in importing description and comment from %s for record %s: %s" % (repr(bibdoc), self.id, e) return res def check_format(self, docname): """ Check for any format related issue. In case L{CFG_WEBSUBMIT_ADDITIONAL_KNOWN_FILE_EXTENSIONS} is altered or the Python version changes, it might happen that a docname contains files which are no longer named docname + .format ; version, simply because the .format is now recognized (and it was not before, so it was contained in the docname). This algorithm verifies whether a fix is necessary (see L{fix_format}). @param docname: the document name whose formats should be verified. @type docname: string @return: True if format is correct. False if a fix is needed. @rtype: bool @raise InvenioWebSubmitFileError: in case of any error. """ bibdoc = self.get_bibdoc(docname) correct_docname = decompose_file(docname + '.pdf')[1] if docname != correct_docname: return False for filename in os.listdir(bibdoc.basedir): if not filename.startswith('.'): try: dummy, dummy, format, version = decompose_file_with_version(filename) except Exception: raise InvenioWebSubmitFileError('Incorrect filename "%s" for docname %s for recid %i' % (filename, docname, self.id)) if '%s%s;%i' % (correct_docname, format, version) != filename: return False return True def check_duplicate_docnames(self): """ Check whether the record is connected with at least two documents with the same name. @return: True if everything is fine.
@rtype: bool """ docnames = set() for docname in self.get_bibdoc_names(): if docname in docnames: return False else: docnames.add(docname) return True def uniformize_bibdoc(self, docname): """ This algorithm corrects wrong file names belonging to a bibdoc. @param docname: the document name whose formats should be verified. @type docname: string """ bibdoc = self.get_bibdoc(docname) for filename in os.listdir(bibdoc.basedir): if not filename.startswith('.'): try: dummy, dummy, format, version = decompose_file_with_version(filename) except ValueError: register_exception(alert_admin=True, prefix= "Strange file '%s' is stored in %s" % (filename, bibdoc.basedir)) else: os.rename(os.path.join(bibdoc.basedir, filename), os.path.join(bibdoc.basedir, '%s%s;%i' % (docname, format, version))) Md5Folder(bibdoc.basedir).update() bibdoc.touch() bibdoc._build_file_list('rename') def fix_format(self, docname, skip_check=False): """ Fixes format related inconsistencies. @param docname: the document name whose formats should be verified. @type docname: string @param skip_check: if True assume L{check_format} has already been called and the need for fix has already been found. If False, will implicitly call L{check_format} and skip fixing if no error is found. @type skip_check: bool @return: False in case merging two bibdocs is needed but is not possible, True otherwise.
@rtype: bool """ if not skip_check: if self.check_format(docname): return True bibdoc = self.get_bibdoc(docname) correct_docname = decompose_file(docname + '.pdf')[1] need_merge = False if correct_docname != docname: need_merge = self.has_docname_p(correct_docname) if need_merge: proposed_docname = self.propose_unique_docname(correct_docname) run_sql('UPDATE bibdoc SET docname=%s WHERE id=%s', (proposed_docname, bibdoc.id)) self.build_bibdoc_list() self.uniformize_bibdoc(proposed_docname) try: self.merge_bibdocs(docname, proposed_docname) except InvenioWebSubmitFileError: return False else: run_sql('UPDATE bibdoc SET docname=%s WHERE id=%s', (correct_docname, bibdoc.id)) self.build_bibdoc_list() self.uniformize_bibdoc(correct_docname) else: self.uniformize_bibdoc(docname) return True def fix_duplicate_docnames(self, skip_check=False): """ Algorithm to fix duplicate docnames. If a record is connected with at least two bibdocs having the same docname, the algorithm will try to merge them. @param skip_check: if True assume L{check_duplicate_docnames} has already been called and the need for fix has already been found. If False, will implicitly call L{check_duplicate_docnames} and skip fixing if no error is found. @type skip_check: bool """ if not skip_check: if self.check_duplicate_docnames(): return docnames = set() for bibdoc in self.list_bibdocs(): docname = bibdoc.docname if docname in docnames: new_docname = self.propose_unique_docname(bibdoc.docname) bibdoc.change_name(new_docname) self.merge_bibdocs(docname, new_docname) docnames.add(docname) def check_file_exists(self, path): """ Check if a file with the same content as the file pointed to by C{path} is already attached to this record. @param path: the file to be checked against. @type path: string @return: True if a file with the requested content is already attached to the record.
@rtype: bool """ # Let's consider all the latest files for bibdoc in self.list_bibdocs(): if bibdoc.check_file_exists(path): return True return False class BibDoc: """ This class represents one document (i.e. a set of files with different formats and with versioning information that consitutes a piece of information. To instanciate a new document, the recid and the docname are mandatory. To instanciate an already existing document, either the recid and docname or the docid alone are sufficient to retrieve it. @param docid: the document identifier. @type docid: integer @param recid: the record identifier of the record to which this document belongs to. If the C{docid} is specified the C{recid} is automatically retrieven from the database. @type recid: integer @param docname: the document name. @type docname: string @param doctype: the document type (used when instanciating a new document). @type doctype: string @param human_readable: whether sizes should be represented in a human readable format. @type human_readable: bool @raise InvenioWebSubmitFileError: in case of error. """ def __init__ (self, docid=None, recid=None, docname=None, doctype='Main', human_readable=False): """Constructor of a bibdoc. At least the docid or the recid/docname pair is needed.""" # docid is known, the document already exists if docname: docname = normalize_docname(docname) self.docfiles = [] self.md5s = None self.human_readable = human_readable if docid: if not recid: res = run_sql("SELECT id_bibrec,type FROM bibrec_bibdoc WHERE id_bibdoc=%s LIMIT 1", (docid,), 1) if res: recid = res[0][0] doctype = res[0][1] else: warn("Docid %s is orphan" % docid) else: res = run_sql("SELECT type FROM bibrec_bibdoc WHERE id_bibrec=%s AND id_bibdoc=%s LIMIT 1", (recid, docid,), 1) if res: doctype = res[0][0] else: #this bibdoc isn't associated with the corresponding bibrec. 
raise InvenioWebSubmitFileError, "Docid %s is not associated with the recid %s" % (docid, recid) # gather the other information res = run_sql("SELECT id,status,docname,creation_date,modification_date,text_extraction_date,more_info FROM bibdoc WHERE id=%s LIMIT 1", (docid,), 1) if res: self.cd = res[0][3] self.md = res[0][4] self.td = res[0][5] self.recid = recid self.docname = res[0][2] self.id = docid self.status = res[0][1] self.more_info = BibDocMoreInfo(docid, blob_to_string(res[0][6])) self.basedir = _make_base_dir(self.id) self.doctype = doctype else: # this bibdoc doesn't exist raise InvenioWebSubmitFileError, "The docid %s does not exist." % docid # else it is a new document else: if not docname: raise InvenioWebSubmitFileError, "You should specify the docname when creating a new bibdoc" else: self.recid = recid self.doctype = doctype self.docname = docname self.status = '' if recid: res = run_sql("SELECT b.id FROM bibrec_bibdoc bb JOIN bibdoc b on bb.id_bibdoc=b.id WHERE bb.id_bibrec=%s AND b.docname=%s LIMIT 1", (recid, docname), 1) if res: raise InvenioWebSubmitFileError("A bibdoc called %s already exists for recid %s" % (docname, recid)) self.id = run_sql("INSERT INTO bibdoc (status,docname,creation_date,modification_date) " "values(%s,%s,NOW(),NOW())", (self.status, docname)) if self.id: # we link the document to the record if a recid was # specified self.more_info = BibDocMoreInfo(self.id) res = run_sql("SELECT creation_date, modification_date, text_extraction_date FROM bibdoc WHERE id=%s", (self.id,)) self.cd = res[0][0] self.md = res[0][1] self.td = res[0][2] else: raise InvenioWebSubmitFileError, "New docid cannot be created" try: self.basedir = _make_base_dir(self.id) # we create the corresponding storage directory if not os.path.exists(self.basedir): old_umask = os.umask(022) os.makedirs(self.basedir) # and save the father record id if it exists try: if self.recid: recid_fd = open("%s/.recid" % self.basedir, "w") recid_fd.write(str(self.recid)) 
recid_fd.close() if self.doctype: type_fd = open("%s/.type" % self.basedir, "w") type_fd.write(str(self.doctype)) type_fd.close() except Exception, e: register_exception(alert_admin=True) raise InvenioWebSubmitFileError, e os.umask(old_umask) if self.recid: run_sql("INSERT INTO bibrec_bibdoc (id_bibrec, id_bibdoc, type) VALUES (%s,%s,%s)", (recid, self.id, self.doctype,)) except Exception, e: run_sql('DELETE FROM bibdoc WHERE id=%s', (self.id, )) run_sql('DELETE FROM bibrec_bibdoc WHERE id_bibdoc=%s', (self.id, )) register_exception(alert_admin=True) raise InvenioWebSubmitFileError, e # build list of attached files self._build_file_list('init') def __repr__(self): """ @return: the canonical string representation of the C{BibDoc}. @rtype: string """ return 'BibDoc(%s, %s, %s, %s, %s)' % (repr(self.id), repr(self.recid), repr(self.docname), repr(self.doctype), repr(self.human_readable)) def __str__(self): """ @return: an easy to be I{grepped} string representation of the whole C{BibDoc} content. 
@rtype: string """ out = '%s:%i:::docname=%s\n' % (self.recid or '', self.id, self.docname) out += '%s:%i:::doctype=%s\n' % (self.recid or '', self.id, self.doctype) out += '%s:%i:::status=%s\n' % (self.recid or '', self.id, self.status) out += '%s:%i:::basedir=%s\n' % (self.recid or '', self.id, self.basedir) out += '%s:%i:::creation date=%s\n' % (self.recid or '', self.id, self.cd) out += '%s:%i:::modification date=%s\n' % (self.recid or '', self.id, self.md) out += '%s:%i:::text extraction date=%s\n' % (self.recid or '', self.id, self.td) out += '%s:%i:::total file attached=%s\n' % (self.recid or '', self.id, len(self.docfiles)) if self.human_readable: out += '%s:%i:::total size latest version=%s\n' % (self.recid or '', self.id, nice_size(self.get_total_size_latest_version())) out += '%s:%i:::total size all files=%s\n' % (self.recid or '', self.id, nice_size(self.get_total_size())) else: out += '%s:%i:::total size latest version=%s\n' % (self.recid or '', self.id, self.get_total_size_latest_version()) out += '%s:%i:::total size all files=%s\n' % (self.recid or '', self.id, self.get_total_size()) for docfile in self.docfiles: out += str(docfile) return out def format_already_exists_p(self, format): """ @param format: a format to be checked. @type format: string @return: True if a file of the given format already exists among the latest files. @rtype: bool """ format = normalize_format(format) for afile in self.list_latest_files(): if format == afile.get_format(): return True return False def get_status(self): """ @return: the status information. @rtype: string """ return self.status def get_text(self, version=None): """ @param version: the requested version. If not set, the latest version will be used. @type version: integer @return: the textual content corresponding to the specified version of the document. 
@rtype: string """ if version is None: version = self.get_latest_version() if self.has_text(version): return open(os.path.join(self.basedir, '.text;%i' % version)).read() else: return "" def get_text_path(self, version=None): """ @param version: the requested version. If not set, the latest version will be used. @type version: int @return: the full path to the textual content corresponding to the specified version of the document. @rtype: string """ if version is None: version = self.get_latest_version() if self.has_text(version): return os.path.join(self.basedir, '.text;%i' % version) else: return "" def extract_text(self, version=None, perform_ocr=False, ln='en'): """ Try what is necessary to extract the textual information of a document. @param version: the version of the document for which text is required. If not specified the text will be retrieved from the last version. @type version: integer @param perform_ocr: whether to perform OCR. @type perform_ocr: bool @param ln: a two letter language code to give as a hint to the OCR procedure. @type ln: string @raise InvenioWebSubmitFileError: in case of error. @note: the text is extracted and cached for later use. Use L{get_text} to retrieve it. """ from invenio.websubmit_file_converter import get_best_format_to_extract_text_from, convert_file, InvenioWebSubmitFileConverterError if version is None: version = self.get_latest_version() docfiles = self.list_version_files(version) ## We try to extract text only from original or OCRed documents. 
filenames = [docfile.get_full_path() for docfile in docfiles if 'CONVERTED' not in docfile.flags or 'OCRED' in docfile.flags] try: filename = get_best_format_to_extract_text_from(filenames) except InvenioWebSubmitFileConverterError: ## We fall back on considering all the documents filenames = [docfile.get_full_path() for docfile in docfiles] try: filename = get_best_format_to_extract_text_from(filenames) except InvenioWebSubmitFileConverterError: open(os.path.join(self.basedir, '.text;%i' % version), 'w').write('') return try: convert_file(filename, os.path.join(self.basedir, '.text;%i' % version), '.txt', perform_ocr=perform_ocr, ln=ln) if version == self.get_latest_version(): run_sql("UPDATE bibdoc SET text_extraction_date=NOW() WHERE id=%s", (self.id, )) except InvenioWebSubmitFileConverterError, e: register_exception(alert_admin=True, prefix="Error in extracting text from bibdoc %i, version %i" % (self.id, version)) raise InvenioWebSubmitFileError, str(e) def touch(self): """ Update the modification time of the bibdoc (as in the UNIX command C{touch}). """ run_sql('UPDATE bibdoc SET modification_date=NOW() WHERE id=%s', (self.id, )) #if self.recid: #run_sql('UPDATE bibrec SET modification_date=NOW() WHERE id=%s', (self.recid, )) def set_status(self, new_status): """ Set a new status. A document with a status information is a restricted document that can be accessed only to user which as an authorization to the I{viewrestrdoc} WebAccess action with keyword status with value C{new_status}. @param new_status: the new status. If empty the document will be unrestricted. @type new_status: string @raise InvenioWebSubmitFileError: in case the reserved word 'DELETED' is used. 
""" if new_status != KEEP_OLD_VALUE: if new_status == 'DELETED': raise InvenioWebSubmitFileError('DELETED is a reserved word and can not be used for setting the status') run_sql('UPDATE bibdoc SET status=%s WHERE id=%s', (new_status, self.id)) self.status = new_status self.touch() self._build_file_list() def add_file_new_version(self, filename, description=None, comment=None, format=None, flags=None): """ Add a new version of a file. If no physical file is already attached to the document, the given file will have version 1. Otherwise the new file will have the current version number plus one. @param filename: the local path of the file. @type filename: string @param description: an optional description for the file. @type description: string @param comment: an optional comment to the file. @type comment: string @param format: the extension of the file. If not specified it will be retrieved from the filename (see L{decompose_file}). @type format: string @param flags: a set of flags to be associated with the file (see L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}) @type flags: list of string @raise InvenioWebSubmitFileError: in case of error.
""" try: latestVersion = self.get_latest_version() if latestVersion == 0: myversion = 1 else: myversion = latestVersion + 1 if os.path.exists(filename): if not os.path.getsize(filename) > 0: raise InvenioWebSubmitFileError, "%s seems to be empty" % filename if format is None: format = decompose_file(filename)[2] else: format = normalize_format(format) destination = "%s/%s%s;%i" % (self.basedir, self.docname, format, myversion) + if run_sql("SELECT id_bibdoc FROM bibdocfsinfo WHERE id_bibdoc=%s AND version=%s AND format=%s", (self.id, myversion, format)): + raise InvenioWebSubmitFileError("According to the database a file of format %s is already attached to the docid %s" % (format, self.id)) try: shutil.copyfile(filename, destination) os.chmod(destination, 0644) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while copying '%s' to '%s': '%s'" % (filename, destination, e) self.more_info.set_description(description, format, myversion) self.more_info.set_comment(comment, format, myversion) if flags is None: flags = [] if 'pdfa' in get_subformat_from_format(format).split(';') and not 'PDF/A' in flags: flags.append('PDF/A') for flag in flags: if flag == 'PERFORM_HIDE_PREVIOUS': for afile in self.list_all_files(): format = afile.get_format() version = afile.get_version() if version < myversion: self.more_info.set_flag('HIDDEN', format, myversion) else: self.more_info.set_flag(flag, format, myversion) else: raise InvenioWebSubmitFileError, "'%s' does not exists!" 
% filename finally: self.touch() Md5Folder(self.basedir).update() self._build_file_list() + just_added_file = self.get_file(format, myversion) + run_sql("INSERT INTO bibdocfsinfo(id_bibdoc, version, format, last_version, cd, md, checksum, filesize, mime) VALUES(%s, %s, %s, true, %s, %s, %s, %s, %s)", (self.id, myversion, format, just_added_file.cd, just_added_file.md, just_added_file.get_checksum(), just_added_file.get_size(), just_added_file.mime)) + run_sql("UPDATE bibdocfsinfo SET last_version=false WHERE id_bibdoc=%s AND version<%s", (self.id, myversion)) def add_file_new_format(self, filename, version=None, description=None, comment=None, format=None, flags=None): """ Add a file as a new format. @param filename: the local path of the file. @type filename: string @param version: an optional specific version to which the new format should be added. If None, the last version will be used. @type version: integer @param description: an optional description for the file. @type description: string @param comment: an optional comment to the file. @type comment: string @param format: the extension of the file. If not specified it will be retrieved from the filename (see L{decompose_file}). @type format: string @param flags: a set of flags to be associated with the file (see L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}) @type flags: list of string @raise InvenioWebSubmitFileError: if the given format already exists. 
""" try: if version is None: version = self.get_latest_version() if version == 0: version = 1 if os.path.exists(filename): if not os.path.getsize(filename) > 0: raise InvenioWebSubmitFileError, "%s seems to be empty" % filename if format is None: format = decompose_file(filename)[2] else: format = normalize_format(format) + if run_sql("SELECT id_bibdoc FROM bibdocfsinfo WHERE id_bibdoc=%s AND version=%s AND format=%s", (self.id, version, format)): + raise InvenioWebSubmitFileError("According to the database a file of format %s is already attached to the docid %s" % (format, self.id)) destination = "%s/%s%s;%i" % (self.basedir, self.docname, format, version) if os.path.exists(destination): raise InvenioWebSubmitFileError, "A file for docname '%s' for the recid '%s' already exists for the format '%s'" % (self.docname, self.recid, format) try: shutil.copyfile(filename, destination) os.chmod(destination, 0644) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while copying '%s' to '%s': '%s'" % (filename, destination, e) self.more_info.set_comment(comment, format, version) self.more_info.set_description(description, format, version) if flags is None: flags = [] if 'pdfa' in get_subformat_from_format(format).split(';') and not 'PDF/A' in flags: flags.append('PDF/A') for flag in flags: if flag != 'PERFORM_HIDE_PREVIOUS': self.more_info.set_flag(flag, format, version) else: raise InvenioWebSubmitFileError, "'%s' does not exists!" 
% filename finally: Md5Folder(self.basedir).update() self.touch() self._build_file_list() + just_added_file = self.get_file(format, version) + run_sql("INSERT INTO bibdocfsinfo(id_bibdoc, version, format, last_version, cd, md, checksum, filesize, mime) VALUES(%s, %s, %s, true, %s, %s, %s, %s, %s)", (self.id, version, format, just_added_file.cd, just_added_file.md, just_added_file.get_checksum(), just_added_file.get_size(), just_added_file.mime)) def purge(self): """ Physically removes all the previous version of the given bibdoc. Everything but the last formats will be erased. """ version = self.get_latest_version() if version > 1: for afile in self.docfiles: if afile.get_version() < version: self.more_info.unset_comment(afile.get_format(), afile.get_version()) self.more_info.unset_description(afile.get_format(), afile.get_version()) for flag in CFG_BIBDOCFILE_AVAILABLE_FLAGS: self.more_info.unset_flag(flag, afile.get_format(), afile.get_version()) try: os.remove(afile.get_full_path()) except Exception, e: register_exception() Md5Folder(self.basedir).update() self.touch() self._build_file_list() + run_sql("DELETE FROM bibdocfsinfo WHERE id_bibdoc=%s AND version<%s", (self.id, version)) def expunge(self): """ Physically remove all the traces of a given document. @note: an expunged BibDoc object shouldn't be used anymore or the result might be unpredicted. 
""" del self.md5s del self.more_info os.system('rm -rf %s' % escape_shell_arg(self.basedir)) run_sql('DELETE FROM bibrec_bibdoc WHERE id_bibdoc=%s', (self.id, )) run_sql('DELETE FROM bibdoc_bibdoc WHERE id_bibdoc1=%s OR id_bibdoc2=%s', (self.id, self.id)) run_sql('DELETE FROM bibdoc WHERE id=%s', (self.id, )) run_sql('INSERT DELAYED INTO hstDOCUMENT(action, id_bibdoc, docname, doctimestamp) VALUES("EXPUNGE", %s, %s, NOW())', (self.id, self.docname)) + run_sql('DELETE FROM bibdocfsinfo WHERE id_bibdoc=%s', (self.id, )) del self.docfiles del self.id del self.cd del self.md del self.td del self.basedir del self.recid del self.doctype del self.docname def revert(self, version): """ Revert the document to a given version. All the formats corresponding to that version are copied forward to a new version. @param version: the version to revert to. @type version: integer @raise InvenioWebSubmitFileError: in case of errors """ - try: - version = int(version) - new_version = self.get_latest_version() + 1 - for docfile in self.list_version_files(version): - destination = "%s/%s%s;%i" % (self.basedir, self.docname, docfile.get_format(), new_version) - if os.path.exists(destination): - raise InvenioWebSubmitFileError, "A file for docname '%s' for the recid '%s' already exists for the format '%s'" % (self.docname, self.recid, docfile.get_format()) - try: - shutil.copyfile(docfile.get_full_path(), destination) - os.chmod(destination, 0644) - self.more_info.set_comment(self.more_info.get_comment(docfile.get_format(), version), docfile.get_format(), new_version) - self.more_info.set_description(self.more_info.get_description(docfile.get_format(), version), docfile.get_format(), new_version) - except Exception, e: - register_exception() - raise InvenioWebSubmitFileError, "Encountered an exception while copying '%s' to '%s': '%s'" % (docfile.get_full_path(), destination, e) - finally: - Md5Folder(self.basedir).update() - self.touch() - self._build_file_list() + version = int(version) 
+ docfiles = self.list_version_files(version) + if docfiles: + self.add_file_new_version(docfiles[0].get_full_path(), description=docfiles[0].get_description(), comment=docfiles[0].get_comment(), format=docfiles[0].get_format(), flags=docfiles[0].flags) + for docfile in docfiles[1:]: + self.add_file_new_format(docfile.filename, description=docfile.get_description(), comment=docfile.get_comment(), format=docfile.get_format(), flags=docfile.flags) def import_descriptions_and_comments_from_marc(self, record=None): """ Import descriptions and comments from the corresponding MARC metadata. @param record: the record (if None it will be calculated). @type record: bibrecord recstruct @note: If record is passed it is directly used, otherwise it is retrieved from the MARCXML stored in the database. """ ## Let's get the record from invenio.search_engine import get_record if record is None: record = get_record(self.id) fields = record_get_field_instances(record, '856', '4', ' ') global_comment = None global_description = None local_comment = {} local_description = {} for field in fields: url = field_get_subfield_values(field, 'u') if url: ## Given a url url = url[0] if url == '%s/%s/%s/files/' % (CFG_SITE_URL, CFG_SITE_RECORD, self.recid): ## If it is a traditional /CFG_SITE_RECORD/1/files/ one ## We have global description/comment for all the formats description = field_get_subfield_values(field, 'y') if description: global_description = description[0] comment = field_get_subfield_values(field, 'z') if comment: global_comment = comment[0] elif bibdocfile_url_p(url): ## Otherwise we have description/comment per format dummy, docname, format = decompose_bibdocfile_url(url) if docname == self.docname: description = field_get_subfield_values(field, 'y') if description: local_description[format] = description[0] comment = field_get_subfield_values(field, 'z') if comment: local_comment[format] = comment[0] ## Let's update the tables version = self.get_latest_version() for docfile 
in self.list_latest_files(): format = docfile.get_format() if format in local_comment: self.set_comment(local_comment[format], format, version) else: self.set_comment(global_comment, format, version) if format in local_description: self.set_description(local_description[format], format, version) else: self.set_description(global_description, format, version) self._build_file_list('init') def get_icon(self, subformat_re=CFG_WEBSUBMIT_ICON_SUBFORMAT_RE, display_hidden=True): """ @param subformat_re: by default the convention is that L{CFG_WEBSUBMIT_ICON_SUBFORMAT_RE} is used as a subformat indicator to mean that a particular format is to be used as an icon. Specifiy a different subformat if you need to use a different convention. @type subformat_re: compiled regular expression @return: the bibdocfile corresponding to the icon of this document, or None if any icon exists for this document. @rtype: BibDocFile @warning: before I{subformat} were introduced this method was returning a BibDoc, while now is returning a BibDocFile. Check if your client code is compatible with this. """ for docfile in self.list_latest_files(list_hidden=display_hidden): if subformat_re.match(docfile.get_subformat()): return docfile return None def add_icon(self, filename, format=None, subformat=CFG_WEBSUBMIT_DEFAULT_ICON_SUBFORMAT): """ Attaches icon to this document. @param filename: the local filesystem path to the icon. @type filename: string @param format: an optional format for the icon. If not specified it will be calculated after the filesystem path. @type format: string @param subformat: by default the convention is that CFG_WEBSUBMIT_DEFAULT_ICON_SUBFORMAT is used as a subformat indicator to mean that a particular format is to be used as an icon. Specifiy a different subformat if you need to use a different convention. @type subformat: string @raise InvenioWebSubmitFileError: in case of errors. 
""" #first check if an icon already exists if not format: format = decompose_file(filename)[2] if subformat: format += ";%s" % subformat self.add_file_new_format(filename, format=format) def delete_icon(self, subformat_re=CFG_WEBSUBMIT_ICON_SUBFORMAT_RE): """ @param subformat_re: by default the convention is that L{CFG_WEBSUBMIT_ICON_SUBFORMAT_RE} is used as a subformat indicator to mean that a particular format is to be used as an icon. Specify a different subformat if you need to use a different convention. @type subformat_re: compiled regular expression Removes the icon attached to the document if it exists. """ for docfile in self.list_latest_files(): if subformat_re.match(docfile.get_subformat()): self.delete_file(docfile.get_format(), docfile.get_version()) def display(self, version="", ln=CFG_SITE_LANG, display_hidden=True): """ Returns an HTML representation of this document. @param version: if not set, only the last version will be displayed. If 'all', all versions will be displayed. @type version: string (integer or 'all') @param ln: the language code. @type ln: string @param display_hidden: whether to include hidden files as well. @type display_hidden: bool @return: the formatted representation.
@rtype: HTML string """ t = "" if version == "all": docfiles = self.list_all_files(list_hidden=display_hidden) elif version != "": version = int(version) docfiles = self.list_version_files(version, list_hidden=display_hidden) else: docfiles = self.list_latest_files(list_hidden=display_hidden) icon = self.get_icon(display_hidden=display_hidden) if icon: imageurl = icon.get_url() else: imageurl = "%s/img/smallfiles.gif" % CFG_SITE_URL versions = [] for version in list_versions_from_array(docfiles): currversion = { 'version' : version, 'previous' : 0, 'content' : [] } if version == self.get_latest_version() and version != 1: currversion['previous'] = 1 for docfile in docfiles: if docfile.get_version() == version: currversion['content'].append(docfile.display(ln = ln)) versions.append(currversion) if versions: return websubmit_templates.tmpl_bibdoc_filelist( ln = ln, versions = versions, imageurl = imageurl, docname = self.docname, recid = self.recid, status = self.status ) else: return "" def change_name(self, newname): """ Renames this document name. @param newname: the new name. @type newname: string @raise InvenioWebSubmitFileError: if the new name corresponds to a document already attached to the record owning this document. 
""" try: newname = normalize_docname(newname) res = run_sql("SELECT b.id FROM bibrec_bibdoc bb JOIN bibdoc b on bb.id_bibdoc=b.id WHERE bb.id_bibrec=%s AND b.docname=%s", (self.recid, newname)) if res: raise InvenioWebSubmitFileError, "A bibdoc called %s already exists for recid %s" % (newname, self.recid) try: for f in os.listdir(self.basedir): if not f.startswith('.'): try: (dummy, base, extension, version) = decompose_file_with_version(f) except ValueError: register_exception(alert_admin=True, prefix="Strange file '%s' is stored in %s" % (f, self.basedir)) else: shutil.move(os.path.join(self.basedir, f), os.path.join(self.basedir, '%s%s;%i' % (newname, extension, version))) except Exception, e: register_exception() raise InvenioWebSubmitFileError("Error in renaming the bibdoc %s to %s for recid %s: %s" % (self.docname, newname, self.recid, e)) run_sql("update bibdoc set docname=%s where id=%s", (newname, self.id,)) self.docname = newname finally: Md5Folder(self.basedir).update() self.touch() self._build_file_list('rename') def set_comment(self, comment, format, version=None): """ Updates the comment of a specific format/version of the document. @param comment: the new comment. @type comment: string @param format: the specific format for which the comment should be updated. @type format: string @param version: the specific version for which the comment should be updated. If not specified the last version will be used. @type version: integer """ if version is None: version = self.get_latest_version() format = normalize_format(format) self.more_info.set_comment(comment, format, version) self.touch() self._build_file_list('init') def set_description(self, description, format, version=None): """ Updates the description of a specific format/version of the document. @param description: the new description. @type description: string @param format: the specific format for which the description should be updated. 
@type format: string @param version: the specific version for which the description should be updated. If not specified the last version will be used. @type version: integer """ if version is None: version = self.get_latest_version() format = normalize_format(format) self.more_info.set_description(description, format, version) self.touch() self._build_file_list('init') def set_flag(self, flagname, format, version=None): """ Sets a flag for a specific format/version of the document. @param flagname: a flag from L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}. @type flagname: string @param format: the specific format for which the flag should be set. @type format: string @param version: the specific version for which the flag should be set. If not specified the last version will be used. @type version: integer """ if version is None: version = self.get_latest_version() format = normalize_format(format) self.more_info.set_flag(flagname, format, version) self.touch() self._build_file_list('init') def has_flag(self, flagname, format, version=None): """ Checks if a particular flag for a format/version is set. @param flagname: a flag from L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}. @type flagname: string @param format: the specific format for which the flag should be set. @type format: string @param version: the specific version for which the flag should be set. If not specified the last version will be used. @type version: integer @return: True if the flag is set. @rtype: bool """ if version is None: version = self.get_latest_version() format = normalize_format(format) return self.more_info.has_flag(flagname, format, version) def unset_flag(self, flagname, format, version=None): """ Unsets a flag for a specific format/version of the document. @param flagname: a flag from L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}. @type flagname: string @param format: the specific format for which the flag should be unset. @type format: string @param version: the specific version for which the flag should be unset. 
If not specified the last version will be used. @type version: integer """ if version is None: version = self.get_latest_version() format = normalize_format(format) self.more_info.unset_flag(flagname, format, version) self.touch() self._build_file_list('init') def get_comment(self, format, version=None): """ Retrieve the comment of a specific format/version of the document. @param format: the specific format for which the comment should be retrieved. @type format: string @param version: the specific version for which the comment should be retrieved. If not specified the last version will be used. @type version: integer @return: the comment. @rtype: string """ if version is None: version = self.get_latest_version() format = normalize_format(format) return self.more_info.get_comment(format, version) def get_description(self, format, version=None): """ Retrieve the description of a specific format/version of the document. @param format: the specific format for which the description should be retrieved. @type format: string @param version: the specific version for which the description should be retrieved. If not specified the last version will be used. @type version: integer @return: the description. @rtype: string """ if version is None: version = self.get_latest_version() format = normalize_format(format) return self.more_info.get_description(format, version) def hidden_p(self, format, version=None): """ Returns True if the file specified by the given format/version is hidden. @param format: the specific format for which the description should be retrieved. @type format: string @param version: the specific version for which the description should be retrieved. If not specified the last version will be used. @type version: integer @return: True if hidden. @rtype: bool """ if version is None: version = self.get_latest_version() return self.more_info.has_flag('HIDDEN', format, version) def get_docname(self): """ @return: the name of this document. 
@rtype: string """ return self.docname def get_base_dir(self): """ @return: the base directory on the local filesystem for this document (e.g. C{/soft/cdsweb/var/data/files/g0/123}) @rtype: string """ return self.basedir def get_type(self): """ @return: the type of this document. @rtype: string""" return self.doctype def get_recid(self): """ @return: the record id of the record to which this document is attached. @rtype: integer """ return self.recid def get_id(self): """ @return: the id of this document. @rtype: integer """ return self.id def pdf_a_p(self): """ @return: True if this document contains a PDF in PDF/A format. @rtype: bool""" return self.has_flag('PDF/A', 'pdf') def has_text(self, require_up_to_date=False, version=None): """ Return True if the text of this document has already been extracted. @param require_up_to_date: if True check the text was actually extracted after the most recent format of the given version. @type require_up_to_date: bool @param version: a version for which the text should have been extracted. If not specified the latest version is considered. @type version: integer @return: True if the text has already been extracted. @rtype: bool """ if version is None: version = self.get_latest_version() if os.path.exists(os.path.join(self.basedir, '.text;%i' % version)): if not require_up_to_date: return True else: docfiles = self.list_version_files(version) text_md = datetime.fromtimestamp(os.path.getmtime(os.path.join(self.basedir, '.text;%i' % version))) for docfile in docfiles: if text_md <= docfile.md: return False return True return False def get_file(self, format, version=""): """ Returns a L{BibDocFile} instance of this document corresponding to the specific format and version. @param format: the specific format. @type format: string @param version: the specific version for which the description should be retrieved. If not specified the last version will be used. @type version: integer @return: the L{BibDocFile} instance. 
@rtype: BibDocFile """ if version == "": docfiles = self.list_latest_files() else: version = int(version) docfiles = self.list_version_files(version) format = normalize_format(format) for docfile in docfiles: if (docfile.get_format()==format or not format): return docfile ## Let's skip the subformat specification and consider just the ## superformat superformat = get_superformat_from_format(format) for docfile in docfiles: if get_superformat_from_format(docfile.get_format()) == superformat: return docfile raise InvenioWebSubmitFileError, "No file called '%s' of format '%s', version '%s'" % (self.docname, format, version) def list_versions(self): """ @return: the list of existing version numbers for this document. @rtype: list of integer """ versions = [] for docfile in self.docfiles: if not docfile.get_version() in versions: versions.append(docfile.get_version()) versions.sort() return versions def delete(self): """ Delete this document. @see: L{undelete} for how to undelete the document. @raise InvenioWebSubmitFileError: in case of errors. """ try: today = datetime.today() self.change_name('DELETED-%s%s-%s' % (today.strftime('%Y%m%d%H%M%S'), today.microsecond, self.docname)) run_sql("UPDATE bibdoc SET status='DELETED' WHERE id=%s", (self.id,)) self.status = 'DELETED' except Exception, e: register_exception() raise InvenioWebSubmitFileError, "It's impossible to delete bibdoc %s: %s" % (self.id, e) def deleted_p(self): """ @return: True if this document has been deleted. @rtype: bool """ return self.status == 'DELETED' def empty_p(self): """ @return: True if this document is empty, i.e. it has no bibdocfile connected. @rtype: bool """ return len(self.docfiles) == 0 def undelete(self, previous_status=''): """ Undelete a deleted file (only if it was actually deleted via L{delete}). The previous C{status}, i.e. the restriction key can be provided. Otherwise the undeleted document will be public. @param previous_status: the previous status the should be restored. 
@type previous_status: string @raise InvenioWebSubmitFileError: in case of any error. """ bibrecdocs = BibRecDocs(self.recid) try: run_sql("UPDATE bibdoc SET status=%s WHERE id=%s AND status='DELETED'", (previous_status, self.id)) except Exception, e: raise InvenioWebSubmitFileError, "It's impossible to undelete bibdoc %s: %s" % (self.id, e) if self.docname.startswith('DELETED-'): try: # Let's remove DELETED-20080214144322- in front of the docname original_name = '-'.join(self.docname.split('-')[2:]) original_name = bibrecdocs.propose_unique_docname(original_name) self.change_name(original_name) except Exception, e: raise InvenioWebSubmitFileError, "It's impossible to restore the previous docname %s. %s kept as docname because: %s" % (original_name, self.docname, e) else: raise InvenioWebSubmitFileError, "Strange just undeleted docname isn't called DELETED-somedate-docname but %s" % self.docname def delete_file(self, format, version): """ Delete a specific format/version of this document on the filesystem. @param format: the particular format to be deleted. @type format: string @param version: the particular version to be deleted. @type version: integer @note: this operation is not reversible!""" try: afile = self.get_file(format, version) except InvenioWebSubmitFileError: return try: os.remove(afile.get_full_path()) + run_sql("DELETE FROM bibdocfsinfo WHERE id_bibdoc=%s AND version=%s AND format=%s", (self.id, afile.get_version(), afile.get_format())) except OSError: pass self.touch() self._build_file_list() def get_history(self): """ @return: a human readable and parsable string that represents the history of this document. 
@rtype: string """ ret = [] hst = run_sql("""SELECT action, docname, docformat, docversion, docsize, docchecksum, doctimestamp FROM hstDOCUMENT WHERE id_bibdoc=%s ORDER BY doctimestamp ASC""", (self.id, )) for row in hst: ret.append("%s %s '%s', format: '%s', version: %i, size: %s, checksum: '%s'" % (row[6].strftime('%Y-%m-%d %H:%M:%S'), row[0], row[1], row[2], row[3], nice_size(row[4]), row[5])) return ret def _build_file_list(self, context=''): """ Lists all files attached to the bibdoc. This function should be called every time the bibdoc is modified. As a side effect it logs everything that has happened to the bibdocfiles in the log facility, according to the context: "init": means that the function has been called for the first time by a constructor, hence no logging is performed; "": by default means to log every deleted file as deleted and every added file as added; "rename": means that every apparently deleted file is logged as RENAMEDFROM and every new file as RENAMEDTO. """ def log_action(action, docid, docname, format, version, size, checksum, timestamp=''): """Log an action into the hstDOCUMENT table.""" try: if timestamp: run_sql('INSERT DELAYED INTO hstDOCUMENT(action, id_bibdoc, docname, docformat, docversion, docsize, docchecksum, doctimestamp) VALUES(%s, %s, %s, %s, %s, %s, %s, %s)', (action, docid, docname, format, version, size, checksum, timestamp)) else: run_sql('INSERT DELAYED INTO hstDOCUMENT(action, id_bibdoc, docname, docformat, docversion, docsize, docchecksum, doctimestamp) VALUES(%s, %s, %s, %s, %s, %s, %s, NOW())', (action, docid, docname, format, version, size, checksum)) except DatabaseError: register_exception() def make_removed_added_bibdocfiles(previous_file_list): """Internal function for building the log of changed files.""" # Let's rebuild the previous situation old_files = {} for bibdocfile in previous_file_list: old_files[(bibdocfile.name, bibdocfile.format, bibdocfile.version)] = (bibdocfile.size, bibdocfile.checksum, bibdocfile.md) # 
Let's rebuild the new situation new_files = {} for bibdocfile in self.docfiles: new_files[(bibdocfile.name, bibdocfile.format, bibdocfile.version)] = (bibdocfile.size, bibdocfile.checksum, bibdocfile.md) # Let's subtract from added file all the files that are present in # the old list, and let's add to deleted files that are not present # added file. added_files = dict(new_files) deleted_files = {} for key, value in old_files.iteritems(): if added_files.has_key(key): del added_files[key] else: deleted_files[key] = value return (added_files, deleted_files) - if context != 'init': + if context not in ('init', 'init_from_disk'): previous_file_list = list(self.docfiles) res = run_sql("SELECT status,docname,creation_date," "modification_date,more_info FROM bibdoc WHERE id=%s", (self.id,)) self.cd = res[0][2] self.md = res[0][3] self.docname = res[0][1] self.status = res[0][0] self.more_info = BibDocMoreInfo(self.id, blob_to_string(res[0][4])) self.docfiles = [] - if os.path.exists(self.basedir): - self.md5s = Md5Folder(self.basedir) - files = os.listdir(self.basedir) - files.sort() - for afile in files: - if not afile.startswith('.'): - try: - filepath = os.path.join(self.basedir, afile) - dirname, basename, format, fileversion = decompose_file_with_version(filepath) - checksum = self.md5s.get_checksum(afile) - # we can append file: - self.docfiles.append(BibDocFile(filepath, self.doctype, - fileversion, basename, format, - self.recid, self.id, self.status, checksum, - self.more_info, human_readable=self.human_readable)) - except Exception, e: - register_exception() - if context == 'init': + if CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE and context == 'init': + ## In normal init context we read from DB + res = run_sql("SELECT version, format, cd, md, checksum, filesize FROM bibdocfsinfo WHERE id_bibdoc=%s", (self.id, )) + for version, format, cd, md, checksum, size in res: + self.docfiles.append(BibDocFile( + os.path.join(self.basedir, self.docname + format + ";%s" % 
version), self.doctype, + version, self.docname, format, self.recid, self.id, self.status, checksum, + self.more_info, human_readable=self.human_readable, cd=cd, md=md, size=size)) + else: + if os.path.exists(self.basedir): + self.md5s = Md5Folder(self.basedir) + files = os.listdir(self.basedir) + files.sort() + for afile in files: + if not afile.startswith('.'): + try: + filepath = os.path.join(self.basedir, afile) + dirname, basename, format, fileversion = decompose_file_with_version(filepath) + checksum = self.md5s.get_checksum(afile) + # we can append file: + self.docfiles.append(BibDocFile(filepath, self.doctype, + fileversion, basename, format, + self.recid, self.id, self.status, checksum, + self.more_info, human_readable=self.human_readable)) + except Exception, e: + register_exception() + if context in ('init', 'init_from_disk'): return else: added_files, deleted_files = make_removed_added_bibdocfiles(previous_file_list) deletedstr = "DELETED" addedstr = "ADDED" if context == 'rename': deletedstr = "RENAMEDFROM" addedstr = "RENAMEDTO" for (docname, format, version), (size, checksum, md) in added_files.iteritems(): if context == 'rename': md = '' # No modification time log_action(addedstr, self.id, docname, format, version, size, checksum, md) for (docname, format, version), (size, checksum, md) in deleted_files.iteritems(): if context == 'rename': md = '' # No modification time log_action(deletedstr, self.id, docname, format, version, size, checksum, md) + def _sync_to_db(self): + """ + Update the content of the bibdocfile table by taking what is available on the filesystem. 
+ """ + self._build_file_list('init_from_disk') + run_sql("DELETE FROM bibdocfsinfo WHERE id_bibdoc=%s", (self.id,)) + for afile in self.docfiles: + run_sql("INSERT INTO bibdocfsinfo(id_bibdoc, version, format, last_version, cd, md, checksum, filesize, mime) VALUES(%s, %s, %s, false, %s, %s, %s, %s, %s)", (self.id, afile.get_version(), afile.get_format(), afile.cd, afile.md, afile.get_checksum(), afile.get_size(), afile.mime)) + run_sql("UPDATE bibdocfsinfo SET last_version=true WHERE id_bibdoc=%s AND version=%s", (self.id, self.get_latest_version())) + def get_total_size_latest_version(self): """Return the total size used on disk of all the files belonging to this bibdoc and corresponding to the latest version.""" ret = 0 for bibdocfile in self.list_latest_files(): ret += bibdocfile.get_size() return ret def get_total_size(self): """Return the total size used on disk of all the files belonging to this bibdoc.""" ret = 0 for bibdocfile in self.list_all_files(): ret += bibdocfile.get_size() return ret def list_all_files(self, list_hidden=True): """Returns all the docfiles linked with the given bibdoc.""" if list_hidden: return self.docfiles else: return [afile for afile in self.docfiles if not afile.hidden_p()] def list_latest_files(self, list_hidden=True): """Returns all the docfiles within the last version.""" return self.list_version_files(self.get_latest_version(), list_hidden=list_hidden) def list_version_files(self, version, list_hidden=True): """Return all the docfiles of a particular version.""" version = int(version) return [docfile for docfile in self.docfiles if docfile.get_version() == version and (list_hidden or not docfile.hidden_p())] def check_file_exists(self, path): """ Check if a file with the same content of the file pointed in C{path} is already attached to this record. @param path: the file to be checked against. @type path: string @return: True if a file with the requested content is already attached to the record. 
@rtype: bool """ # Let's consider all the latest files for afile in self.list_latest_files(): if afile.is_identical_to(path): return True return False def get_latest_version(self): """ Returns the latest existing version number for the given bibdoc. If no file is associated to this bibdoc, returns '0'. """ version = 0 for bibdocfile in self.docfiles: if bibdocfile.get_version() > version: version = bibdocfile.get_version() return version def get_file_number(self): """Return the total number of files.""" return len(self.docfiles) def register_download(self, ip_address, version, format, userid=0): """Register the information about a download of a particular file.""" format = normalize_format(format) if format[:1] == '.': format = format[1:] format = format.upper() return run_sql("INSERT DELAYED INTO rnkDOWNLOADS " "(id_bibrec,id_bibdoc,file_version,file_format," "id_user,client_host,download_time) VALUES " "(%s,%s,%s,%s,%s,INET_ATON(%s),NOW())", (self.recid, self.id, version, format, userid, ip_address,)) def generic_path2bidocfile(fullpath): """ Returns a BibDocFile objects that wraps the given fullpath. @note: the object will contain the minimum information that can be guessed from the fullpath (e.g. docname, format, subformat, version, md5, creation_date, modification_date). It won't contain for example a comment, a description, a doctype, a restriction. """ fullpath = os.path.abspath(fullpath) try: path, name, format, version = decompose_file_with_version(fullpath) except ValueError: ## There is no version version = 0 path, name, format = decompose_file(fullpath) md5folder = Md5Folder(path) checksum = md5folder.get_checksum(os.path.basename(fullpath)) return BibDocFile(fullpath=fullpath, doctype=None, version=version, name=name, format=format, recid=0, docid=0, status=None, checksum=checksum, more_info=None) class BibDocFile: """This class represents a physical file in the Invenio filesystem. 
It should never be instantiated directly""" - def __init__(self, fullpath, doctype, version, name, format, recid, docid, status, checksum, more_info=None, human_readable=False): + def __init__(self, fullpath, doctype, version, name, format, recid, docid, status, checksum, more_info=None, human_readable=False, cd=None, md=None, size=None): self.fullpath = os.path.abspath(fullpath) self.doctype = doctype self.docid = docid self.recid = recid self.version = version self.status = status self.checksum = checksum self.human_readable = human_readable if more_info: self.description = more_info.get_description(format, version) self.comment = more_info.get_comment(format, version) self.flags = more_info.get_flags(format, version) else: self.description = None self.comment = None self.flags = [] self.format = normalize_format(format) self.superformat = get_superformat_from_format(self.format) self.subformat = get_subformat_from_format(self.format) - if format == "": + self.fullname = name + if format: + self.fullname += self.superformat + self.mime, self.encoding = _mimes.guess_type(self.fullname) + if self.mime is None: self.mime = "application/octet-stream" - self.encoding = "" - self.fullname = name - else: - self.fullname = "%s%s" % (name, self.superformat) - (self.mime, self.encoding) = _mimes.guess_type(self.fullname) - if self.mime is None: - self.mime = "application/octet-stream" self.more_info = more_info self.hidden = 'HIDDEN' in self.flags - self.size = os.path.getsize(fullpath) - self.md = datetime.fromtimestamp(os.path.getmtime(fullpath)) + self.size = size or os.path.getsize(fullpath) + self.md = md or datetime.fromtimestamp(os.path.getmtime(fullpath)) try: - self.cd = datetime.fromtimestamp(os.path.getctime(fullpath)) + self.cd = cd or datetime.fromtimestamp(os.path.getctime(fullpath)) except OSError: self.cd = self.md self.name = name self.dir = os.path.dirname(fullpath) if self.subformat: self.url = create_url('%s/%s/%s/files/%s%s' % (CFG_SITE_URL, 
CFG_SITE_RECORD, self.recid, self.name, self.superformat), {'subformat' : self.subformat}) self.fullurl = create_url('%s/%s/%s/files/%s%s' % (CFG_SITE_URL, CFG_SITE_RECORD, self.recid, self.name, self.superformat), {'subformat' : self.subformat, 'version' : self.version}) else: self.url = create_url('%s/%s/%s/files/%s%s' % (CFG_SITE_URL, CFG_SITE_RECORD, self.recid, self.name, self.superformat), {}) self.fullurl = create_url('%s/%s/%s/files/%s%s' % (CFG_SITE_URL, CFG_SITE_RECORD, self.recid, self.name, self.superformat), {'version' : self.version}) self.etag = '"%i%s%i"' % (self.docid, self.format, self.version) self.magic = None def __repr__(self): return ('BibDocFile(%s, %s, %i, %s, %s, %i, %i, %s, %s, %s, %s)' % (repr(self.fullpath), repr(self.doctype), self.version, repr(self.name), repr(self.format), self.recid, self.docid, repr(self.status), repr(self.checksum), repr(self.more_info), repr(self.human_readable))) def __str__(self): out = '%s:%s:%s:%s:fullpath=%s\n' % (self.recid, self.docid, self.version, self.format, self.fullpath) out += '%s:%s:%s:%s:fullname=%s\n' % (self.recid, self.docid, self.version, self.format, self.fullname) out += '%s:%s:%s:%s:name=%s\n' % (self.recid, self.docid, self.version, self.format, self.name) out += '%s:%s:%s:%s:subformat=%s\n' % (self.recid, self.docid, self.version, self.format, get_subformat_from_format(self.format)) out += '%s:%s:%s:%s:status=%s\n' % (self.recid, self.docid, self.version, self.format, self.status) out += '%s:%s:%s:%s:checksum=%s\n' % (self.recid, self.docid, self.version, self.format, self.checksum) if self.human_readable: out += '%s:%s:%s:%s:size=%s\n' % (self.recid, self.docid, self.version, self.format, nice_size(self.size)) else: out += '%s:%s:%s:%s:size=%s\n' % (self.recid, self.docid, self.version, self.format, self.size) out += '%s:%s:%s:%s:creation time=%s\n' % (self.recid, self.docid, self.version, self.format, self.cd) out += '%s:%s:%s:%s:modification time=%s\n' % (self.recid, self.docid, 
self.version, self.format, self.md) out += '%s:%s:%s:%s:magic=%s\n' % (self.recid, self.docid, self.version, self.format, self.get_magic()) out += '%s:%s:%s:%s:mime=%s\n' % (self.recid, self.docid, self.version, self.format, self.mime) out += '%s:%s:%s:%s:encoding=%s\n' % (self.recid, self.docid, self.version, self.format, self.encoding) out += '%s:%s:%s:%s:url=%s\n' % (self.recid, self.docid, self.version, self.format, self.url) out += '%s:%s:%s:%s:fullurl=%s\n' % (self.recid, self.docid, self.version, self.format, self.fullurl) out += '%s:%s:%s:%s:description=%s\n' % (self.recid, self.docid, self.version, self.format, self.description) out += '%s:%s:%s:%s:comment=%s\n' % (self.recid, self.docid, self.version, self.format, self.comment) out += '%s:%s:%s:%s:hidden=%s\n' % (self.recid, self.docid, self.version, self.format, self.hidden) out += '%s:%s:%s:%s:flags=%s\n' % (self.recid, self.docid, self.version, self.format, self.flags) out += '%s:%s:%s:%s:etag=%s\n' % (self.recid, self.docid, self.version, self.format, self.etag) return out def display(self, ln = CFG_SITE_LANG): """Returns a formatted representation of this docfile.""" return websubmit_templates.tmpl_bibdocfile_filelist( ln = ln, recid = self.recid, version = self.version, md = self.md, name = self.name, superformat = self.superformat, subformat = self.subformat, nice_size = nice_size(self.size), description = self.description or '' ) def is_identical_to(self, path): """ @path: the path of another file on disk. @return: True if L{path} is contains bitwise the same content. """ if os.path.getsize(path) != self.size: return False if calculate_md5(path) != self.checksum: return False return filecmp.cmp(self.get_full_path(), path) def is_restricted(self, user_info): """Returns restriction state. 
(see acc_authorize_action return values)""" if self.status not in ('', 'DELETED'): return check_bibdoc_authorization(user_info, status=self.status) elif self.status == 'DELETED': return (1, 'File has been deleted') else: return (0, '') def is_icon(self, subformat_re=CFG_WEBSUBMIT_ICON_SUBFORMAT_RE): """ @param subformat_re: by default the convention is that L{CFG_WEBSUBMIT_ICON_SUBFORMAT_RE} is used as a subformat indicator to mean that a particular format is to be used as an icon. Specify a different subformat if you need to use a different convention. @type subformat_re: compiled regular expression @return: True if this file is an icon. @rtype: bool """ return bool(subformat_re.match(self.subformat)) def hidden_p(self): return self.hidden def get_url(self): return self.url def get_type(self): return self.doctype def get_path(self): return self.fullpath def get_bibdocid(self): return self.docid def get_name(self): return self.name def get_full_name(self): return self.fullname def get_full_path(self): return self.fullpath def get_format(self): return self.format def get_subformat(self): return self.subformat def get_superformat(self): return self.superformat def get_size(self): return self.size def get_version(self): return self.version def get_checksum(self): return self.checksum def get_description(self): return self.description def get_comment(self): return self.comment def get_content(self): """Returns the binary content of the file.""" content_fd = open(self.fullpath, 'rb') content = content_fd.read() content_fd.close() return content def get_recid(self): """Returns the recid connected with the bibdoc of this file.""" return self.recid def get_status(self): """Returns the status of the file, i.e. 
either '', 'DELETED' or a restriction keyword.""" return self.status def get_magic(self): """Return all the possible guesses from the magic library about the content of the file.""" if self.magic is None and CFG_HAS_MAGIC: magic_cookies = _get_magic_cookies() magic_result = [] for key in magic_cookies.keys(): magic_result.append(magic_cookies[key].file(self.fullpath)) self.magic = tuple(magic_result) return self.magic def check(self): """Return True if the checksum corresponds to the file.""" return calculate_md5(self.fullpath) == self.checksum def stream(self, req, download=False): """Stream the file. Note that no restriction check is being done here, since restrictions have been checked previously inside websubmit_webinterface.py.""" if os.path.exists(self.fullpath): if random.random() < CFG_BIBDOCFILE_MD5_CHECK_PROBABILITY and calculate_md5(self.fullpath) != self.checksum: raise InvenioWebSubmitFileError, "File %s, version %i, for record %s is corrupted!" % (self.fullname, self.version, self.recid) stream_file(req, self.fullpath, "%s%s" % (self.name, self.superformat), self.mime, self.encoding, self.etag, self.checksum, self.fullurl, download=download) raise apache.SERVER_RETURN, apache.DONE else: req.status = apache.HTTP_NOT_FOUND raise InvenioWebSubmitFileError, "%s does not exist!" % self.fullpath _RE_STATUS_PARSER = re.compile(r'^(?P<type>email|group|egroup|role|firerole|status):\s*(?P<value>.*)$', re.S + re.I) def check_bibdoc_authorization(user_info, status): """ Check if the user is authorized to access a document protected with the given status. 
L{status} is a string of the form:: auth_type: auth_value where C{auth_type} can have values in:: email, group, role, firerole, status and C{auth_value} has a value interpreted against C{auth_type}: - C{email}: the user can access the document if his/her email matches C{auth_value} - C{group}: the user can access the document if one of the groups (local or external) of which he/she is member matches C{auth_value} - C{role}: the user can access the document if he/she belongs to the WebAccess role specified in C{auth_value} - C{firerole}: the user can access the document if he/she is implicitly matched by the role described by the firewall-like role definition in C{auth_value} - C{status}: the user can access the document if he/she is authorized for the action C{viewrestrdoc} with C{status} parameter having value C{auth_value} @note: If no C{auth_type} is specified or if C{auth_type} is not one of the above, C{auth_value} will be set to the value contained in the parameter C{status}, and C{auth_type} will be considered to be C{status}. @param user_info: the user_info dictionary @type user_info: dict @param status: the status of the document. @type status: string @return: a tuple, of the form C{(auth_code, auth_message)} where auth_code is 0 if the authorization is granted and greater than 0 otherwise. @rtype: (int, string) @raise ValueError: in case of unexpected parsing error. 
""" def parse_status(status): g = _RE_STATUS_PARSER.match(status) if g: return (g.group('type').lower(), g.group('value')) else: return ('status', status) if acc_is_user_in_role(user_info, acc_get_role_id(SUPERADMINROLE)): return (0, CFG_WEBACCESS_WARNING_MSGS[0]) auth_type, auth_value = parse_status(status) if auth_type == 'status': return acc_authorize_action(user_info, 'viewrestrdoc', status=auth_value) elif auth_type == 'email': if not auth_value.lower().strip() == user_info['email'].lower().strip(): return (1, 'You must be member of the group %s in order to access this document' % repr(auth_value)) elif auth_type == 'group': if not auth_value in user_info['group']: return (1, 'You must be member of the group %s in order to access this document' % repr(auth_value)) elif auth_type == 'role': if not acc_is_user_in_role(user_info, acc_get_role_id(auth_value)): return (1, 'You must be member in the role %s in order to access this document' % repr(auth_value)) elif auth_type == 'firerole': if not acc_firerole_check_user(user_info, compile_role_definition(auth_value)): return (1, 'You must be authorized in order to access this document') else: raise ValueError, 'Unexpected authorization type %s for %s' % (repr(auth_type), repr(auth_value)) return (0, CFG_WEBACCESS_WARNING_MSGS[0]) _RE_BAD_MSIE = re.compile("MSIE\s+(\d+\.\d+)") def stream_file(req, fullpath, fullname=None, mime=None, encoding=None, etag=None, md5=None, location=None, download=False): """This is a generic function to stream a file to the user. If fullname, mime, encoding, and location are not provided they will be guessed based on req and fullpath. md5 should be passed as an hexadecimal string. 
""" def normal_streaming(size): req.set_content_length(size) req.send_http_header() if not req.header_only: req.sendfile(fullpath) return "" def single_range(size, the_range): req.set_content_length(the_range[1]) req.headers_out['Content-Range'] = 'bytes %d-%d/%d' % (the_range[0], the_range[0] + the_range[1] - 1, size) req.status = apache.HTTP_PARTIAL_CONTENT req.send_http_header() if not req.header_only: req.sendfile(fullpath, the_range[0], the_range[1]) return "" def multiple_ranges(size, ranges, mime): req.status = apache.HTTP_PARTIAL_CONTENT boundary = '%s%04d' % (time.strftime('THIS_STRING_SEPARATES_%Y%m%d%H%M%S'), random.randint(0, 9999)) req.content_type = 'multipart/byteranges; boundary=%s' % boundary content_length = 0 for arange in ranges: content_length += len('--%s\r\n' % boundary) content_length += len('Content-Type: %s\r\n' % mime) content_length += len('Content-Range: bytes %d-%d/%d\r\n' % (arange[0], arange[0] + arange[1] - 1, size)) content_length += len('\r\n') content_length += arange[1] content_length += len('\r\n') content_length += len('--%s--\r\n' % boundary) req.set_content_length(content_length) req.send_http_header() if not req.header_only: for arange in ranges: req.write('--%s\r\n' % boundary, 0) req.write('Content-Type: %s\r\n' % mime, 0) req.write('Content-Range: bytes %d-%d/%d\r\n' % (arange[0], arange[0] + arange[1] - 1, size), 0) req.write('\r\n', 0) req.sendfile(fullpath, arange[0], arange[1]) req.write('\r\n', 0) req.write('--%s--\r\n' % boundary) req.flush() return "" def parse_date(date): """According to a date can come in three formats (in order of preference): Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123 Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036 Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format Moreover IE is adding some trailing information after a ';'. Wrong dates should be simpled ignored. 
This function returns the time in seconds since the epoch GMT or None in case of errors.""" if not date: return None try: date = date.split(';')[0].strip() # Because of IE ## Sun, 06 Nov 1994 08:49:37 GMT return time.mktime(time.strptime(date, '%a, %d %b %Y %X %Z')) except: try: ## Sunday, 06-Nov-94 08:49:37 GMT return time.mktime(time.strptime(date, '%A, %d-%b-%y %H:%M:%S %Z')) except: try: ## Sun Nov 6 08:49:37 1994 return time.mktime(time.strptime(date)) except: return None def parse_ranges(ranges): """According to a (multiple) range request comes in the form: bytes=20-30,40-60,70-,-80 with the meaning: from byte 20 to 30 inclusive (11 bytes) from byte 40 to 60 inclusive (21 bytes) from byte 70 to (size - 1) inclusive (size - 70 bytes) from byte size - 80 to (size - 1) inclusive (80 bytes) This function will return the list of ranges in the form: [[first_byte, last_byte], ...] If first_byte or last_byte aren't specified they'll be set to None If the list is not well formatted it will return None """ try: if ranges.startswith('bytes') and '=' in ranges: ranges = ranges.split('=')[1].strip() else: return None ret = [] for arange in ranges.split(','): arange = arange.strip() if arange.startswith('-'): ret.append([None, int(arange[1:])]) elif arange.endswith('-'): ret.append([int(arange[:-1]), None]) else: ret.append(map(int, arange.split('-'))) return ret except: return None def parse_tags(tags): """Return a list of tags starting from a comma separated list.""" return [tag.strip() for tag in tags.split(',')] def fix_ranges(ranges, size): """Complementary to parse_ranges it will transform all the ranges into (first_byte, length), adjusting all the values based on the actual size provided. 
""" ret = [] for arange in ranges: if (arange[0] is None and arange[1] > 0) or arange[0] < size: if arange[0] is None: arange[0] = size - arange[1] elif arange[1] is None: arange[1] = size - arange[0] else: arange[1] = arange[1] - arange[0] + 1 arange[0] = max(0, arange[0]) arange[1] = min(size - arange[0], arange[1]) if arange[1] > 0: ret.append(arange) return ret def get_normalized_headers(headers): """Strip and lowerize all the keys of the headers dictionary plus strip, lowerize and transform known headers value into their value.""" ret = { 'if-match' : None, 'unless-modified-since' : None, 'if-modified-since' : None, 'range' : None, 'if-range' : None, 'if-none-match' : None, } for key, value in req.headers_in.iteritems(): key = key.strip().lower() value = value.strip() if key in ('unless-modified-since', 'if-modified-since'): value = parse_date(value) elif key == 'range': value = parse_ranges(value) elif key == 'if-range': value = parse_date(value) or parse_tags(value) elif key in ('if-match', 'if-none-match'): value = parse_tags(value) if value: ret[key] = value return ret headers = get_normalized_headers(req.headers_in) g = _RE_BAD_MSIE.search(headers.get('user-agent', "MSIE 6.0")) bad_msie = g and float(g.group(1)) < 9.0 if CFG_BIBDOCFILE_USE_XSENDFILE: ## If XSendFile is supported by the server, let's use it. 
if os.path.exists(fullpath): if fullname is None: fullname = os.path.basename(fullpath) if bad_msie: ## IE is confused by quotes req.headers_out["Content-Disposition"] = 'attachment; filename=%s' % fullname.replace('"', '\\"') elif download: req.headers_out["Content-Disposition"] = 'attachment; filename="%s"' % fullname.replace('"', '\\"') else: ## IE is confused by inline req.headers_out["Content-Disposition"] = 'inline; filename="%s"' % fullname.replace('"', '\\"') req.headers_out["X-Sendfile"] = fullpath if mime is None: format = decompose_file(fullpath)[2] (mime, encoding) = _mimes.guess_type(fullpath) if mime is None: mime = "application/octet-stream" if not bad_msie: ## IE is confused by not supported mimetypes req.content_type = mime return "" else: raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND if headers['if-match']: if etag is not None and etag not in headers['if-match']: raise apache.SERVER_RETURN, apache.HTTP_PRECONDITION_FAILED if os.path.exists(fullpath): mtime = os.path.getmtime(fullpath) if fullname is None: fullname = os.path.basename(fullpath) if mime is None: (mime, encoding) = _mimes.guess_type(fullpath) if mime is None: mime = "application/octet-stream" if location is None: location = req.uri if not bad_msie: ## IE is confused by not supported mimetypes req.content_type = mime req.encoding = encoding req.filename = fullname req.headers_out["Last-Modified"] = time.strftime('%a, %d %b %Y %X GMT', time.gmtime(mtime)) if CFG_ENABLE_HTTP_RANGE_REQUESTS: req.headers_out["Accept-Ranges"] = "bytes" else: req.headers_out["Accept-Ranges"] = "none" req.headers_out["Content-Location"] = location if etag is not None: req.headers_out["ETag"] = etag if md5 is not None: req.headers_out["Content-MD5"] = base64.encodestring(binascii.unhexlify(md5.upper()))[:-1] if bad_msie: ## IE is confused by quotes req.headers_out["Content-Disposition"] = 'attachment; filename=%s' % fullname.replace('"', '\\"') elif download: req.headers_out["Content-Disposition"] = 
'attachment; filename="%s"' % fullname.replace('"', '\\"') else: ## IE is confused by inline req.headers_out["Content-Disposition"] = 'inline; filename="%s"' % fullname.replace('"', '\\"') size = os.path.getsize(fullpath) if not size: try: raise Exception, '%s exists but is empty' % fullpath except Exception: register_exception(req=req, alert_admin=True) raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND if headers['if-modified-since'] and headers['if-modified-since'] >= mtime: raise apache.SERVER_RETURN, apache.HTTP_NOT_MODIFIED if headers['if-none-match']: if etag is not None and etag in headers['if-none-match']: raise apache.SERVER_RETURN, apache.HTTP_NOT_MODIFIED if headers['unless-modified-since'] and headers['unless-modified-since'] < mtime: return normal_streaming(size) if CFG_ENABLE_HTTP_RANGE_REQUESTS and headers['range']: try: if headers['if-range']: if etag is None or etag not in headers['if-range']: return normal_streaming(size) ranges = fix_ranges(headers['range'], size) except: return normal_streaming(size) if len(ranges) > 1: return multiple_ranges(size, ranges, mime) elif ranges: return single_range(size, ranges[0]) else: raise apache.SERVER_RETURN, apache.HTTP_RANGE_NOT_SATISFIABLE else: return normal_streaming(size) else: raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND def stream_restricted_icon(req): """Return the content of the "Restricted Icon" file.""" stream_file(req, '%s/img/restricted.gif' % CFG_WEBDIR) raise apache.SERVER_RETURN, apache.DONE def list_types_from_array(bibdocs): """Retrieves the list of types from the given bibdoc list.""" types = [] for bibdoc in bibdocs: if not bibdoc.get_type() in types: types.append(bibdoc.get_type()) types.sort() if 'Main' in types: ## Move 'Main' at the beginning types.remove('Main') types.insert(0, 'Main') return types def list_versions_from_array(docfiles): """Retrieve the list of existing versions from the given docfiles list.""" versions = [] for docfile in docfiles: if not docfile.get_version() 
in versions: versions.append(docfile.get_version()) versions.sort() versions.reverse() return versions def _make_base_dir(docid): """Given a docid it returns the complete path that should host its files.""" group = "g" + str(int(int(docid) / CFG_WEBSUBMIT_FILESYSTEM_BIBDOC_GROUP_LIMIT)) return os.path.join(CFG_WEBSUBMIT_FILEDIR, group, str(docid)) class Md5Folder: """Manage all the Md5 checksum about a folder""" def __init__(self, folder): """Initialize the class from the md5 checksum of a given path""" self.folder = folder try: self.load() except InvenioWebSubmitFileError: self.md5s = {} self.update() def update(self, only_new = True): """Update the .md5 file with the current files. If only_new is specified then only not already calculated file are calculated.""" if not only_new: self.md5s = {} if os.path.exists(self.folder): for filename in os.listdir(self.folder): if filename not in self.md5s and not filename.startswith('.'): self.md5s[filename] = calculate_md5(os.path.join(self.folder, filename)) self.store() def store(self): """Store the current md5 dictionary into .md5""" try: old_umask = os.umask(022) md5file = open(os.path.join(self.folder, ".md5"), "w") for key, value in self.md5s.items(): md5file.write('%s *%s\n' % (value, key)) md5file.close() os.umask(old_umask) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while storing .md5 for folder '%s': '%s'" % (self.folder, e) def load(self): """Load .md5 into the md5 dictionary""" self.md5s = {} try: md5file = open(os.path.join(self.folder, ".md5"), "r") for row in md5file: md5hash = row[:32] filename = row[34:].strip() self.md5s[filename] = md5hash md5file.close() except IOError: self.update() except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while loading .md5 for folder '%s': '%s'" % (self.folder, e) def check(self, filename = ''): """Check the specified file or all the files for which it exists a hash 
for being coherent with the stored hash.""" if filename and filename in self.md5s.keys(): try: return self.md5s[filename] == calculate_md5(os.path.join(self.folder, filename)) except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while loading '%s': '%s'" % (os.path.join(self.folder, filename), e) else: for filename, md5hash in self.md5s.items(): try: if calculate_md5(os.path.join(self.folder, filename)) != md5hash: return False except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while loading '%s': '%s'" % (os.path.join(self.folder, filename), e) return True def get_checksum(self, filename): """Return the checksum of a physical file.""" md5hash = self.md5s.get(filename, None) if md5hash is None: self.update() # Now it should not fail! md5hash = self.md5s[filename] return md5hash def calculate_md5_external(filename): """Calculate the md5 of a physical file through md5sum Command Line Tool. This is suitable for file larger than 256Kb.""" try: md5_result = os.popen(CFG_PATH_MD5SUM + ' -b %s' % escape_shell_arg(filename)) ret = md5_result.read()[:32] md5_result.close() if len(ret) != 32: # Error in running md5sum. Let's fallback to internal # algorithm. return calculate_md5(filename, force_internal=True) else: return ret except Exception, e: raise InvenioWebSubmitFileError, "Encountered an exception while calculating md5 for file '%s': '%s'" % (filename, e) def calculate_md5(filename, force_internal=False): """Calculate the md5 of a physical file. 
This is suitable for files smaller than 256Kb.""" if not CFG_PATH_MD5SUM or force_internal or os.path.getsize(filename) < CFG_BIBDOCFILE_MD5_THRESHOLD: try: to_be_read = open(filename, "rb") computed_md5 = md5() while True: buf = to_be_read.read(CFG_BIBDOCFILE_MD5_BUFFER) if buf: computed_md5.update(buf) else: break to_be_read.close() return computed_md5.hexdigest() except Exception, e: register_exception() raise InvenioWebSubmitFileError, "Encountered an exception while calculating md5 for file '%s': '%s'" % (filename, e) else: return calculate_md5_external(filename) def bibdocfile_url_to_bibrecdocs(url): """Given an URL in the form CFG_SITE_[SECURE_]URL/CFG_SITE_RECORD/xxx/files/... it returns a BibRecDocs object for the corresponding recid.""" recid = decompose_bibdocfile_url(url)[0] return BibRecDocs(recid) def bibdocfile_url_to_bibdoc(url): """Given an URL in the form CFG_SITE_[SECURE_]URL/CFG_SITE_RECORD/xxx/files/... it returns a BibDoc object for the corresponding recid/docname.""" docname = decompose_bibdocfile_url(url)[1] return bibdocfile_url_to_bibrecdocs(url).get_bibdoc(docname) def bibdocfile_url_to_bibdocfile(url): """Given an URL in the form CFG_SITE_[SECURE_]URL/CFG_SITE_RECORD/xxx/files/... it returns a BibDocFile object for the corresponding recid/docname/format.""" dummy, dummy, format = decompose_bibdocfile_url(url) return bibdocfile_url_to_bibdoc(url).get_file(format) def bibdocfile_url_to_fullpath(url): """Given an URL in the form CFG_SITE_[SECURE_]URL/CFG_SITE_RECORD/xxx/files/... 
it returns the fullpath for the corresponding recid/docname/format.""" return bibdocfile_url_to_bibdocfile(url).get_full_path() def bibdocfile_url_p(url): """Return True when the url is a potential valid url pointing to a fulltext owned by a system.""" if url.startswith('%s/getfile.py' % CFG_SITE_URL) or url.startswith('%s/getfile.py' % CFG_SITE_SECURE_URL): return True if not (url.startswith('%s/%s/' % (CFG_SITE_URL, CFG_SITE_RECORD)) or url.startswith('%s/%s/' % (CFG_SITE_SECURE_URL, CFG_SITE_RECORD))): return False splitted_url = url.split('/files/') return len(splitted_url) == 2 and splitted_url[0] != '' and splitted_url[1] != '' def get_docid_from_bibdocfile_fullpath(fullpath): """Given a bibdocfile fullpath (e.g. "CFG_WEBSUBMIT_FILEDIR/g0/123/bar.pdf;1") returns the docid (e.g. 123).""" if not fullpath.startswith(os.path.join(CFG_WEBSUBMIT_FILEDIR, 'g')): raise InvenioWebSubmitFileError, "Fullpath %s doesn't correspond to a valid bibdocfile fullpath" % fullpath dirname, base, extension, version = decompose_file_with_version(fullpath) try: return int(dirname.split('/')[-1]) except: raise InvenioWebSubmitFileError, "Fullpath %s doesn't correspond to a valid bibdocfile fullpath" % fullpath def decompose_bibdocfile_fullpath(fullpath): """Given a bibdocfile fullpath (e.g. 
"CFG_WEBSUBMIT_FILEDIR/g0/123/bar.pdf;1") returns a quadruple (recid, docname, format, version).""" if not fullpath.startswith(os.path.join(CFG_WEBSUBMIT_FILEDIR, 'g')): raise InvenioWebSubmitFileError, "Fullpath %s doesn't correspond to a valid bibdocfile fullpath" % fullpath dirname, base, extension, version = decompose_file_with_version(fullpath) try: docid = int(dirname.split('/')[-1]) bibdoc = BibDoc(docid) recid = bibdoc.get_recid() docname = bibdoc.get_docname() return recid, docname, extension, version except: raise InvenioWebSubmitFileError, "Fullpath %s doesn't correspond to a valid bibdocfile fullpath" % fullpath def decompose_bibdocfile_url(url): """Given a bibdocfile_url return a triple (recid, docname, format).""" if url.startswith('%s/getfile.py' % CFG_SITE_URL) or url.startswith('%s/getfile.py' % CFG_SITE_SECURE_URL): return decompose_bibdocfile_very_old_url(url) if url.startswith('%s/%s/' % (CFG_SITE_URL, CFG_SITE_RECORD)): recid_file = url[len('%s/%s/' % (CFG_SITE_URL, CFG_SITE_RECORD)):] elif url.startswith('%s/%s/' % (CFG_SITE_SECURE_URL, CFG_SITE_RECORD)): recid_file = url[len('%s/%s/' % (CFG_SITE_SECURE_URL, CFG_SITE_RECORD)):] else: raise InvenioWebSubmitFileError, "Url %s doesn't correspond to a valid record inside the system." % url recid_file = recid_file.replace('/files/', '/') recid, docname, format = decompose_file(urllib.unquote(recid_file)) if not recid and docname.isdigit(): ## If the URL was something similar to CFG_SITE_URL/CFG_SITE_RECORD/123 return (int(docname), '', '') return (int(recid), docname, format) re_bibdocfile_old_url = re.compile(r'/%s/(\d*)/files/' % CFG_SITE_RECORD) def decompose_bibdocfile_old_url(url): """Given a bibdocfile old url (e.g. 
CFG_SITE_URL/CFG_SITE_RECORD/123/files) it returns the recid.""" g = re_bibdocfile_old_url.search(url) if g: return int(g.group(1)) raise InvenioWebSubmitFileError('%s is not a valid old bibdocfile url' % url) def decompose_bibdocfile_very_old_url(url): """Decompose an old /getfile.py? URL""" if url.startswith('%s/getfile.py' % CFG_SITE_URL) or url.startswith('%s/getfile.py' % CFG_SITE_SECURE_URL): params = urllib.splitquery(url)[1] if params: try: params = cgi.parse_qs(params) if 'docid' in params: docid = int(params['docid'][0]) bibdoc = BibDoc(docid) recid = bibdoc.get_recid() docname = bibdoc.get_docname() elif 'recid' in params: recid = int(params['recid'][0]) if 'name' in params: docname = params['name'][0] else: docname = '' else: raise InvenioWebSubmitFileError('%s has not enough params to correspond to a bibdocfile.' % url) format = normalize_format(params.get('format', [''])[0]) return (recid, docname, format) except Exception, e: raise InvenioWebSubmitFileError('Problem with %s: %s' % (url, e)) else: raise InvenioWebSubmitFileError('%s has no params to correspond to a bibdocfile.' % url) else: raise InvenioWebSubmitFileError('%s is not a valid very old bibdocfile url' % url) def get_docname_from_url(url): """Return a potential docname given a url""" path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2] filename = os.path.split(path)[-1] return file_strip_ext(filename) def get_format_from_url(url): """Return a potential format given a url""" path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2] filename = os.path.split(path)[-1] return filename[len(file_strip_ext(filename)):] def clean_url(url): """Given a local url e.g. 
a local path, return its absolute path.""" if is_url_a_local_file(url): path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2] return os.path.abspath(path) else: return url def is_url_a_local_file(url): """Return True if the given URL is pointing to a local file.""" protocol = urllib2.urlparse.urlsplit(url)[0] return protocol in ('', 'file') def check_valid_url(url): """ Check for validity of a url or a file. @param url: the URL to check @type url: string @raise StandardError: if the URL is not a valid URL. """ try: if is_url_a_local_file(url): path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2] if os.path.abspath(path) != path: raise StandardError, "%s is not a normalized path (would be %s)." % (path, os.path.normpath(path)) for allowed_path in CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS + [CFG_TMPDIR, CFG_TMPSHAREDDIR, CFG_WEBSUBMIT_STORAGEDIR]: if path.startswith(allowed_path): dummy_fd = open(path) dummy_fd.close() return raise StandardError, "%s is not in one of the allowed paths." % path else: try: open_url(url) except InvenioBibdocfileUnauthorizedURL, e: raise StandardError, str(e) except Exception, e: raise StandardError, "%s is not a correct url: %s" % (url, e) def safe_mkstemp(suffix, prefix='bibdocfile_'): """Create a temporary filename that doesn't contain any '.' apart from the suffix.""" tmpfd, tmppath = tempfile.mkstemp(suffix=suffix, prefix=prefix, dir=CFG_TMPDIR) # Close the file and leave the responsibility to the client code to # correctly open/close it. os.close(tmpfd) if '.' not in suffix: # Just in case format is empty return tmppath while '.' in os.path.basename(tmppath)[:-len(suffix)]: os.remove(tmppath) tmpfd, tmppath = tempfile.mkstemp(suffix=suffix, prefix=prefix, dir=CFG_TMPDIR) os.close(tmpfd) return tmppath def download_local_file(filename, format=None): """ Copies a local file to Invenio's temporary directory. 
@param filename: the name of the file to copy @type filename: string @param format: the format of the file to copy (will be found if not specified) @type format: string @return: the path of the temporary file created @rtype: string @raise StandardError: if something went wrong """ # Make sure the format is OK. if format is None: format = guess_format_from_url(filename) else: format = normalize_format(format) tmppath = '' # Now try to copy. try: path = urllib2.urlparse.urlsplit(urllib.unquote(filename))[2] if os.path.abspath(path) != path: raise StandardError, "%s is not a normalized path (would be %s)." \ % (path, os.path.normpath(path)) for allowed_path in CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS + [CFG_TMPDIR, CFG_WEBSUBMIT_STORAGEDIR]: if path.startswith(allowed_path): tmppath = safe_mkstemp(format) shutil.copy(path, tmppath) if os.path.getsize(tmppath) == 0: os.remove(tmppath) raise StandardError, "%s seems to be empty" % filename break else: raise StandardError, "%s is not in one of the allowed paths." % path except Exception, e: raise StandardError, "Impossible to copy the local file '%s': %s" % \ (filename, str(e)) return tmppath def download_external_url(url, format=None): """ Download a url (if it corresponds to a remote file) and return a local url to it. @param url: the URL to download @type url: string @param format: the format of the file (will be found if not specified) @type format: string @return: the path to the download local file @rtype: string @raise StandardError: if the download failed """ tmppath = None # Make sure the format is OK. if format is None: # First try to find a known extension to the URL format = decompose_file(url, skip_version=True, only_known_extensions=True)[2] if not format: # No correct format could be found. Will try to get it from the # HTTP message headers. 
format = '' else: format = normalize_format(format) from_file, to_file, tmppath = None, None, '' try: from_file = open_url(url) except InvenioBibdocfileUnauthorizedURL, e: raise StandardError, str(e) except urllib2.URLError, e: raise StandardError, 'URL could not be opened: %s' % str(e) if not format: # We could not determine the format from the URL, so let's try # to read it from the HTTP headers. format = get_format_from_http_response(from_file) try: tmppath = safe_mkstemp(format) to_file = open(tmppath, 'w') while True: block = from_file.read(CFG_BIBDOCFILE_BLOCK_SIZE) if not block: break to_file.write(block) to_file.close() from_file.close() if os.path.getsize(tmppath) == 0: raise StandardError, "%s seems to be empty" % url except Exception, e: # Try to close and remove the temporary file. try: to_file.close() except Exception: pass try: os.remove(tmppath) except Exception: pass raise StandardError, "Error when downloading %s into %s: %s" % \ (url, tmppath, e) return tmppath def get_format_from_http_response(response): """ Tries to retrieve the format of the file from the message headers of the HTTP response. 
@param response: the HTTP response @type response: file-like object (as returned by urllib2.urlopen) @return: the format of the remote resource @rtype: string """ def parse_content_type(text): return text.split(';')[0].strip() def parse_content_disposition(text): for item in text.split(';'): item = item.strip() if item.strip().startswith('filename='): return item[len('filename="'):-len('"')] info = response.info() format = '' content_disposition = info.getheader('Content-Disposition') if content_disposition: filename = parse_content_disposition(content_disposition) if filename: format = decompose_file(filename)[2] content_type = info.getheader('Content-Type') if content_type: content_type = parse_content_type(content_type) ext = _mimes.guess_extension(content_type) if ext: format = normalize_format(ext) return format def download_url(url, format=None): """ Download a url (if it corresponds to a remote file) and return a local url to it. """ tmppath = None try: if is_url_a_local_file(url): tmppath = download_local_file(url, format=format) else: tmppath = download_external_url(url, format=format) except StandardError: raise return tmppath class BibDocMoreInfo: """ This class wraps contextual information of the documents, such as the - comments - descriptions - flags. Such information is kept separately for every format/version instance of the corresponding document and is serialized in the database, ready to be retrieved (but not searched). @param docid: the document identifier. @type docid: integer @param more_info: a serialized version of an already existing more_info object. If not specified this information will be read from the database, and otherwise an empty dictionary will be allocated. @raise ValueError: if docid is not a positive integer. @ivar docid: the document identifier as passed to the constructor. @type docid: integer @ivar more_info: the more_info dictionary that will hold all the additional document information. 
@type more_info: dict of dict of dict @note: in general this class is never instantiated in client code and never used outside the bibdocfile module. @note: this class will be extended in the future to hold all the new auxiliary information about a document. """ def __init__(self, docid, more_info=None): if not (type(docid) in (long, int) and docid > 0): raise ValueError("docid is not a positive integer, but %s." % docid) self.docid = docid if more_info is None: res = run_sql('SELECT more_info FROM bibdoc WHERE id=%s', (docid, )) if res and res[0][0]: self.more_info = cPickle.loads(blob_to_string(res[0][0])) else: self.more_info = {} else: self.more_info = cPickle.loads(more_info) if 'descriptions' not in self.more_info: self.more_info['descriptions'] = {} if 'comments' not in self.more_info: self.more_info['comments'] = {} if 'flags' not in self.more_info: self.more_info['flags'] = {} def __repr__(self): """ @return: the canonical string representation of the C{BibDocMoreInfo}. @rtype: string """ return 'BibDocMoreInfo(%i, %s)' % (self.docid, repr(cPickle.dumps(self.more_info))) def flush(self): """ Flush this object to the database. """ run_sql('UPDATE bibdoc SET more_info=%s WHERE id=%s', (cPickle.dumps(self.more_info), self.docid)) def set_flag(self, flagname, format, version): """ Sets a flag. @param flagname: the flag to set (see L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}). @type flagname: string @param format: the format for which the flag should be set. 
@type format: string @param version: the version for which the flag should set: @type version: integer @raise ValueError: if the flag is not in L{CFG_BIBDOCFILE_AVAILABLE_FLAGS} """ if flagname in CFG_BIBDOCFILE_AVAILABLE_FLAGS: if not flagname in self.more_info['flags']: self.more_info['flags'][flagname] = {} if not version in self.more_info['flags'][flagname]: self.more_info['flags'][flagname][version] = {} if not format in self.more_info['flags'][flagname][version]: self.more_info['flags'][flagname][version][format] = {} self.more_info['flags'][flagname][version][format] = True self.flush() else: raise ValueError, "%s is not in %s" % (flagname, CFG_BIBDOCFILE_AVAILABLE_FLAGS) def get_comment(self, format, version): """ Returns the specified comment. @param format: the format for which the comment should be retrieved. @type format: string @param version: the version for which the comment should be retrieved. @type version: integer @return: the specified comment. @rtype: string """ try: assert(type(version) is int) format = normalize_format(format) return self.more_info['comments'].get(version, {}).get(format) except: register_exception() raise def get_description(self, format, version): """ Returns the specified description. @param format: the format for which the description should be retrieved. @type format: string @param version: the version for which the description should be retrieved. @type version: integer @return: the specified description. @rtype: string """ try: assert(type(version) is int) format = normalize_format(format) return self.more_info['descriptions'].get(version, {}).get(format) except: register_exception() raise def has_flag(self, flagname, format, version): """ Return True if the corresponding has been set. @param flagname: the name of the flag (see L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}). @type flagname: string @param format: the format for which the flag should be checked. 
@type format: string @param version: the version for which the flag should be checked. @type version: integer @return: True if the flag is set for the given format/version. @rtype: bool @raise ValueError: if the flagname is not in L{CFG_BIBDOCFILE_AVAILABLE_FLAGS} """ if flagname in CFG_BIBDOCFILE_AVAILABLE_FLAGS: return self.more_info['flags'].get(flagname, {}).get(version, {}).get(format, False) else: raise ValueError, "%s is not in %s" % (flagname, CFG_BIBDOCFILE_AVAILABLE_FLAGS) def get_flags(self, format, version): """ Return the list of all the enabled flags. @param format: the format for which the list should be returned. @type format: string @param version: the version for which the list should be returned. @type version: integer @return: the list of enabled flags (from L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}). @rtype: list of string """ return [flag for flag in self.more_info['flags'] if format in self.more_info['flags'][flag].get(version, {})] def set_comment(self, comment, format, version): """ Set a comment. @param comment: the comment to be set. @type comment: string @param format: the format for which the comment should be set. @type format: string @param version: the version for which the comment should be set: @type version: integer """ try: assert(type(version) is int and version > 0) format = normalize_format(format) if comment == KEEP_OLD_VALUE: comment = self.get_comment(format, version) or self.get_comment(format, version - 1) if not comment: self.unset_comment(format, version) self.flush() return if not version in self.more_info['comments']: self.more_info['comments'][version] = {} self.more_info['comments'][version][format] = comment self.flush() except: register_exception() raise def set_description(self, description, format, version): """ Set a description. @param description: the description to be set. @type description: string @param format: the format for which the description should be set. 
@type format: string @param version: the version for which the description should be set: @type version: integer """ try: assert(type(version) is int and version > 0) format = normalize_format(format) if description == KEEP_OLD_VALUE: description = self.get_description(format, version) or self.get_description(format, version - 1) if not description: self.unset_description(format, version) self.flush() return if not version in self.more_info['descriptions']: self.more_info['descriptions'][version] = {} self.more_info['descriptions'][version][format] = description self.flush() except: register_exception() raise def unset_comment(self, format, version): """ Unset a comment. @param format: the format for which the comment should be unset. @type format: string @param version: the version for which the comment should be unset: @type version: integer """ try: assert(type(version) is int and version > 0) del self.more_info['comments'][version][format] self.flush() except KeyError: pass except: register_exception() raise def unset_description(self, format, version): """ Unset a description. @param format: the format for which the description should be unset. @type format: string @param version: the version for which the description should be unset: @type version: integer """ try: assert(type(version) is int and version > 0) del self.more_info['descriptions'][version][format] self.flush() except KeyError: pass except: register_exception() raise def unset_flag(self, flagname, format, version): """ Unset a flag. @param flagname: the flag to be unset (see L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}). @type flagname: string @param format: the format for which the flag should be unset. 
@type format: string @param version: the version for which the flag should be unset. @type version: integer @raise ValueError: if the flag is not in L{CFG_BIBDOCFILE_AVAILABLE_FLAGS} """ if flagname in CFG_BIBDOCFILE_AVAILABLE_FLAGS: try: del self.more_info['flags'][flagname][version][format] self.flush() except KeyError: pass else: raise ValueError, "%s is not in %s" % (flagname, CFG_BIBDOCFILE_AVAILABLE_FLAGS) def serialize(self): """ @return: the serialized version of this object. @rtype: string """ return cPickle.dumps(self.more_info) def readfile(filename): """ Read a file. @param filename: the name of the file to be read. @type filename: string @return: the text contained in the file. @rtype: string @note: returns an empty string in case of any error. @note: this function is useful for quick implementation of websubmit functions. """ try: return open(filename).read() except Exception: return '' class HeadRequest(urllib2.Request): """ A request object to perform a HEAD request. """ def get_method(self): return 'HEAD' def open_url(url, headers=None, head_request=False): """ Opens a URL. If headers are passed as argument, no check is performed and the URL will be opened. Otherwise checks if the URL is present in CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS and uses the headers specified in the config variable. @param url: the URL to open @type url: string @param headers: the headers to use @type headers: dictionary @param head_request: if True, perform a HEAD request, otherwise a GET request @type head_request: boolean @return: a file-like object as returned by urllib2.urlopen. """ headers_to_use = None if headers is None: for regex, headers in _CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS: if regex.match(url) is not None: headers_to_use = headers break if headers_to_use is None: # URL is not allowed. raise InvenioBibdocfileUnauthorizedURL, "%s is not an authorized " \ "external URL."
% url else: headers_to_use = headers request_obj = head_request and HeadRequest or urllib2.Request request = request_obj(url) for key, value in headers_to_use.items(): request.add_header(key, value) return urllib2.urlopen(request) diff --git a/modules/websubmit/lib/bibdocfilecli.py b/modules/websubmit/lib/bibdocfilecli.py index 9230f9e69..b32385c2a 100644 --- a/modules/websubmit/lib/bibdocfilecli.py +++ b/modules/websubmit/lib/bibdocfilecli.py @@ -1,1140 +1,1190 @@ # -*- coding: utf-8 -*- ## ## This file is part of Invenio. ## Copyright (C) 2008, 2009, 2010, 2011 CERN. ## ## Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
""" BibDocAdmin CLI administration tool """ __revision__ = "$Id$" import sys import re import os import time import fnmatch import time from datetime import datetime from logging import getLogger, debug, DEBUG from optparse import OptionParser, OptionGroup, OptionValueError from tempfile import mkstemp from invenio.errorlib import register_exception from invenio.config import CFG_TMPDIR, CFG_SITE_URL, CFG_WEBSUBMIT_FILEDIR, \ CFG_SITE_RECORD, CFG_TMPSHAREDDIR from invenio.bibdocfile import BibRecDocs, BibDoc, InvenioWebSubmitFileError, \ nice_size, check_valid_url, clean_url, get_docname_from_url, \ guess_format_from_url, KEEP_OLD_VALUE, decompose_bibdocfile_fullpath, \ bibdocfile_url_to_bibdoc, decompose_bibdocfile_url, CFG_BIBDOCFILE_AVAILABLE_FLAGS from invenio.intbitset import intbitset from invenio.search_engine import perform_request_search from invenio.textutils import wrap_text_in_a_box, wait_for_user from invenio.dbquery import run_sql from invenio.bibtask import task_low_level_submission from invenio.textutils import encode_for_xml from invenio.websubmit_file_converter import can_perform_ocr def _xml_mksubfield(key, subfield, fft): return fft.get(key, None) is not None and '\t\t<subfield code="%s">%s</subfield>\n' % (subfield, encode_for_xml(str(fft[key]))) or '' def _xml_mksubfields(key, subfield, fft): ret = "" for value in fft.get(key, []): ret += '\t\t<subfield code="%s">%s</subfield>\n' % (subfield, encode_for_xml(str(value))) return ret def _xml_fft_creator(fft): """Transform an fft dictionary (made of the keys url, docname, format, new_docname, comment, description, restriction, doctype) into an XML string.""" debug('Input FFT structure: %s' % fft) out = '\t<datafield tag="FFT" ind1=" " ind2=" ">\n' out += _xml_mksubfield('url', 'a', fft) out += _xml_mksubfield('docname', 'n', fft) out += _xml_mksubfield('format', 'f', fft) out += _xml_mksubfield('new_docname', 'm', fft) out += _xml_mksubfield('doctype', 't', fft) out += _xml_mksubfield('description', 
'd', fft) out += _xml_mksubfield('comment', 'z', fft) out += _xml_mksubfield('restriction', 'r', fft)
out += _xml_mksubfields('options', 'o', fft) out += _xml_mksubfield('version', 'v', fft) out += '\t</datafield>\n' debug('FFT created: %s' % out) return out def ffts_to_xml(ffts_dict): """Transform a dictionary: recid -> ffts, where ffts is a list of fft dictionaries, into XML. """ debug('Input FFTs dictionary: %s' % ffts_dict) out = '' recids = ffts_dict.keys() recids.sort() for recid in recids: ffts = ffts_dict[recid] if ffts: out += '<record>\n' out += '\t<controlfield tag="001">%i</controlfield>\n' % recid for fft in ffts: out += _xml_fft_creator(fft) out += '</record>\n' debug('MARC to Upload: %s' % out) return out _shift_re = re.compile("([-\+]{0,1})([\d]+)([dhms])") def _parse_datetime(var): """Returns a datetime object parsed from var. It can handle normal date strings ("%Y-%m-%d %H:%M:%S") and shifts with respect to now (e.g. "-5m").""" if not var: return None date = time.time() factors = {"d":24*3600, "h":3600, "m":60, "s":1} m = _shift_re.match(var) if m: sign = m.groups()[0] == "-" and -1 or 1 factor = factors[m.groups()[2]] value = float(m.groups()[1]) return datetime.fromtimestamp(date + sign * factor * value) else: return datetime(*(time.strptime(var, "%Y-%m-%d %H:%M:%S")[0:6])) # The code above is Python 2.4 compatible. The following is the 2.5 # version.
# return datetime.strptime(var, "%Y-%m-%d %H:%M:%S") def _parse_date_range(var): """Returns the two dates contained as a low,high tuple""" limits = var.split(",") if len(limits)==1: low = _parse_datetime(limits[0]) return low, None if len(limits)==2: low = _parse_datetime(limits[0]) high = _parse_datetime(limits[1]) return low, high return None, None def cli_quick_match_all_recids(options): """Quickly return an approximate (by excess) list of good recids.""" url = getattr(options, 'url', None) if url: return intbitset([decompose_bibdocfile_url(url)[0]]) path = getattr(options, 'path', None) if path: return intbitset([decompose_bibdocfile_fullpath(path)[0]]) collection = getattr(options, 'collection', None) pattern = getattr(options, 'pattern', None) recids = getattr(options, 'recids', None) md_rec = getattr(options, 'md_rec', None) cd_rec = getattr(options, 'cd_rec', None) tmp_date_query = [] tmp_date_params = [] if recids is None: debug('Initially considering all the recids') recids = intbitset(run_sql('SELECT id FROM bibrec')) if not recids: print >> sys.stderr, 'WARNING: No record in the database' if md_rec[0] is not None: tmp_date_query.append('modification_date>=%s') tmp_date_params.append(md_rec[0]) if md_rec[1] is not None: tmp_date_query.append('modification_date<=%s') tmp_date_params.append(md_rec[1]) if cd_rec[0] is not None: tmp_date_query.append('creation_date>=%s') tmp_date_params.append(cd_rec[0]) if cd_rec[1] is not None: tmp_date_query.append('creation_date<=%s') tmp_date_params.append(cd_rec[1]) if tmp_date_query: tmp_date_query = ' AND '.join(tmp_date_query) tmp_date_params = tuple(tmp_date_params) query = 'SELECT id FROM bibrec WHERE %s' % tmp_date_query debug('Query: %s, param: %s' % (query, tmp_date_params)) recids &= intbitset(run_sql(query, tmp_date_params)) debug('After applying dates we obtain recids: %s' % recids) if not recids: print >> sys.stderr, 'WARNING: Time constraints for records are too strict' if
collection or pattern: recids &= intbitset(perform_request_search(cc=collection or '', p=pattern or '')) debug('After applying pattern and collection we obtain recids: %s' % recids) debug('Quick recids: %s' % recids) return recids def cli_quick_match_all_docids(options, recids=None): """Quickly return an approximate (by excess) list of good docids.""" url = getattr(options, 'url', None) if url: return intbitset([bibdocfile_url_to_bibdoc(url).get_id()]) path = getattr(options, 'path', None) if path: return intbitset([decompose_bibdocfile_fullpath(path)[0]]) deleted_docs = getattr(options, 'deleted_docs', None) action_undelete = getattr(options, 'action', None) == 'undelete' docids = getattr(options, 'docids', None) md_doc = getattr(options, 'md_doc', None) cd_doc = getattr(options, 'cd_doc', None) if docids is None: debug('Initially considering all the docids') if recids is None: recids = cli_quick_match_all_recids(options) docids = intbitset() for id_bibrec, id_bibdoc in run_sql('SELECT id_bibrec, id_bibdoc FROM bibrec_bibdoc'): if id_bibrec in recids: docids.add(id_bibdoc) else: debug('Initially considering these docids: %s' % docids) tmp_query = [] tmp_params = [] if deleted_docs is None and action_undelete: deleted_docs = 'only' if deleted_docs == 'no': tmp_query.append('status<>"DELETED"') elif deleted_docs == 'only': tmp_query.append('status="DELETED"') if md_doc[0] is not None: tmp_query.append('modification_date>=%s') tmp_params.append(md_doc[0]) if md_doc[1] is not None: tmp_query.append('modification_date<=%s') tmp_params.append(md_doc[1]) if cd_doc[0] is not None: tmp_query.append('creation_date>=%s') tmp_params.append(cd_doc[0]) if cd_doc[1] is not None: tmp_query.append('creation_date<=%s') tmp_params.append(cd_doc[1]) if tmp_query: tmp_query = ' AND '.join(tmp_query) tmp_params = tuple(tmp_params) query = 'SELECT id FROM bibdoc WHERE %s' % tmp_query debug('Query: %s, param: %s' % (query, tmp_params)) docids &= intbitset(run_sql(query,
tmp_params)) debug('After applying dates we obtain docids: %s' % docids) return docids def cli_slow_match_single_recid(options, recid, recids=None, docids=None): """Apply all the given queries in order to assert whether a recid matches or not. If docids is given, the recid is matched only if it has at least one docid that is matched""" debug('cli_slow_match_single_recid checking: %s' % recid) deleted_docs = getattr(options, 'deleted_docs', None) deleted_recs = getattr(options, 'deleted_recs', None) empty_recs = getattr(options, 'empty_recs', None) docname = cli2docname(options) bibrecdocs = BibRecDocs(recid, deleted_too=(deleted_docs != 'no')) if bibrecdocs.deleted_p() and (deleted_recs == 'no'): return False elif not bibrecdocs.deleted_p() and (deleted_recs != 'only'): if docids: for bibdoc in bibrecdocs.list_bibdocs(): if bibdoc.get_id() in docids: break else: return False if docname: for other_docname in bibrecdocs.get_bibdoc_names(): if docname and fnmatch.fnmatchcase(other_docname, docname): break else: return False if bibrecdocs.empty_p() and (empty_recs != 'no'): return True elif not bibrecdocs.empty_p() and (empty_recs != 'only'): return True return False def cli_slow_match_single_docid(options, docid, recids=None, docids=None): """Apply all the given queries in order to assert whether a docid matches or not.""" debug('cli_slow_match_single_docid checking: %s' % docid) empty_docs = getattr(options, 'empty_docs', None) docname = cli2docname(options) if recids is None: recids = cli_quick_match_all_recids(options) bibdoc = BibDoc(docid) if docname and not fnmatch.fnmatchcase(bibdoc.get_docname(), docname): debug('docname %s does not match the pattern %s' % (repr(bibdoc.get_docname()), repr(docname))) return False elif bibdoc.get_recid() and bibdoc.get_recid() not in recids: debug('recid %s is not in pattern %s' % (repr(bibdoc.get_recid()), repr(recids))) return False elif empty_docs == 'no' and bibdoc.empty_p(): debug('bibdoc is empty') return False elif
empty_docs == 'only' and not bibdoc.empty_p(): debug('bibdoc is not empty') return False else: return True def cli2recid(options, recids=None, docids=None): """Given the command line options return a recid.""" recids = list(cli_recids_iterator(options, recids=recids, docids=docids)) if len(recids) == 1: return recids[0] if recids: raise StandardError, "More than one recid has been matched: %s" % recids else: raise StandardError, "No recids matched" def cli2docid(options, recids=None, docids=None): """Given the command line options return a docid.""" docids = list(cli_docids_iterator(options, recids=recids, docids=docids)) if len(docids) == 1: return docids[0] if docids: raise StandardError, "More than one docid has been matched: %s" % docids else: raise StandardError, "No docids matched" def cli2flags(options): """ Transform a comma separated list of flags into a list of valid flags. """ flags = getattr(options, 'flags', None) if flags: flags = [flag.strip().upper() for flag in flags.split(',')] for flag in flags: if flag not in CFG_BIBDOCFILE_AVAILABLE_FLAGS: raise StandardError("%s is not among the valid flags: %s" % (flag, ', '.join(CFG_BIBDOCFILE_AVAILABLE_FLAGS))) return flags return [] def cli2description(options): """Return a good value for the description.""" description = getattr(options, 'set_description', None) if description is None: description = KEEP_OLD_VALUE return description def cli2restriction(options): """Return a good value for the restriction.""" restriction = getattr(options, 'set_restriction', None) if restriction is None: restriction = KEEP_OLD_VALUE return restriction def cli2comment(options): """Return a good value for the comment.""" comment = getattr(options, 'set_comment', None) if comment is None: comment = KEEP_OLD_VALUE return comment def cli2doctype(options): """Return a good value for the doctype.""" doctype = getattr(options, 'set_doctype', None) if not doctype: return 'Main' return doctype def cli2docname(options, docid=None, 
url=None): """Given the command line options and optional precalculated docid returns the corresponding docname.""" if docid: bibdoc = BibDoc(docid=docid) return bibdoc.get_docname() docname = getattr(options, 'docname', None) if docname is not None: return docname if url is not None: return get_docname_from_url(url) else: return None def cli2format(options, url=None): """Given the command line options returns the corresponding format.""" format = getattr(options, 'format', None) if format is not None: return format elif url is not None: ## FIXME: to deploy once conversion-tools branch is merged #return guess_format_from_url(url) return guess_format_from_url(url) else: raise OptionValueError("Not enough information to retrieve a valid format") def cli_recids_iterator(options, recids=None, docids=None): """Slow iterator over all the matched recids. if with_docids is True, the recid must be attached to at least a matched docid""" debug('cli_recids_iterator') if recids is None: recids = cli_quick_match_all_recids(options) debug('working on recids: %s, docids: %s' % (recids, docids)) for recid in recids: if cli_slow_match_single_recid(options, recid, recids, docids): yield recid raise StopIteration def cli_docids_iterator(options, recids=None, docids=None): """Slow iterator over all the matched docids.""" if recids is None: recids = cli_quick_match_all_recids(options) if docids is None: docids = cli_quick_match_all_docids(options, recids) for docid in docids: if cli_slow_match_single_docid(options, docid, recids, docids): yield docid raise StopIteration +def cli_get_stats(dummy): + """Print per every collection some stats""" + def print_table(title, table): + if table: + print "=" * 20, title, "=" * 20 + for row in table: + print "\t".join(str(elem) for elem in row) + + for collection, reclist in run_sql("SELECT name, reclist FROM collection ORDER BY name"): + print "-" * 79 + print "Statistic for: %s " % collection + reclist = intbitset(reclist) + if reclist: + 
sqlreclist = "(" + ','.join(str(elem) for elem in reclist) + ')' + print_table("Formats", run_sql("SELECT COUNT(format) as c, format FROM bibrec_bibdoc AS bb JOIN bibdocfsinfo AS fs ON bb.id_bibdoc=fs.id_bibdoc WHERE id_bibrec in %s AND last_version=true GROUP BY format ORDER BY c DESC" % sqlreclist)) # kwalitee: disable=sql + print_table("Mimetypes", run_sql("SELECT COUNT(mime) as c, mime FROM bibrec_bibdoc AS bb JOIN bibdocfsinfo AS fs ON bb.id_bibdoc=fs.id_bibdoc WHERE id_bibrec in %s AND last_version=true GROUP BY mime ORDER BY c DESC" % sqlreclist)) # kwalitee: disable=sql + print_table("Sizes", run_sql("SELECT SUM(filesize) AS c FROM bibrec_bibdoc AS bb JOIN bibdocfsinfo AS fs ON bb.id_bibdoc=fs.id_bibdoc WHERE id_bibrec in %s AND last_version=true" % sqlreclist)) # kwalitee: disable=sql + class OptionParserSpecial(OptionParser): def format_help(self, *args, **kwargs): result = OptionParser.format_help(self, *args, **kwargs) if hasattr(self, 'trailing_text'): return "%s\n%s\n" % (result, self.trailing_text) else: return result def prepare_option_parser(): """Parse the command line options.""" def _ids_ranges_callback(option, opt, value, parser): """Callback for optparse to parse a set of ids ranges in the form nnn1-nnn2,mmm1-mmm2... returning the corresponding intbitset. """ try: debug('option: %s, opt: %s, value: %s, parser: %s' % (option, opt, value, parser)) debug('Parsing range: %s' % value) value = ranges2ids(value) setattr(parser.values, option.dest, value) except Exception, e: raise OptionValueError("It's impossible to parse the range '%s' for option %s: %s" % (value, opt, e)) def _date_range_callback(option, opt, value, parser): """Callback for optparse to parse a range of dates in the form [date1],[date2]. Both date1 and date2 could be optional. 
The date can be expressed absolutely ("%Y-%m-%d %H:%M:%S") or relatively (([-\+]{0,1})([\d]+)([dhms])) to the current time.""" try: value = _parse_date_range(value) setattr(parser.values, option.dest, value) except Exception, e: raise OptionValueError("It's impossible to parse the range '%s' for option %s: %s" % (value, opt, e)) parser = OptionParserSpecial(usage="usage: %prog [options]", #epilog="""With the query options you select the range of records/docnames/single files to work on. Note that some actions, e.g. delete, append, revise etc., work at the docname level, while others like --set-comment, --set-description work at single file level, and others can be applied in an iterative way to many records in a single run. Note that specifying docid(s) takes precedence over recid(s), which in turn takes precedence over pattern/collection search.""", version=__revision__) parser.trailing_text = """ Examples: $ bibdocfile --append foo.tar.gz --recid=1 $ bibdocfile --revise http://foo.com?search=123 --with-docname='sam' --with-format=pdf --recid=3 --set-docname='pippo' # revise for record 3 # the document sam, renaming it to pippo.
$ bibdocfile --delete --with-docname="*sam" --all # delete all documents # ending # with "sam" $ bibdocfile --undelete -c "Test Collection" # undelete documents for # the collection $ bibdocfile --get-info --recids=1-4,6-8 # obtain information $ bibdocfile -r 1 --with-docname=foo --set-docname=bar # Rename a document $ bibdocfile -r 1 --set-restriction "firerole: deny until '2011-01-01' allow any" # set an embargo on all the documents attached to record 1 # (note the ^M or \\n before 'allow any') # See also $r subfield in <%(site)s/help/admin/bibupload-admin-guide#3.6> # and Firerole in <%(site)s/help/admin/webaccess-admin-guide#6> $ bibdocfile --append x.pdf --recid=1 --with-flags='PDF/A,OCRED' # append # to record 1 the file x.pdf specifying the PDF/A and OCRED flags """ % {'site': CFG_SITE_URL} query_options = OptionGroup(parser, 'Query options') query_options.add_option('-r', '--recids', action="callback", callback=_ids_ranges_callback, type='string', dest='recids', help='matches records by recids, e.g.: --recids=1-3,5-7') query_options.add_option('-d', '--docids', action="callback", callback=_ids_ranges_callback, type='string', dest='docids', help='matches documents by docids, e.g.: --docids=1-3,5-7') query_options.add_option('-a', '--all', action='store_true', dest='all', help='Select all the records') query_options.add_option("--with-deleted-recs", choices=['yes', 'no', 'only'], type="choice", dest="deleted_recs", help="'Yes' to also match deleted records, 'no' to exclude them, 'only' to match only deleted ones", metavar="yes/no/only", default='no') query_options.add_option("--with-deleted-docs", choices=['yes', 'no', 'only'], type="choice", dest="deleted_docs", help="'Yes' to also match deleted documents, 'no' to exclude them, 'only' to match only deleted ones (e.g.
for undeletion)", metavar="yes/no/only", default='no') query_options.add_option("--with-empty-recs", choices=['yes', 'no', 'only'], type="choice", dest="empty_recs", help="'Yes' to also match records without attached documents, 'no' to exclude them, 'only' to consider only such records (e.g. for statistics)", metavar="yes/no/only", default='no') query_options.add_option("--with-empty-docs", choices=['yes', 'no', 'only'], type="choice", dest="empty_docs", help="'Yes' to also match documents without attached files, 'no' to exclude them, 'only' to consider only such documents (e.g. for sanity checking)", metavar="yes/no/only", default='no') query_options.add_option("--with-record-modification-date", action="callback", callback=_date_range_callback, dest="md_rec", nargs=1, type="string", default=(None, None), help="matches records modified date1 and date2; dates can be expressed relatively, e.g.:\"-5m,2030-2-23 04:40\" # matches records modified since 5 minutes ago until the 2030...", metavar="date1,date2") query_options.add_option("--with-record-creation-date", action="callback", callback=_date_range_callback, dest="cd_rec", nargs=1, type="string", default=(None, None), help="matches records created between date1 and date2; dates can be expressed relatively", metavar="date1,date2") query_options.add_option("--with-document-modification-date", action="callback", callback=_date_range_callback, dest="md_doc", nargs=1, type="string", default=(None, None), help="matches documents modified between date1 and date2; dates can be expressed relatively", metavar="date1,date2") query_options.add_option("--with-document-creation-date", action="callback", callback=_date_range_callback, dest="cd_doc", nargs=1, type="string", default=(None, None), help="matches documents created between date1 and date2; dates can be expressed relatively", metavar="date1,date2") query_options.add_option("--url", dest="url", help='matches the document referred by the URL, e.g. 
"%s/%s/1/files/foobar.pdf?version=2"' % (CFG_SITE_URL, CFG_SITE_RECORD)) query_options.add_option("--path", dest="path", help='matches the document referred by the internal filesystem path, e.g. %s/g0/1/foobar.pdf\\;1' % CFG_WEBSUBMIT_FILEDIR) query_options.add_option("--with-docname", dest="docname", help='matches documents with the given docname (accept wildcards)') query_options.add_option("--with-doctype", dest="doctype", help='matches documents with the given doctype') query_options.add_option('-p', '--pattern', dest='pattern', help='matches records by pattern') query_options.add_option('-c', '--collection', dest='collection', help='matches records by collection') query_options.add_option('--force', dest='force', help='force an action even when it\'s not necessary e.g. textify on an already textified bibdoc.', action='store_true', default=False) parser.add_option_group(query_options) getting_information_options = OptionGroup(parser, 'Actions for getting information') getting_information_options.add_option('--get-info', dest='action', action='store_const', const='get-info', help='print all the informations about the matched record/documents') getting_information_options.add_option('--get-disk-usage', dest='action', action='store_const', const='get-disk-usage', help='print disk usage statistics of the matched documents') getting_information_options.add_option('--get-history', dest='action', action='store_const', const='get-history', help='print the matched documents history') + getting_information_options.add_option('--get-stats', dest='action', action='store_const', const='get-stats', help='print some statistics of file properties grouped by collections') parser.add_option_group(getting_information_options) setting_information_options = OptionGroup(parser, 'Actions for setting information') setting_information_options.add_option('--set-doctype', dest='set_doctype', help='specify the new doctype', metavar='doctype') 
setting_information_options.add_option('--set-description', dest='set_description', help='specify a description', metavar='description') setting_information_options.add_option('--set-comment', dest='set_comment', help='specify a comment', metavar='comment') setting_information_options.add_option('--set-restriction', dest='set_restriction', help='specify a restriction tag', metavar='restriction') setting_information_options.add_option('--set-docname', dest='new_docname', help='specifies a new docname for renaming', metavar='docname') setting_information_options.add_option("--unset-comment", action="store_const", const='', dest="set_comment", help="remove any comment") setting_information_options.add_option("--unset-descriptions", action="store_const", const='', dest="set_description", help="remove any description") setting_information_options.add_option("--unset-restrictions", action="store_const", const='', dest="set_restriction", help="remove any restriction") setting_information_options.add_option("--hide", dest="action", action='store_const', const='hide', help="hides matched documents and revisions") setting_information_options.add_option("--unhide", dest="action", action='store_const', const='unhide', help="unhides matched documents and revisions") parser.add_option_group(setting_information_options) revising_options = OptionGroup(parser, 'Actions for revising content') revising_options.add_option("--append", dest='append_path', help='specify the URL/path of the file that will be appended to the bibdoc (implies --with-empty-recs=yes)', metavar='PATH/URL') revising_options.add_option("--revise", dest='revise_path', help='specify the URL/path of the file that will revise the bibdoc', metavar='PATH/URL') revising_options.add_option("--revert", dest='action', action='store_const', const='revert', help='reverts a document to the specified version') revising_options.add_option("--delete", action='store_const', const='delete', dest='action', help='soft-delete the matched
documents') revising_options.add_option("--hard-delete", action='store_const', const='hard-delete', dest='action', help='hard-delete the single matched document with a specific format and a specific revision (this operation is not reversible)') revising_options.add_option("--undelete", action='store_const', const='undelete', dest='action', help='undelete previously soft-deleted documents') revising_options.add_option("--purge", action='store_const', const='purge', dest='action', help='purge (i.e. hard-delete any format of any version prior to the latest version of) the matched documents') revising_options.add_option("--expunge", action='store_const', const='expunge', dest='action', help='expunge (i.e. hard-delete any version and formats of) the matched documents') revising_options.add_option("--with-version", dest="version", help="specifies the version(s) to be used with hide, unhide, e.g.: 1-2,3 or ALL. Specifies the version to be used with hard-delete and revert, e.g. 2") revising_options.add_option("--with-format", dest="format", help='to specify a format when appending/revising/deleting/reverting a document, e.g. "pdf"', metavar='FORMAT') revising_options.add_option("--with-hide-previous", dest='hide_previous', action='store_true', help='when revising, hides previous versions', default=False) revising_options.add_option("--with-flags", dest='flags', help='comma-separated optional list of flags used when appending/revising a document.
Valid flags are: %s' % ', '.join(CFG_BIBDOCFILE_AVAILABLE_FLAGS), default=None) parser.add_option_group(revising_options) housekeeping_options = OptionGroup(parser, 'Actions for housekeeping') housekeeping_options.add_option("--check-md5", action='store_const', const='check-md5', dest='action', help='check md5 checksum validity of files') housekeeping_options.add_option("--check-format", action='store_const', const='check-format', dest='action', help='check if any format-related inconsistences exists') housekeeping_options.add_option("--check-duplicate-docnames", action='store_const', const='check-duplicate-docnames', dest='action', help='check for duplicate docnames associated with the same record') housekeeping_options.add_option("--update-md5", action='store_const', const='update-md5', dest='action', help='update md5 checksum of files') housekeeping_options.add_option("--fix-all", action='store_const', const='fix-all', dest='action', help='fix inconsistences in filesystem vs database vs MARC') housekeeping_options.add_option("--fix-marc", action='store_const', const='fix-marc', dest='action', help='synchronize MARC after filesystem/database') housekeeping_options.add_option("--fix-format", action='store_const', const='fix-format', dest='action', help='fix format related inconsistences') housekeeping_options.add_option("--fix-duplicate-docnames", action='store_const', const='fix-duplicate-docnames', dest='action', help='fix duplicate docnames associated with the same record') + housekeeping_options.add_option("--fix-bibdocfsinfo-cache", action='store_const', const='fix-bibdocfsinfo-cache', dest='action', help='fix bibdocfsinfo cache related inconsistences') parser.add_option_group(housekeeping_options) experimental_options = OptionGroup(parser, 'Experimental options (do not expect to find them in the next release)') experimental_options.add_option('--textify', dest='action', action='store_const', const='textify', help='extract text from matched documents and 
store it for later indexing') experimental_options.add_option('--with-ocr', dest='perform_ocr', action='store_true', default=False, help='when used with --textify, wether to perform OCR') parser.add_option_group(experimental_options) parser.add_option('-D', '--debug', action='store_true', dest='debug', default=False) parser.add_option('-H', '--human-readable', dest='human_readable', action='store_true', default=False, help='print sizes in human readable format (e.g., 1KB 234MB 2GB)') parser.add_option('--yes-i-know', action='store_true', dest='yes-i-know', help='use with care!') return parser def print_info(recid, docid, info): """Nicely print info about a recid, docid pair.""" print '%i:%i:%s' % (recid, docid, info) def bibupload_ffts(ffts, append=False, debug=False, interactive=True): """Given an ffts dictionary it creates the xml and submit it.""" xml = ffts_to_xml(ffts) if xml: if interactive: print xml tmp_file_fd, tmp_file_name = mkstemp(suffix='.xml', prefix="bibdocfile_%s" % time.strftime("%Y-%m-%d_%H:%M:%S"), dir=CFG_TMPSHAREDDIR) os.write(tmp_file_fd, xml) os.close(tmp_file_fd) os.chmod(tmp_file_name, 0644) if append: if interactive: wait_for_user("This will be appended via BibUpload") if debug: task = task_low_level_submission('bibupload', 'bibdocfile', '-a', tmp_file_name, '-N', 'FFT', '-S2', '-v9') else: task = task_low_level_submission('bibupload', 'bibdocfile', '-a', tmp_file_name, '-N', 'FFT', '-S2') if interactive: print "BibUpload append submitted with id %s" % task else: if interactive: wait_for_user("This will be corrected via BibUpload") if debug: task = task_low_level_submission('bibupload', 'bibdocfile', '-c', tmp_file_name, '-N', 'FFT', '-S2', '-v9') else: task = task_low_level_submission('bibupload', 'bibdocfile', '-c', tmp_file_name, '-N', 'FFT', '-S2') if interactive: print "BibUpload correct submitted with id %s" % task elif interactive: print >> sys.stderr, "WARNING: no MARC to upload." 
return True def ranges2ids(parse_string): """Parse a string and return the intbitset of the corresponding ids.""" ids = intbitset() ranges = parse_string.split(",") for arange in ranges: tmp_ids = arange.split("-") if len(tmp_ids)==1: ids.add(int(tmp_ids[0])) else: if int(tmp_ids[0]) > int(tmp_ids[1]): # sanity check tmp = tmp_ids[0] tmp_ids[0] = tmp_ids[1] tmp_ids[1] = tmp ids += xrange(int(tmp_ids[0]), int(tmp_ids[1]) + 1) return ids def cli_append(options, append_path): """Create a bibupload FFT task submission for appending a format.""" recid = cli2recid(options) comment = cli2comment(options) description = cli2description(options) restriction = cli2restriction(options) doctype = cli2doctype(options) docname = cli2docname(options, url=append_path) flags = cli2flags(options) if not docname: raise OptionValueError, 'Not enough information to retrieve a valid docname' format = cli2format(options, append_path) url = clean_url(append_path) check_valid_url(url) bibrecdocs = BibRecDocs(recid) if bibrecdocs.has_docname_p(docname) and bibrecdocs.get_bibdoc(docname).format_already_exists_p(format): new_docname = bibrecdocs.propose_unique_docname(docname) wait_for_user("WARNING: a document with name %s and format %s already exists for recid %s. A new document with name %s will be created instead." 
% (repr(docname), repr(format), repr(recid), repr(new_docname))) docname = new_docname ffts = {recid: [{ 'docname' : docname, 'comment' : comment, 'description' : description, 'restriction' : restriction, 'doctype' : doctype, 'format' : format, 'url' : url, 'options': flags }]} return bibupload_ffts(ffts, append=True) def cli_revise(options, revise_path): """Create a bibupload FFT task submission for revising a document.""" recid = cli2recid(options) comment = cli2comment(options) description = cli2description(options) restriction = cli2restriction(options) docname = cli2docname(options, url=revise_path) hide_previous = getattr(options, 'hide_previous', None) flags = cli2flags(options) if hide_previous and 'PERFORM_HIDE_PREVIOUS' not in flags: flags.append('PERFORM_HIDE_PREVIOUS') if not docname: raise OptionValueError, 'Not enough information to retrieve a valid docname' format = cli2format(options, revise_path) doctype = cli2doctype(options) url = clean_url(revise_path) new_docname = getattr(options, 'new_docname', None) check_valid_url(url) ffts = {recid : [{ 'docname' : docname, 'new_docname' : new_docname, 'comment' : comment, 'description' : description, 'restriction' : restriction, 'doctype' : doctype, 'format' : format, 'url' : url, 'options' : flags }]} return bibupload_ffts(ffts) def cli_set_batch(options): """Change in batch the doctype, description, comment and restriction.""" ffts = {} doctype = getattr(options, 'set_doctype', None) description = cli2description(options) comment = cli2comment(options) restriction = cli2restriction(options) with_format = getattr(options, 'format', None) for docid in cli_docids_iterator(options): bibdoc = BibDoc(docid) recid = bibdoc.get_recid() docname = bibdoc.get_docname() fft = [] if description is not None or comment is not None: for bibdocfile in bibdoc.list_latest_files(): format = bibdocfile.get_format() if not with_format or with_format == format: fft.append({ 'docname': docname, 'restriction': restriction, 
'comment': comment, 'description': description, 'format': format, 'doctype': doctype }) else: fft.append({ 'docname': docname, 'restriction': restriction, 'doctype': doctype, }) ffts[recid] = fft return bibupload_ffts(ffts, append=False) def cli_textify(options): """Extract text to let indexing on fulltext be possible.""" force = getattr(options, 'force', None) perform_ocr = getattr(options, 'perform_ocr', None) if perform_ocr: if not can_perform_ocr(): print >> sys.stderr, "WARNING: OCR requested but OCR is not possible" perform_ocr = False if perform_ocr: additional = ' using OCR (this might take some time)' else: additional = '' for docid in cli_docids_iterator(options): bibdoc = BibDoc(docid) print 'Extracting text for docid %s%s...' % (docid, additional), sys.stdout.flush() if force or not bibdoc.has_text(require_up_to_date=True): try: bibdoc.extract_text(perform_ocr=perform_ocr) print "DONE" except InvenioWebSubmitFileError, e: print >> sys.stderr, "WARNING: %s" % e else: print "not needed" def cli_rename(options): """Rename a docname within a recid.""" new_docname = getattr(options, 'new_docname', None) docid = cli2docid(options) bibdoc = BibDoc(docid) docname = bibdoc.get_docname() recid = bibdoc.get_recid() ffts = {recid : [{'docname' : docname, 'new_docname' : new_docname}]} return bibupload_ffts(ffts, append=False) +def cli_fix_bibdocfsinfo_cache(options): + """Rebuild the bibdocfsinfo table according to what is available on filesystem""" + to_be_fixed = intbitset() + for docid in intbitset(run_sql("SELECT id FROM bibdoc")): + print "Fixing bibdocfsinfo table for docid %s..." 
% docid, + sys.stdout.flush() + try: + bibdoc = BibDoc(docid) + except InvenioWebSubmitFileError, err: + print err + continue + try: + bibdoc._sync_to_db() + except Exception, err: + recid = bibdoc.recid + if recid: + to_be_fixed.add(recid) + print "ERROR: %s, scheduling a fix for recid %s" % (err, recid) + print "DONE" + if to_be_fixed: + cli_fix_format(options, recids=to_be_fixed) + print "You can now add CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE=1 to your invenio-local.conf file." + def cli_fix_all(options): """Fix all the records of a recid_set.""" ffts = {} for recid in cli_recids_iterator(options): ffts[recid] = [] for docname in BibRecDocs(recid).get_bibdoc_names(): ffts[recid].append({'docname' : docname, 'doctype' : 'FIX-ALL'}) return bibupload_ffts(ffts, append=False) def cli_fix_marc(options, explicit_recid_set=None, interactive=True): """Fix all the records of a recid_set.""" ffts = {} if explicit_recid_set is not None: for recid in explicit_recid_set: ffts[recid] = [{'doctype' : 'FIX-MARC'}] else: for recid in cli_recids_iterator(options): ffts[recid] = [{'doctype' : 'FIX-MARC'}] return bibupload_ffts(ffts, append=False, interactive=interactive) def cli_check_format(options): """Check if any format-related inconsistencies exist.""" count = 0 tot = 0 duplicate = False for recid in cli_recids_iterator(options): tot += 1 bibrecdocs = BibRecDocs(recid) if not bibrecdocs.check_duplicate_docnames(): print >> sys.stderr, "recid %s has duplicate docnames!" % recid broken = True duplicate = True else: broken = False for docname in bibrecdocs.get_bibdoc_names(): if not bibrecdocs.check_format(docname): print >> sys.stderr, "recid %s with docname %s needs format fixing" % (recid, docname) broken = True if broken: count += 1 if count: result = "%d out of %d records need their formats to be fixed." % (count, tot) else: result = "All records appear to be correct with respect to formats." 
if duplicate: result += " Note however that at least one record appears to have duplicate docnames. You should fix this situation by using --fix-duplicate-docnames." print wrap_text_in_a_box(result, style="conclusion") return not(duplicate or count) def cli_check_duplicate_docnames(options): """Check if some record is connected with bibdoc having the same docnames.""" count = 0 tot = 0 for recid in cli_recids_iterator(options): tot += 1 bibrecdocs = BibRecDocs(recid) if bibrecdocs.check_duplicate_docnames(): count += 1 - print sys.stderr, "recid %s has duplicate docnames!" + print >> sys.stderr, "recid %s has duplicate docnames!" % recid if count: print "%d out of %d records have duplicate docnames." % (count, tot) return False else: print "All records appear to be correct with respect to duplicate docnames." return True -def cli_fix_format(options): +def cli_fix_format(options, recids=None): """Fix format-related inconsistencies.""" fixed = intbitset() tot = 0 - for recid in cli_recids_iterator(options): + if not recids: + recids = cli_recids_iterator(options) + for recid in recids: tot += 1 bibrecdocs = BibRecDocs(recid) for docname in bibrecdocs.get_bibdoc_names(): if not bibrecdocs.check_format(docname): if bibrecdocs.fix_format(docname, skip_check=True): print >> sys.stderr, "%i has been fixed for docname %s" % (recid, docname) else: print >> sys.stderr, "%i has been fixed for docname %s. However note that a new bibdoc might have been created." % (recid, docname) fixed.add(recid) if fixed: print "Now we need to synchronize MARC to reflect current changes." cli_fix_marc(options, explicit_recid_set=fixed) print wrap_text_in_a_box("%i out of %i records needed to be fixed." 
% (len(fixed), tot), style="conclusion") return not fixed def cli_fix_duplicate_docnames(options): """Fix duplicate docnames.""" fixed = intbitset() tot = 0 for recid in cli_recids_iterator(options): tot += 1 bibrecdocs = BibRecDocs(recid) if not bibrecdocs.check_duplicate_docnames(): bibrecdocs.fix_duplicate_docnames(skip_check=True) print >> sys.stderr, "%i has been fixed for duplicate docnames." % recid fixed.add(recid) if fixed: print "Now we need to synchronize MARC to reflect current changes." cli_fix_marc(options, explicit_recid_set=fixed) print wrap_text_in_a_box("%i out of %i records needed to be fixed." % (len(fixed), tot), style="conclusion") return not fixed def cli_delete(options): """Delete the given docid_set.""" ffts = {} for docid in cli_docids_iterator(options): bibdoc = BibDoc(docid) docname = bibdoc.get_docname() recid = bibdoc.get_recid() if recid not in ffts: ffts[recid] = [{'docname' : docname, 'doctype' : 'DELETE'}] else: ffts[recid].append({'docname' : docname, 'doctype' : 'DELETE'}) return bibupload_ffts(ffts) def cli_delete_file(options): """Delete the given file irreversibly.""" docid = cli2docid(options) recid = cli2recid(options, docids=intbitset([docid])) format = cli2format(options) docname = BibDoc(docid).get_docname() version = getattr(options, 'version', None) try: version_int = int(version) if 0 >= version_int: raise ValueError except: raise OptionValueError, 'when hard-deleting, version should be a valid positive integer, not %s' % version ffts = {recid : [{'docname' : docname, 'version' : version, 'format' : format, 'doctype' : 'DELETE-FILE'}]} return bibupload_ffts(ffts) def cli_revert(options): """Revert a bibdoc to a given version.""" docid = cli2docid(options) recid = cli2recid(options, docids=intbitset([docid])) docname = BibDoc(docid).get_docname() version = getattr(options, 'version', None) try: version_int = int(version) if 0 >= version_int: raise ValueError except: raise OptionValueError, 'when reverting, version should 
be a valid positive integer, not %s' % version ffts = {recid : [{'docname' : docname, 'version' : version, 'doctype' : 'REVERT'}]} return bibupload_ffts(ffts) def cli_undelete(options): """Undelete the given docname.""" docname = cli2docname(options) restriction = getattr(options, 'restriction', None) count = 0 if not docname: docname = 'DELETED-*-*' if not docname.startswith('DELETED-'): docname = 'DELETED-*-' + docname to_be_undeleted = intbitset() fix_marc = intbitset() setattr(options, 'deleted_docs', 'only') for docid in cli_docids_iterator(options): bibdoc = BibDoc(docid) if bibdoc.get_status() == 'DELETED' and fnmatch.fnmatch(bibdoc.get_docname(), docname): to_be_undeleted.add(docid) fix_marc.add(bibdoc.get_recid()) count += 1 print '%s (docid %s from recid %s) will be undeleted to restriction: %s' % (bibdoc.get_docname(), docid, bibdoc.get_recid(), restriction) wait_for_user("I'll proceed with the undeletion") for docid in to_be_undeleted: bibdoc = BibDoc(docid) bibdoc.undelete(restriction) cli_fix_marc(options, explicit_recid_set=fix_marc) print wrap_text_in_a_box("%s bibdoc successfully undeleted with status '%s'" % (count, restriction), style="conclusion") def cli_get_info(options): """Print all the info of the matched docids or recids.""" debug('Getting info!') human_readable = bool(getattr(options, 'human_readable', None)) debug('human_readable: %s' % human_readable) deleted_docs = getattr(options, 'deleted_docs', None) in ('yes', 'only') debug('deleted_docs: %s' % deleted_docs) if getattr(options, 'docids', None): for docid in cli_docids_iterator(options): sys.stdout.write(str(BibDoc(docid, human_readable=human_readable))) else: for recid in cli_recids_iterator(options): sys.stdout.write(str(BibRecDocs(recid, deleted_too=deleted_docs, human_readable=human_readable))) def cli_purge(options): """Purge the matched docids.""" ffts = {} for docid in cli_docids_iterator(options): bibdoc = BibDoc(docid) recid = bibdoc.get_recid() docname = bibdoc.get_docname() if 
recid: if recid not in ffts: ffts[recid] = [] ffts[recid].append({ 'docname' : docname, 'doctype' : 'PURGE', }) return bibupload_ffts(ffts) def cli_expunge(options): """Expunge the matched docids.""" ffts = {} for docid in cli_docids_iterator(options): bibdoc = BibDoc(docid) recid = bibdoc.get_recid() docname = bibdoc.get_docname() if recid: if recid not in ffts: ffts[recid] = [] ffts[recid].append({ 'docname' : docname, 'doctype' : 'EXPUNGE', }) return bibupload_ffts(ffts) def cli_get_history(options): """Print the history of a docid_set.""" for docid in cli_docids_iterator(options): bibdoc = BibDoc(docid) history = bibdoc.get_history() for row in history: print_info(bibdoc.get_recid(), docid, row) def cli_get_disk_usage(options): """Print the space usage of a docid_set.""" human_readable = getattr(options, 'human_readable', None) total_size = 0 total_latest_size = 0 for docid in cli_docids_iterator(options): bibdoc = BibDoc(docid) size = bibdoc.get_total_size() total_size += size latest_size = bibdoc.get_total_size_latest_version() total_latest_size += latest_size if human_readable: print_info(bibdoc.get_recid(), docid, 'size=%s' % nice_size(size)) print_info(bibdoc.get_recid(), docid, 'latest version size=%s' % nice_size(latest_size)) else: print_info(bibdoc.get_recid(), docid, 'size=%s' % size) print_info(bibdoc.get_recid(), docid, 'latest version size=%s' % latest_size) if human_readable: print wrap_text_in_a_box('total size: %s\n\nlatest version total size: %s' % (nice_size(total_size), nice_size(total_latest_size)), style='conclusion') else: print wrap_text_in_a_box('total size: %s\n\nlatest version total size: %s' % (total_size, total_latest_size), style='conclusion') def cli_check_md5(options): """Check the md5 sums of a docid_set.""" failures = 0 for docid in cli_docids_iterator(options): bibdoc = BibDoc(docid) if bibdoc.md5s.check(): print_info(bibdoc.get_recid(), docid, 'checksum OK') else: for afile in bibdoc.list_all_files(): if not afile.check(): 
failures += 1 print_info(bibdoc.get_recid(), docid, '%s failing checksum!' % afile.get_full_path()) if failures: print wrap_text_in_a_box('%i files failing' % failures , style='conclusion') else: print wrap_text_in_a_box('All files are correct', style='conclusion') def cli_update_md5(options): """Update the md5 sums of a docid_set.""" for docid in cli_docids_iterator(options): bibdoc = BibDoc(docid) if bibdoc.md5s.check(): print_info(bibdoc.get_recid(), docid, 'checksum OK') else: for afile in bibdoc.list_all_files(): if not afile.check(): print_info(bibdoc.get_recid(), docid, '%s failing checksum!' % afile.get_full_path()) wait_for_user('Updating the md5s of this document can hide real problems.') bibdoc.md5s.update(only_new=False) def cli_hide(options): """Hide the matched versions of documents.""" documents_to_be_hidden = {} to_be_fixed = intbitset() versions = getattr(options, 'versions', 'all') if versions != 'all': try: versions = ranges2ids(versions) except: raise OptionValueError, 'You should specify correct versions. 
Not %s' % versions else: versions = intbitset(trailing_bits=True) for docid in cli_docids_iterator(options): bibdoc = BibDoc(docid) recid = bibdoc.get_recid() if recid: for bibdocfile in bibdoc.list_all_files(): this_version = bibdocfile.get_version() this_format = bibdocfile.get_format() if this_version in versions: if docid not in documents_to_be_hidden: documents_to_be_hidden[docid] = [] documents_to_be_hidden[docid].append((this_version, this_format)) to_be_fixed.add(recid) print '%s (docid: %s, recid: %s) will be hidden' % (bibdocfile.get_full_name(), docid, recid) wait_for_user('Proceeding to hide the matched documents...') for docid, documents in documents_to_be_hidden.iteritems(): bibdoc = BibDoc(docid) for version, format in documents: bibdoc.set_flag('HIDDEN', format, version) return cli_fix_marc(options, to_be_fixed) def cli_unhide(options): """Unhide the matched versions of documents.""" documents_to_be_unhidden = {} to_be_fixed = intbitset() versions = getattr(options, 'versions', 'all') if versions != 'all': try: versions = ranges2ids(versions) except: raise OptionValueError, 'You should specify correct versions. 
Not %s' % versions else: versions = intbitset(trailing_bits=True) for docid in cli_docids_iterator(options): bibdoc = BibDoc(docid) recid = bibdoc.get_recid() if recid: for bibdocfile in bibdoc.list_all_files(): this_version = bibdocfile.get_version() this_format = bibdocfile.get_format() if this_version in versions: if docid not in documents_to_be_unhidden: documents_to_be_unhidden[docid] = [] documents_to_be_unhidden[docid].append((this_version, this_format)) to_be_fixed.add(recid) print '%s (docid: %s, recid: %s) will be unhidden' % (bibdocfile.get_full_name(), docid, recid) wait_for_user('Proceeding to unhide the matched documents...') for docid, documents in documents_to_be_unhidden.iteritems(): bibdoc = BibDoc(docid) for version, format in documents: bibdoc.unset_flag('HIDDEN', format, version) return cli_fix_marc(options, to_be_fixed) def main(): parser = prepare_option_parser() (options, args) = parser.parse_args() if getattr(options, 'debug', None): getLogger().setLevel(DEBUG) debug('test') debug('options: %s, args: %s' % (options, args)) try: if not getattr(options, 'action', None) and \ not getattr(options, 'append_path', None) and \ not getattr(options, 'revise_path', None): if getattr(options, 'set_doctype', None) is not None or \ getattr(options, 'set_comment', None) is not None or \ getattr(options, 'set_description', None) is not None or \ getattr(options, 'set_restriction', None) is not None: cli_set_batch(options) elif getattr(options, 'new_docname', None): cli_rename(options) else: print >> sys.stderr, "ERROR: no action specified" sys.exit(1) elif getattr(options, 'append_path', None): options.empty_recs = 'yes' options.empty_docs = 'yes' cli_append(options, getattr(options, 'append_path', None)) elif getattr(options, 'revise_path', None): cli_revise(options, getattr(options, 'revise_path', None)) elif options.action == 'textify': cli_textify(options) elif getattr(options, 'action', None) == 'get-history': cli_get_history(options) elif 
getattr(options, 'action', None) == 'get-info': cli_get_info(options) elif getattr(options, 'action', None) == 'get-disk-usage': cli_get_disk_usage(options) elif getattr(options, 'action', None) == 'check-md5': cli_check_md5(options) elif getattr(options, 'action', None) == 'update-md5': cli_update_md5(options) elif getattr(options, 'action', None) == 'fix-all': cli_fix_all(options) elif getattr(options, 'action', None) == 'fix-marc': cli_fix_marc(options) elif getattr(options, 'action', None) == 'delete': cli_delete(options) elif getattr(options, 'action', None) == 'hard-delete': cli_delete_file(options) elif getattr(options, 'action', None) == 'fix-duplicate-docnames': cli_fix_duplicate_docnames(options) elif getattr(options, 'action', None) == 'fix-format': cli_fix_format(options) elif getattr(options, 'action', None) == 'check-duplicate-docnames': cli_check_duplicate_docnames(options) elif getattr(options, 'action', None) == 'check-format': cli_check_format(options) elif getattr(options, 'action', None) == 'undelete': cli_undelete(options) elif getattr(options, 'action', None) == 'purge': cli_purge(options) elif getattr(options, 'action', None) == 'expunge': cli_expunge(options) elif getattr(options, 'action', None) == 'revert': cli_revert(options) elif getattr(options, 'action', None) == 'hide': cli_hide(options) elif getattr(options, 'action', None) == 'unhide': cli_unhide(options) + elif getattr(options, 'action', None) == 'fix-bibdocfsinfo-cache': + options.empty_docs = 'yes' + cli_fix_bibdocfsinfo_cache(options) + elif getattr(options, 'action', None) == 'get-stats': + cli_get_stats(options) else: print >> sys.stderr, "ERROR: Action %s is not valid" % getattr(options, 'action', None) sys.exit(1) except Exception, e: register_exception() print >> sys.stderr, 'ERROR: %s' % e sys.exit(1) if __name__ == '__main__': main()
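Reviewer note: the range parsing done by `ranges2ids` (used by `cli_hide` and `cli_unhide` for the `versions` option) can be exercised outside Invenio with a minimal sketch like the one below. It assumes a plain Python `set` in place of `intbitset` and uses Python 3 syntax; the name `ranges_to_ids` is hypothetical and not part of this patch.

```python
def ranges_to_ids(parse_string):
    """Expand a string such as "1-3,7" into a set of integer ids.

    Stand-in for ranges2ids: a plain set replaces intbitset (assumption).
    """
    ids = set()
    for arange in parse_string.split(","):
        parts = arange.split("-")
        if len(parts) == 1:
            # A single id, e.g. "7".
            ids.add(int(parts[0]))
        else:
            # A range, e.g. "1-3"; reversed bounds such as "12-10" are swapped,
            # mirroring the sanity check in ranges2ids.
            low, high = sorted((int(parts[0]), int(parts[1])))
            ids.update(range(low, high + 1))
    return ids


if __name__ == "__main__":
    print(sorted(ranges_to_ids("1-3,7,12-10")))
```

As in the original, a malformed component (for example an empty string) raises `ValueError`, which `cli_hide` and `cli_unhide` turn into an `OptionValueError`.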